Compressing Numbers and the Shannon Limit
Date: 05/12/2004 at 07:26:00 From: James Subject: Convert a big number to a small formula I need to find a formula that takes X (a large number) and creates another expression which equals X but is shorter in length than X. For example, take 390,625 which uses six characters. I need a formula which calculates something like 5^8, which equals 390,625 but is only three characters instead of six. I don't mind if the end formula uses exponents, all I want is to take a BIG number (100's of digits) and convert it to a small expression.
Date: 05/12/2004 at 09:44:10 From: Doctor Vogler Subject: Re: Convert a big number to a small formula Hi James, Thanks for writing to Dr Math. There are a few issues here that are important to your question. To illustrate them, I will start with a technique used by scientists the world over to accomplish exactly this. It's called scientific notation. You take a big number like 2,530,000,000,000,000 and you pull out powers of ten until there is only one digit to the left of the decimal, which makes the number look like 2.53 * 10^15. Now, you may notice something here: I had a lot of zeros. What would I do if my number were actually 2,531,657,291,387,092 ? Well, in scientific notation, the number wouldn't get much smaller IF I write out the number exactly, which would be 2.531657291387092 * 10^15. Therefore, scientists don't. Furthermore, they usually can't measure things to 15 digits of accuracy, so the smaller digits aren't really relevant anyway. So a scientist would probably keep only as many digits as he needs, and write it as something like 2.53 * 10^15, which is close enough for his purposes. That said, mathematicians are different from scientists in that they often need exact numbers, and rounding won't do at all. This might be what you want (if it's not, then I'd recommend scientific notation). Unfortunately for you, there is not, in general, a way to compress any large number in an exact way, and I can tell you why. I said "compress" because that is what you want to do. Your example was a number carefully chosen so that it can be represented in a simple way, with a small formula. Consider the quantity of numbers of the form a^b, where a and b are only one digit each. There are only 83 such numbers, even though some of them are up in the millions. That means that *very few* numbers can be represented in this way. In fact, this concept can be generalized. Suppose you want your formula to fit in n digits (or n bits of computer memory, or some such thing). You can say exactly how many different numbers you could possibly represent with that many digits (or bits), and that is your limit. And this brings us to a thing called "The Shannon Limit." Shannon was a man (that's his last name) who studied information processing before computers were around, before information processing was a well-studied science. One of the things he proved is that there is a maximum amount of information that can be transmitted in a fixed number of bits. (His theory also treats when some of your bits might be a little fuzzy, meaning you might lose a few bits and not be sure which ones. That gets into coding theory, used in modems and such. You probably won't need to know that, but if you study the Shannon Limit, then you will run across it.) Basically, the idea is that there is a limit to the amount you can compress data without losing any data. Sometimes repetitions or patterns allow data to be compressed some (that's what pkzip and gif and other software programs do) and sometimes you can deliberately lose some data in order to compress more (that's what jpeg does, and that's what scientific notation does), but if you don't want to lose any data, then you quickly approach a limit where you can't get your data any smaller. What kinds of patterns? Well, 5^5 is a pattern; it is a number that has a lot of repeating prime factors. So is 2^57 - 1. It is one less than a number with a lot of prime factors. So is a number that ends in a lot of zeros, like 2.53 * 10^15. But if you want this to work on just any number, then Shannon's Limit says it won't work. If you pick a random number with 100 digits, which is not likely to have many patterns, then you'll probably need about 100 digits to represent it exactly, no matter how you try to compress it. Finally, if you doing anything *other* than picking a very large random number, then it is very likely that the number is special in some way, such as being a power of a small prime, or being one less than a power of 2, and the way you got the number may give you a way to represent it with a small formula. That's why mathematicians usually don't write out very large numbers but just leave them in the form they started in. For example, Mersenne primes are reported in the form: 2^20,996,011 - 1, which is a lot smaller than writing out all six million digits. Does all of this discussion clear things up for you? Is this what you needed? If you have any questions about this or need more help, please write back, and I will try to offer further suggestions. - Doctor Vogler, The Math Forum http://mathforum.org/dr.math/
Search the Dr. Math Library:
Ask Dr. MathTM
© 1994- The Math Forum at NCTM. All rights reserved.