The Math Forum

Ask Dr. Math - Questions and Answers from our Archives
Associated Topics || Dr. Math Home || Search Dr. Math

Compressing Numbers and the Shannon Limit

Date: 05/12/2004 at 07:26:00
From: James
Subject: Convert a big number to a small formula

I need to find a formula that takes X (a large number) and creates 
another expression which equals X but is shorter in length than X.

For example, take 390,625 which uses six characters.  I need a formula 
which calculates something like 5^8, which equals 390,625 but is only
three characters instead of six.

I don't mind if the end formula uses exponents, all I want is to take
a BIG number (100's of digits) and convert it to a small expression.

Date: 05/12/2004 at 09:44:10
From: Doctor Vogler
Subject: Re: Convert a big number to a small formula

Hi James,

Thanks for writing to Dr Math.  There are a few issues here that are
important to your question.  To illustrate them, I will start with a
technique used by scientists the world over to accomplish exactly
this.  It's called scientific notation.  You take a big number like


and you pull out powers of ten until there is only one digit to the
left of the decimal, which makes the number look like

  2.53 * 10^15.

Now, you may notice something here:  I had a lot of zeros.  What would
I do if my number were actually

  2,531,657,291,387,092 ?

Well, in scientific notation, the number wouldn't get much smaller IF
I write out the number exactly, which would be

  2.531657291387092 * 10^15.

Therefore, scientists don't.  Furthermore, they usually can't measure
things to 15 digits of accuracy, so the smaller digits aren't really
relevant anyway.  So a scientist would probably keep only as many
digits as he needs, and write it as something like

  2.53 * 10^15,

which is close enough for his purposes.

That said, mathematicians are different from scientists in that they
often need exact numbers, and rounding won't do at all.  This might be
what you want (if it's not, then I'd recommend scientific notation). 
Unfortunately for you, there is not, in general, a way to compress any
large number in an exact way, and I can tell you why.

I said "compress" because that is what you want to do.  Your example
was a number carefully chosen so that it can be represented in a
simple way, with a small formula.  Consider the quantity of numbers of
the form a^b, where a and b are only one digit each.  There are only
83 such numbers, even though some of them are up in the millions. 
That means that *very few* numbers can be represented in this way.  In
fact, this concept can be generalized.  Suppose you want your formula
to fit in n digits (or n bits of computer memory, or some such thing).
You can say exactly how many different numbers you could possibly
represent with that many digits (or bits), and that is your limit.

And this brings us to a thing called "The Shannon Limit."  Shannon was 
a man (that's his last name) who studied information processing before 
computers were around, before information processing was a 
well-studied science.  One of the things he proved is that there is a
maximum amount of information that can be transmitted in a fixed
number of bits.  (His theory also treats when some of your bits might
be a little fuzzy, meaning you might lose a few bits and not be sure
which ones.  That gets into coding theory, used in modems and such. 
You probably won't need to know that, but if you study the Shannon
Limit, then you will run across it.)

Basically, the idea is that there is a limit to the amount you can
compress data without losing any data.  Sometimes repetitions or
patterns allow data to be compressed some (that's what pkzip and gif
and other software programs do) and sometimes you can deliberately
lose some data in order to compress more (that's what jpeg does, and
that's what scientific notation does), but if you don't want to lose
any data, then you quickly approach a limit where you can't get your
data any smaller.

What kinds of patterns?  Well, 5^5 is a pattern; it is a number that
has a lot of repeating prime factors.  So is 2^57 - 1.  It is one less
than a number with a lot of prime factors.  So is a number that ends
in a lot of zeros, like 2.53 * 10^15.  But if you want this to work on
just any number, then Shannon's Limit says it won't work.  If you pick
a random number with 100 digits, which is not likely to have many
patterns, then you'll probably need about 100 digits to represent it
exactly, no matter how you try to compress it.

Finally, if you doing anything *other* than picking a very large 
random number, then it is very likely that the number is special in
some way, such as being a power of a small prime, or being one less
than a power of 2, and the way you got the number may give you a way
to represent it with a small formula.  That's why mathematicians
usually don't write out very large numbers but just leave them in the
form they started in.  For example, Mersenne primes are reported in
the form:

  2^20,996,011 - 1,

which is a lot smaller than writing out all six million digits.

Does all of this discussion clear things up for you?  Is this what you
needed?  If you have any questions about this or need more help, 
please write back, and I will try to offer further suggestions.

- Doctor Vogler, The Math Forum 
Associated Topics:
College Number Theory
High School Number Theory

Search the Dr. Math Library:

Find items containing (put spaces between keywords):
Click only once for faster results:

[ Choose "whole words" when searching for a word like age.]

all keywords, in any order at least one, that exact phrase
parts of words whole words

Submit your own question to Dr. Math

[Privacy Policy] [Terms of Use]

Math Forum Home || Math Library || Quick Reference || Math Forum Search

Ask Dr. MathTM
© 1994- The Math Forum at NCTM. All rights reserved.