geo
Posts:
38
Registered:
6/19/08


RNGs with periods exceeding 10^(40million).
Posted:
Jan 16, 2011 1:51 PM


I offer here a MWC (MultiplyWithCarry) RNG whose period I cannot determinenor is it likely that anyone else canbut that unknown period is almost certainly greater than 10^40000000, i.e, 10^(40million). Versions for either 32 and 64bit integers are given, as well as suggestions for similar versions with periods as great or even greater magnitude.
MWC RNGs are ordinarily based on primes of the form p=ab^r1, with b=2^32 or 2^64 and a<b. They generate, 32 or 64bits at a time, and in reverse order, the binary expansions of rationals j/p, 0<j<p and the period of that binary expansion is the order of 2 mod p. The key part of the MWC process is: given 32bit a,x,c, extract the top and bottom 32 bits of a 64 bit a*x+c, or, for given 64bit a,x,c, extract the top and bottom 64 bits of a 128bit a*x+c.
I have put various computers to work establishing the order of 2 for MWC primes of the form ab^r1, b=2^32, or CMWC (ComplimentaryMultiplyWithCarry), p=ab^r+1. For example, it took 24 days on an early laptop to establish that the order of 2 for p=54767*2^1337287+1 is (p1)/2^3, and just last week I put a Samsung 64bit laptop to finding the order of 2 for the 11th largest known prime, p=27653*2^9167433+1; in CMWC form, p=14158336*b^286482+1. I expect it may take months.
Such searches are motivated by more than a Mt. Everest syndrome, as extremelylongperiod RNGs not only seem to perform very well on tests of randomness, but have uses for cryptography. For example, suppose, from a random set of seeds, you have observed 100000 successive 32bit numbers from a CMWC RNG based on p=14158336*b^286482+1. You will then be somewhere in an immense loop that has over 10^(2million) appearances of exactly that same sequence of 100000, leaving you virtually no chance of finding your particular location and thus able to predict forthcoming values.
Of course, if you can observe 286482 successive values, then, after a little modular arithmetic to determine the carry, you are OK. It is ensuring a large exponent for b that makes the RNG desirablethe larger the better.
But in seeking large exponents for b and record periods, why bother with an Everest when there are ranges with peaks far beyond surveyor's skills but certain to be several times as tall?
Such ranges come from considering, for very large r, composites of the form m=ab^r1, rather than primes of that form. The example for this posting: the composite m=(2^281)*2^(2^27)1=(2^281)*b^(2^22)1. m has no prime divisors less than 21000000, and order(2,m) is the lcm of the orders mod the prime divisors of m. We cannotand may never be able tofactor such an m>10^40403570, but for primes q, the average value of order(2,q)/q is around .572. Thus, even if m has, say, 40 prime factors, since (.572)^40>10^(10), we can only expect to "lose" around ten 10's, and perhaps ten more through common factors for the lcm, providing an estimate around 10^40403550 for the order of 2 mod m.
That the order of 2 mod m is less than 10^(40million) seems extremely unlikely; we would have to "lose" over 400thousand 10's, when something in the range of three to sixty 10's seems more likely.
So users should feel confident that the periods of the following MWC RNGs far exceed 10^(40million). Based on m=(2^281)b^(2^22)1=(2^281)b^41943041 with b=2^32, m=(2^281)B^(2^21)1=(2^281)B^20917521 with B=2^64, they are simple and very fast. You would have to observe more than 134 million successive bits produced be these MWC RNGs in order to get close to cracking them.
Note that choosing a=2^i1 in m=ab^r1 makes finding the top and bottom 32 bits of a purported 64 bit a*x+c feasible using only 32bit arithmetic, or, within 64bit arithmetic, finding the top and bottom 64 bits of a purported 128 bit a*x+c.
Here are C versions using, for 32bit integers, an unsigned long array Q[41943104], and for 64bits, an unsigned long long array Q[2091752]. Try them and see if you get the same results I do.
32bit MWC version
#include <stdio.h> static unsigned long Q[4194304],carry=0;
unsigned long b32MWC(void) {unsigned long t,x; static int j=4194303; j=(j+1)&4194303; x=Q[j]; t=(x<<28)+carry; carry=(x>>4)(t<x); return (Q[j]=tx); }
#define CNG ( cng=69069*cng+13579 ) #define XS ( xs^=(xs<<13), xs^=(xs>>17), xs^=(xs<<5) ) #define KISS ( b32MWC()+CNG+XS )
int main(void) {unsigned long int i,x,cng=123456789,xs=362436069; /* First seed Q[] with CNG+XS: */ for(i=0;i<4194304;i++) Q[i]=CNG+XS; /* Then generate 10^9 b32MWC()s */ for(i=0;i<1000000000;i++) x=b32MWC(); printf("Does x=2769813733 ?\n x=%lu\n",x); /* followed by 10^9 KISSes: */ for(i=0;i<1000000000;i++) x=KISS; printf("Does x=3545999299 ?\n x=%lu\n",x); return 0; }
 64bit MWC version 
#include <stdio.h> static unsigned long long Q[2097152],carry=0;
unsigned long long B64MWC(void) {unsigned long long t,x; static int j=2097151; j=(j+1)&2097151; x=Q[j]; t=(x<<28)+carry; carry=(x>>36)(t<x); return (Q[j]=tx); }
#define CNG ( cng=6906969069LL*cng+13579 ) #define XS ( xs^=(xs<<13), xs^=(xs>>17), xs^=(xs<<43) ) #define KISS ( B64MWC()+CNG+XS )
int main(void) {unsigned long long i,x,cng=123456789987654321LL, xs=362436069362436069LL; /* First seed Q[] with CNG+XS: */ for(i=0;i<2097152;i++) Q[i]=CNG+XS; /* Then generate 10^9 B64MWC()s */ for(i=0;i<1000000000;i++) x=B64MWC(); printf("Does x=13596816608992115578 ?\n x=%llu\n",x); /* followed by 10^9 KISSes: */ for(i=0;i<1000000000;i++) x=KISS; printf("Does x=5033346742750153761 ?\n x=%llu\n",x); return 0; }

Note that I advocate the KISS, (KeepItSimpleStupid), principle. Even though the MWC RNGs perform very well on tests of randomness, combining with Congruential (CNG) and Xorshift (XS) probably provides extra insurance at the cost of a few simple instructions, (and making the resulting KISS even harder to crack).
Also note that for unsigned integers such as in Fortran, the C instruction (t<x) means: subtract 1 if t<s, and can be implemented for signed integers as t'<s', where the ' means: flip the sign bit.
Full random seeding of the carry and the Q[] array calls for more than 17 megabytes of random bits. For ordinary randomness testing, filling, as above, the Q[] array with Congruential+Xorshift may serve, but for stronger crypto uses, many more than the random 64 or 128bits that CNG and XS require are needed. Random seeding of the carry and the Q[] array amounts to producing, either 32 or 64bits at a time, the reverse order binary expansion of j/m for some 0<=j<=m. Indeed, for m=ab^r1, given the carry and the Q[] array, that j is
j=carry+a*sum(Q[i]*b^i),i=0..r1.
There are phi(m) such j's with period order(2,m). Since m has no prime factors less than 21000000, with probability close to 1 a random seeding of Q[] should produce a j that is relatively prime to m and thus produce the full periodunknown but almost certainly far in excess of 10^(40million). Even if random seeding of carry and Q[] provides a j with gcd(j,m)>1, the period is still likely to exceed 10^(40million). There are two exceptions: All Q[i]=0, carry=0 yields j=0 and the binary expansion 0/m=.00000000000000000... All Q[i]=b1=2^321 and carry=a1=(2^282) yields j=m and the binary expansion m/m=.11111111111111111...
Almost any choice of m=(2^i1)*2^(2^27)1 is likely to provide an immense order for 2 mod m. Choices i=16,18,22,28 resulted in m's that have no factors<10^7. Memory seems to be the key restriction for the resulting Q[] array. With enough memory, one might seek i's for which m=(2^i1)*2^(2^28)1 has no small factors, or even m=(2^i1)*2^(2^29)1, boosting the unknown and unknowable periods of MWC RNGs to ranges beyond 10^(40million).
George Marsaglia

