
Re: problem on recordbreaking values in probability
Posted:
Mar 10, 2013 8:52 PM


On 03/01/2013 08:41 AM, David Bernier wrote: > On 02/27/2013 10:24 PM, David Bernier wrote: >> On 02/27/2013 04:05 PM, James Waldby wrote: >>> On Wed, 27 Feb 2013 07:10:08 0500, David Bernier wrote: >>>> On 02/27/2013 05:49 AM, David Bernier wrote: >>>>> On 02/27/2013 05:31 AM, David Bernier wrote: >>>>>> I used Marsaglia's 64bit SUPER KISS pseudorandom number generator >>>>>> to simulate uniform r.v.s on [0, 1] that are independent, as >>>>>> X_1, X_2, X_3, ad infinitum >>>>>> >>>>>> For each go, (or sequence) I define its 1st recordbreaking value >>>>>> as R(1) as X_1, its 2nd recordbreaking value R(2) as the >>>>>> value taken by X_n for the smallest n with X_n > X_1, and in general >>> [ R(k+1) is the value taken by X_n for the smallest n with X_n > R(k)] >>> ... >>>>>> In my first simulation I get: R(20) = 0.999999999945556 >>>>>> or about 5.4E11 less than 1 , a one in 18 billion event. >>> ... >>>>>> In fact, R(20) is about 1  (0.307)^20 ... >>>> >>>> I finally got another 20th recordbreaking value, on my >>>> second sequence, but it took much longer. The values >>>> 1  R(20), one per sequence, are what I call "pvalues" >>>> >>>> from the simulations so far, a lot of variance in orders >>>> of magnitude: >>>> >>>> the corresponding pvalue is 0.000000000054 // seq. 1 >>>> the corresponding pvalue is 0.000000000001 // seq. 2 >>>> the corresponding pvalue is 0.000000002463 >>>> the corresponding pvalue is 0.000000000782 >>>> the corresponding pvalue is 0.000000106993 >>>> the corresponding pvalue is 0.000000342142 >>>> the corresponding pvalue is 0.000000001978 >>> [etc] >>> >>> It would be useful to report the number of trials each simulation >>> took to find its 20th RBV. If a simulation takes m trials, the >>> variance of the value X_m is approximately 1/(m^2), where X_m is >>> the mth smallest (ie the largest) number among m trials. (See >>> <http://en.wikipedia.org/wiki/Uniform_distribution_%28continuous%29#Order_statistics>) >>> >>> >>> Your simulation data implies there is a wide variance among the >>> values of m required to find a 20th RBV. >>> >>> In following, let L(n) = Pr(n'th item of n is lowest). (Distribution >>> of the lowest item should be similar to distribution of 1(highest >>> item).) I suppose that L(n) = 1/n and that the expected value of the >>> number of recordlowvalues (RLV's) in m trials is sum{i=1 to m}(1/i), >>> or about H_m, the m'th harmonic number, which can be approximated by >>> log(m) + gamma, with gamma = EulerMascheroni constant, about 0.5772. >>> When H_m ~ 20, log(m) ~ 19.42 and m is about 272 million. That is, >>> simulations to find the 20th RLV or RBV will take about 272 million >>> steps, on average. (I don't know which kind of average applies, or >>> what the variance might be.) >>> >> >> Yes, I hadn't thought of using expectations and expected values, >> leading to partial sums of the harmonic series. >> >> You wrote: >> "simulations to find the 20th RLV or RBV will take about 272 million >> steps". >> >> Right, that looks like an interesting way of looking at >> record values. >> >> Since it's like a stopping time, the number of the simulation, >> I suppose it could be denoted S({X_i}_{i=1, ... oo}, 20) >> or S(X_{BOLD_overligned}, 20). >> >> Electricity should go off tomorrow morning for repair. >> >> I think your estimate >> E(log(S_20)) ~= 20  gamma >> should be very good. > > In the literature, a remarkable article, which may have > appeared in the Am. Math. Monthly, can be found by > searching for: > Breaking Records and Breaking Boards. Ned Glick > > The author is Ned Glick, at some time at UBC in Canada. > > David Bernier
I did long simulations for 12th RecordBreaking Values.
With MatLab, I constructed a histogram of the natural logarithms of the 76,000 values:
< http://img521.imageshack.us/img521/7702/records12log.jpg > .
The mean should be close to 12  EulerGamma: http://en.wikipedia.org/wiki/Euler%E2%80%93Mascheroni_constant
David Bernier  $apr1$LJgyupye$GZQc9jyvrdP50vW77sYvz1

