
Re: problem on recordbreaking values in probability
Posted:
Mar 1, 2013 8:41 AM


On 02/27/2013 10:24 PM, David Bernier wrote: > On 02/27/2013 04:05 PM, James Waldby wrote: >> On Wed, 27 Feb 2013 07:10:08 0500, David Bernier wrote: >>> On 02/27/2013 05:49 AM, David Bernier wrote: >>>> On 02/27/2013 05:31 AM, David Bernier wrote: >>>>> I used Marsaglia's 64bit SUPER KISS pseudorandom number generator >>>>> to simulate uniform r.v.s on [0, 1] that are independent, as >>>>> X_1, X_2, X_3, ad infinitum >>>>> >>>>> For each go, (or sequence) I define its 1st recordbreaking value >>>>> as R(1) as X_1, its 2nd recordbreaking value R(2) as the >>>>> value taken by X_n for the smallest n with X_n > X_1, and in general >> [ R(k+1) is the value taken by X_n for the smallest n with X_n > R(k)] >> ... >>>>> In my first simulation I get: R(20) = 0.999999999945556 >>>>> or about 5.4E11 less than 1 , a one in 18 billion event. >> ... >>>>> In fact, R(20) is about 1  (0.307)^20 ... >>> >>> I finally got another 20th recordbreaking value, on my >>> second sequence, but it took much longer. The values >>> 1  R(20), one per sequence, are what I call "pvalues" >>> >>> from the simulations so far, a lot of variance in orders >>> of magnitude: >>> >>> the corresponding pvalue is 0.000000000054 // seq. 1 >>> the corresponding pvalue is 0.000000000001 // seq. 2 >>> the corresponding pvalue is 0.000000002463 >>> the corresponding pvalue is 0.000000000782 >>> the corresponding pvalue is 0.000000106993 >>> the corresponding pvalue is 0.000000342142 >>> the corresponding pvalue is 0.000000001978 >> [etc] >> >> It would be useful to report the number of trials each simulation >> took to find its 20th RBV. If a simulation takes m trials, the >> variance of the value X_m is approximately 1/(m^2), where X_m is >> the mth smallest (ie the largest) number among m trials. (See >> <http://en.wikipedia.org/wiki/Uniform_distribution_%28continuous%29#Order_statistics>) >> >> Your simulation data implies there is a wide variance among the >> values of m required to find a 20th RBV. >> >> In following, let L(n) = Pr(n'th item of n is lowest). (Distribution >> of the lowest item should be similar to distribution of 1(highest >> item).) I suppose that L(n) = 1/n and that the expected value of the >> number of recordlowvalues (RLV's) in m trials is sum{i=1 to m}(1/i), >> or about H_m, the m'th harmonic number, which can be approximated by >> log(m) + gamma, with gamma = EulerMascheroni constant, about 0.5772. >> When H_m ~ 20, log(m) ~ 19.42 and m is about 272 million. That is, >> simulations to find the 20th RLV or RBV will take about 272 million >> steps, on average. (I don't know which kind of average applies, or >> what the variance might be.) >> > > Yes, I hadn't thought of using expectations and expected values, > leading to partial sums of the harmonic series. > > You wrote: > "simulations to find the 20th RLV or RBV will take about 272 million > steps". > > Right, that looks like an interesting way of looking at > record values. > > Since it's like a stopping time, the number of the simulation, > I suppose it could be denoted S({X_i}_{i=1, ... oo}, 20) > or S(X_{BOLD_overligned}, 20). > > Electricity should go off tomorrow morning for repair. > > I think your estimate > E(log(S_20)) ~= 20  gamma > should be very good.
In the literature, a remarkable article, which may have appeared in the Am. Math. Monthly, can be found by searching for: Breaking Records and Breaking Boards. Ned Glick
The author is Ned Glick, at some time at UBC in Canada.
David Bernier  dracut:/# lvm vgcfgrestore File descriptor 9 (/.console_lock) leaked on lvm invocation. Parent PID 993: sh Please specify a *single* volume group to restore.

