Search All of the Math Forum:

Views expressed in these public forums are not endorsed by NCTM or The Math Forum.

Topic: problem on record-breaking values in probability
Replies: 14   Last Post: Apr 14, 2013 11:36 PM

 Messages: [ Previous | Next ]
 David Bernier Posts: 3,829 Registered: 12/13/04
Re: problem on record-breaking values in probability
Posted: Mar 10, 2013 8:52 PM

On 03/01/2013 08:41 AM, David Bernier wrote:
> On 02/27/2013 10:24 PM, David Bernier wrote:
>> On 02/27/2013 04:05 PM, James Waldby wrote:
>>> On Wed, 27 Feb 2013 07:10:08 -0500, David Bernier wrote:
>>>> On 02/27/2013 05:49 AM, David Bernier wrote:
>>>>> On 02/27/2013 05:31 AM, David Bernier wrote:
>>>>>> I used Marsaglia's 64-bit SUPER KISS pseudo-random number generator
>>>>>> to simulate uniform r.v.s on [0, 1] that are independent, as
>>>>>> X_1, X_2, X_3, ad infinitum
>>>>>>
>>>>>> For each go, (or sequence) I define its 1st record-breaking value
>>>>>> as R(1) as X_1, its 2nd record-breaking value R(2) as the
>>>>>> value taken by X_n for the smallest n with X_n > X_1, and in general

>>> [ R(k+1) is the value taken by X_n for the smallest n with X_n > R(k)]
>>> ...

>>>>>> In my first simulation I get: R(20) = 0.999999999945556
>>>>>> or about 5.4E-11 less than 1 , a one in 18 billion event.

>>> ...
>>>>>> In fact, R(20) is about 1 - (0.307)^20 ...
>>>>
>>>> I finally got another 20th record-breaking value, on my
>>>> second sequence, but it took much longer. The values
>>>> 1 - R(20), one per sequence, are what I call "p-values"
>>>>
>>>> from the simulations so far, a lot of variance in orders
>>>> of magnitude:
>>>>
>>>> the corresponding p-value is 0.000000000054 // seq. 1
>>>> the corresponding p-value is 0.000000000001 // seq. 2
>>>> the corresponding p-value is 0.000000002463
>>>> the corresponding p-value is 0.000000000782
>>>> the corresponding p-value is 0.000000106993
>>>> the corresponding p-value is 0.000000342142
>>>> the corresponding p-value is 0.000000001978

>>> [etc]
>>>
>>> It would be useful to report the number of trials each simulation
>>> took to find its 20th RBV. If a simulation takes m trials, the
>>> variance of the value X_m is approximately 1/(m^2), where X_m is
>>> the mth smallest (ie the largest) number among m trials. (See
>>> <http://en.wikipedia.org/wiki/Uniform_distribution_%28continuous%29#Order_statistics>)
>>>
>>>
>>> Your simulation data implies there is a wide variance among the
>>> values of m required to find a 20th RBV.
>>>
>>> In following, let L(n) = Pr(n'th item of n is lowest). (Distribution
>>> of the lowest item should be similar to distribution of 1-(highest
>>> item).) I suppose that L(n) = 1/n and that the expected value of the
>>> number of record-low-values (RLV's) in m trials is sum{i=1 to m}(1/i),
>>> or about H_m, the m'th harmonic number, which can be approximated by
>>> log(m) + gamma, with gamma = Euler-Mascheroni constant, about 0.5772.
>>> When H_m ~ 20, log(m) ~ 19.42 and m is about 272 million. That is,
>>> simulations to find the 20th RLV or RBV will take about 272 million
>>> steps, on average. (I don't know which kind of average applies, or
>>> what the variance might be.)
>>>

>>
>> Yes, I hadn't thought of using expectations and expected values,
>> leading to partial sums of the harmonic series.
>>
>> You wrote:
>> "simulations to find the 20th RLV or RBV will take about 272 million
>> steps".
>>
>> Right, that looks like an interesting way of looking at
>> record values.
>>
>> Since it's like a stopping time, the number of the simulation,
>> I suppose it could be denoted S({X_i}_{i=1, ... oo}, 20)
>> or S(X_{BOLD_overligned}, 20).
>>
>> Electricity should go off tomorrow morning for repair.
>>
>> E(log(S_20)) ~= 20 - gamma
>> should be very good.

>
> In the literature, a remarkable article, which may have
> appeared in the Am. Math. Monthly, can be found by
> searching for:
> Breaking Records and Breaking Boards. Ned Glick
>
> The author is Ned Glick, at some time at UBC in Canada.
>
> David Bernier

I did long simulations for 12th Record-Breaking Values.

With MatLab, I constructed a histogram of the natural
logarithms of the 76,000 values:

< http://img521.imageshack.us/img521/7702/records12log.jpg > .

The mean should be close to 12 - EulerGamma:
http://en.wikipedia.org/wiki/Euler%E2%80%93Mascheroni_constant

David Bernier
--
\$apr1\$LJgyupye\$GZQc9jyvrdP50vW77sYvz1

Date Subject Author
2/27/13 David Bernier
2/27/13 David Bernier
2/27/13 David Bernier
2/27/13 James Waldby
2/27/13 David Bernier
3/1/13 David Bernier
3/1/13 David Bernier
3/10/13 David Bernier
3/10/13 David Bernier
3/11/13 James Waldby
3/11/13 David Bernier
4/13/13 David Bernier
4/14/13 David Bernier
4/14/13 David Petry