Search All of the Math Forum:

Views expressed in these public forums are not endorsed by NCTM or The Math Forum.

Topic: problem on record-breaking values in probability
Replies: 14   Last Post: Apr 14, 2013 11:36 PM

 Messages: [ Previous | Next ]
 David Bernier Posts: 3,672 Registered: 12/13/04
Re: problem on record-breaking values in probability
Posted: Apr 13, 2013 11:51 PM

On 03/11/2013 12:32 PM, David Bernier wrote:
> On 03/11/2013 02:07 AM, James Waldby wrote:
>> On Sun, 10 Mar 2013 21:47:27 -0400, David Bernier wrote:
>>> On 03/10/2013 08:52 PM, David Bernier wrote:
>>>> On 03/01/2013 08:41 AM, David Bernier wrote:
>>>>> On 02/27/2013 10:24 PM, David Bernier wrote:
>>>>>> On 02/27/2013 04:05 PM, James Waldby wrote:
>>>>>>> On Wed, 27 Feb 2013 07:10:08 -0500, David Bernier wrote:
>>>>>>>> On 02/27/2013 05:49 AM, David Bernier wrote:
>>>>>>>>> On 02/27/2013 05:31 AM, David Bernier wrote:
>>>>>>>>>> I used Marsaglia's 64-bit SUPER KISS pseudo-random number
>>>>>>>>>> generator
>>>>>>>>>> to simulate uniform r.v.s on [0, 1] that are independent, as
>>>>>>>>>> X_1, X_2, X_3, ad infinitum
>>>>>>>>>>
>>>>>>>>>> For each go, (or sequence) I define its 1st record-breaking value
>>>>>>>>>> as R(1) as X_1, its 2nd record-breaking value R(2) as the
>>>>>>>>>> value taken by X_n for the smallest n with X_n > X_1, and in
>>>>>>>>>> general

>>>>>>> [ R(k+1) is the value taken by X_n for the smallest n with X_n >
>>>>>>> R(k)]
>>>>>>> ...

>> [snip]
>>>>>>> [etc]
>>>>>>>
>>>>>>> It would be useful to report the number of trials each simulation
>>>>>>> took to find its 20th RBV. If a simulation takes m trials, the

>> [snip]
>>>>>>> In following, let L(n) = Pr(n'th item of n is lowest).
>>>>>>> (Distribution
>>>>>>> of the lowest item should be similar to distribution of 1-(highest
>>>>>>> item).) I suppose that L(n) = 1/n and that the expected value of
>>>>>>> the
>>>>>>> number of record-low-values (RLV's) in m trials is sum{i=1 to
>>>>>>> m}(1/i),
>>>>>>> or about H_m, the m'th harmonic number, which can be approximated by
>>>>>>> log(m) + gamma, with gamma = Euler-Mascheroni constant, about
>>>>>>> 0.5772.

>> [snip]
>>>>> In the literature, a remarkable article, which may have
>>>>> appeared in the Am. Math. Monthly, can be found by
>>>>> searching for:
>>>>> Breaking Records and Breaking Boards. Ned Glick

>> ...
>>>> I did long simulations for 12th Record-Breaking Values.
>>>>
>>>> With MatLab, I constructed a histogram of the natural
>>>> logarithms of the 76,000 values:
>>>>
>>>> < http://img521.imageshack.us/img521/7702/records12log.jpg > .

>> ...
>>> S_12 is number of trials (steps) taken to find the 12th
>>> Record-Breaking Value. On Average, log(S_12) is close
>>> to 12 - gamma (gamma is the Euler-Mascheroni constant).
>>>
>>> A number of 76,000 sequences were generated, each being
>>> continued until the 12th Record-Breaking Value for
>>> that sequence was found. There is such variance from
>>> one sample S_12 to another that I prefer the
>>> quantities log(S_12) , for the histograms.
>>>
>>> Occasionally, an unusually high record is attained
>>> in the 1st, 2nd, ... or 11th Record-Breaking Value.
>>> That makes breaking the record all the more difficult.
>>> In the simulations, the computer would pass (say) three
>>> hours or more on the same sequence, with no new output
>>> to the file for three or more hours.

>>
>> You may already have done so, but if not and if you are going to
>> run more simulations, consider (a) profiling the code, and
>> (b) trying different compilation options. (a) allows you to find
>> out which lines of code use most of the hours of CPU time, so you
>> can try alternate ways of coding them. Under (b), starting with
>> the same random seed in each case, try optimization options -O1,
>> -O2, -O3, -Ofast, timing the execution and also verifying the
>> same results. From my interpretation of URL below, those 4
>> optimization options are all you need to try.
>> <http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html>
>>

>
> Maybe I've spent enough computer time on this distribution
>
> I think I found something known about the limiting distribution
> of the k'th record-breaking trial number from the article of
> Ned Glick:
>
> From Ned Glick's (UBC) 1978 article:
>
> N_r is the r'th record-breaking time (serial number
> of the trial). N_1 = 1 always.
>
> "Also Renyi [30] stated a "central limit theorem" for the random
> variables N_r: as r --> oo, the
> distribution of (ln(N_r) - r)/sqrt(r) is
> asymptotically normal with mean = 0 and variance= 1."
> Ned Glick, "Breaking Records and Breaking Boards", 1978
>
>
>
> In Renyi's article,
>
> (log(nu_k) - k)/sqrt(k) ~ N(0,1) as k -> oo
>
> "Theorie des elements saillants d'une suite d'observations",
> 1962.

[...]

Applications to Probability and Statistics"
edited by N. Balakrishnan,

Part II, Applications to Probability Problems,
Chapter 13, "Stirling Numbers and Records",
pp. 189-201 approximately.

In the "classical scheme" of records,
alpha_1 = alpha_2 = ... alpha_n .

Then Prob[xi_n = 1] = 1/n.

xi_n is the indicator function for the n'th r.v.
(n = 1, 2, 3, ... ).
xi_n = 1 if X_n is a record, 0 otherwise.

X_1, X_2, ... are i.i.d. uniform [0, 1] r.v.s (say).

Formulas (13.19) and (13.19) give formulas
for probability distributions related to
record values.

I think using Generalized Stirling numbers
makes it harder to understand.

But "Stirling Numbers and Records" is memorable.

David Bernier

--
Jesus is an Anarchist. -- J.R.

Date Subject Author
2/27/13 David Bernier
2/27/13 David Bernier
2/27/13 David Bernier
2/27/13 James Waldby
2/27/13 David Bernier
3/1/13 David Bernier
3/1/13 David Bernier
3/10/13 David Bernier
3/10/13 David Bernier
3/11/13 James Waldby
3/11/13 David Bernier
4/13/13 David Bernier
4/14/13 David Bernier
4/14/13 David Petry