Date: Mar 11, 2013 12:32 PM Author: David Bernier Subject: Re: problem on record-breaking values in probability On 03/11/2013 02:07 AM, James Waldby wrote:

> On Sun, 10 Mar 2013 21:47:27 -0400, David Bernier wrote:

>> On 03/10/2013 08:52 PM, David Bernier wrote:

>>> On 03/01/2013 08:41 AM, David Bernier wrote:

>>>> On 02/27/2013 10:24 PM, David Bernier wrote:

>>>>> On 02/27/2013 04:05 PM, James Waldby wrote:

>>>>>> On Wed, 27 Feb 2013 07:10:08 -0500, David Bernier wrote:

>>>>>>> On 02/27/2013 05:49 AM, David Bernier wrote:

>>>>>>>> On 02/27/2013 05:31 AM, David Bernier wrote:

>>>>>>>>> I used Marsaglia's 64-bit SUPER KISS pseudo-random number generator

>>>>>>>>> to simulate uniform r.v.s on [0, 1] that are independent, as

>>>>>>>>> X_1, X_2, X_3, ad infinitum

>>>>>>>>>

>>>>>>>>> For each go, (or sequence) I define its 1st record-breaking value

>>>>>>>>> as R(1) as X_1, its 2nd record-breaking value R(2) as the

>>>>>>>>> value taken by X_n for the smallest n with X_n > X_1, and in general

>>>>>> [ R(k+1) is the value taken by X_n for the smallest n with X_n > R(k)]

>>>>>> ...

> [snip]

>>>>>> [etc]

>>>>>>

>>>>>> It would be useful to report the number of trials each simulation

>>>>>> took to find its 20th RBV. If a simulation takes m trials, the

> [snip]

>>>>>> In following, let L(n) = Pr(n'th item of n is lowest). (Distribution

>>>>>> of the lowest item should be similar to distribution of 1-(highest

>>>>>> item).) I suppose that L(n) = 1/n and that the expected value of the

>>>>>> number of record-low-values (RLV's) in m trials is sum{i=1 to m}(1/i),

>>>>>> or about H_m, the m'th harmonic number, which can be approximated by

>>>>>> log(m) + gamma, with gamma = Euler-Mascheroni constant, about 0.5772.

> [snip]

>>>> In the literature, a remarkable article, which may have

>>>> appeared in the Am. Math. Monthly, can be found by

>>>> searching for:

>>>> Breaking Records and Breaking Boards. Ned Glick

> ...

>>> I did long simulations for 12th Record-Breaking Values.

>>>

>>> With MatLab, I constructed a histogram of the natural

>>> logarithms of the 76,000 values:

>>>

>>> < http://img521.imageshack.us/img521/7702/records12log.jpg > .

> ...

>> S_12 is number of trials (steps) taken to find the 12th

>> Record-Breaking Value. On Average, log(S_12) is close

>> to 12 - gamma (gamma is the Euler-Mascheroni constant).

>>

>> A number of 76,000 sequences were generated, each being

>> continued until the 12th Record-Breaking Value for

>> that sequence was found. There is such variance from

>> one sample S_12 to another that I prefer the

>> quantities log(S_12) , for the histograms.

>>

>> Occasionally, an unusually high record is attained

>> in the 1st, 2nd, ... or 11th Record-Breaking Value.

>> That makes breaking the record all the more difficult.

>> In the simulations, the computer would pass (say) three

>> hours or more on the same sequence, with no new output

>> to the file for three or more hours.

>

> You may already have done so, but if not and if you are going to

> run more simulations, consider (a) profiling the code, and

> (b) trying different compilation options. (a) allows you to find

> out which lines of code use most of the hours of CPU time, so you

> can try alternate ways of coding them. Under (b), starting with

> the same random seed in each case, try optimization options -O1,

> -O2, -O3, -Ofast, timing the execution and also verifying the

> same results. From my interpretation of URL below, those 4

> optimization options are all you need to try.

> <http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html>

>

Maybe I've spent enough computer time on this distribution

problem. I'm grateful for your posts in reply.

I think I found something known about the limiting distribution

of the k'th record-breaking trial number from the article of

Ned Glick:

From Ned Glick's (UBC) 1978 article:

N_r is the r'th record-breaking time (serial number

of the trial). N_1 = 1 always.

"Also Renyi [30] stated a "central limit theorem" for the random

variables N_r: as r --> oo, the

distribution of (ln(N_r) - r)/sqrt(r) is

asymptotically normal with mean = 0 and variance= 1."

Ned Glick, "Breaking Records and Breaking Boards", 1978

In Renyi's article,

(log(nu_k) - k)/sqrt(k) ~ N(0,1) as k -> oo

"Theorie des elements saillants d'une suite d'observations",

1962.

David Bernier

--

$apr1$LJgyupye$GZQc9jyvrdP50vW77sYvz1