
Re: problem on recordbreaking values in probability
Posted:
Apr 14, 2013 12:17 AM


On 04/13/2013 11:51 PM, David Bernier wrote: > On 03/11/2013 12:32 PM, David Bernier wrote: >> On 03/11/2013 02:07 AM, James Waldby wrote: >>> On Sun, 10 Mar 2013 21:47:27 0400, David Bernier wrote: >>>> On 03/10/2013 08:52 PM, David Bernier wrote: >>>>> On 03/01/2013 08:41 AM, David Bernier wrote: >>>>>> On 02/27/2013 10:24 PM, David Bernier wrote: >>>>>>> On 02/27/2013 04:05 PM, James Waldby wrote: >>>>>>>> On Wed, 27 Feb 2013 07:10:08 0500, David Bernier wrote: >>>>>>>>> On 02/27/2013 05:49 AM, David Bernier wrote: >>>>>>>>>> On 02/27/2013 05:31 AM, David Bernier wrote: >>>>>>>>>>> I used Marsaglia's 64bit SUPER KISS pseudorandom number >>>>>>>>>>> generator >>>>>>>>>>> to simulate uniform r.v.s on [0, 1] that are independent, as >>>>>>>>>>> X_1, X_2, X_3, ad infinitum >>>>>>>>>>> >>>>>>>>>>> For each go, (or sequence) I define its 1st recordbreaking >>>>>>>>>>> value >>>>>>>>>>> as R(1) as X_1, its 2nd recordbreaking value R(2) as the >>>>>>>>>>> value taken by X_n for the smallest n with X_n > X_1, and in >>>>>>>>>>> general >>>>>>>> [ R(k+1) is the value taken by X_n for the smallest n with X_n > >>>>>>>> R(k)] >>>>>>>> ... >>> [snip] >>>>>>>> [etc] >>>>>>>> >>>>>>>> It would be useful to report the number of trials each simulation >>>>>>>> took to find its 20th RBV. If a simulation takes m trials, the >>> [snip] >>>>>>>> In following, let L(n) = Pr(n'th item of n is lowest). >>>>>>>> (Distribution >>>>>>>> of the lowest item should be similar to distribution of 1(highest >>>>>>>> item).) I suppose that L(n) = 1/n and that the expected value of >>>>>>>> the >>>>>>>> number of recordlowvalues (RLV's) in m trials is sum{i=1 to >>>>>>>> m}(1/i), >>>>>>>> or about H_m, the m'th harmonic number, which can be >>>>>>>> approximated by >>>>>>>> log(m) + gamma, with gamma = EulerMascheroni constant, about >>>>>>>> 0.5772. >>> [snip] >>>>>> In the literature, a remarkable article, which may have >>>>>> appeared in the Am. Math. Monthly, can be found by >>>>>> searching for: >>>>>> Breaking Records and Breaking Boards. Ned Glick >>> ... >>>>> I did long simulations for 12th RecordBreaking Values. >>>>> >>>>> With MatLab, I constructed a histogram of the natural >>>>> logarithms of the 76,000 values: >>>>> >>>>> < http://img521.imageshack.us/img521/7702/records12log.jpg > . >>> ... >>>> S_12 is number of trials (steps) taken to find the 12th >>>> RecordBreaking Value. On Average, log(S_12) is close >>>> to 12  gamma (gamma is the EulerMascheroni constant). >>>> >>>> A number of 76,000 sequences were generated, each being >>>> continued until the 12th RecordBreaking Value for >>>> that sequence was found. There is such variance from >>>> one sample S_12 to another that I prefer the >>>> quantities log(S_12) , for the histograms. >>>> >>>> Occasionally, an unusually high record is attained >>>> in the 1st, 2nd, ... or 11th RecordBreaking Value. >>>> That makes breaking the record all the more difficult. >>>> In the simulations, the computer would pass (say) three >>>> hours or more on the same sequence, with no new output >>>> to the file for three or more hours. >>> >>> You may already have done so, but if not and if you are going to >>> run more simulations, consider (a) profiling the code, and >>> (b) trying different compilation options. (a) allows you to find >>> out which lines of code use most of the hours of CPU time, so you >>> can try alternate ways of coding them. Under (b), starting with >>> the same random seed in each case, try optimization options O1, >>> O2, O3, Ofast, timing the execution and also verifying the >>> same results. From my interpretation of URL below, those 4 >>> optimization options are all you need to try. >>> <http://gcc.gnu.org/onlinedocs/gcc/OptimizeOptions.html> >>> >> >> Maybe I've spent enough computer time on this distribution >> problem. I'm grateful for your posts in reply. >> >> I think I found something known about the limiting distribution >> of the k'th recordbreaking trial number from the article of >> Ned Glick: >> >> From Ned Glick's (UBC) 1978 article: >> >> N_r is the r'th recordbreaking time (serial number >> of the trial). N_1 = 1 always. >> >> "Also Renyi [30] stated a "central limit theorem" for the random >> variables N_r: as r > oo, the >> distribution of (ln(N_r)  r)/sqrt(r) is >> asymptotically normal with mean = 0 and variance= 1." >> Ned Glick, "Breaking Records and Breaking Boards", 1978 >> >> >> >> In Renyi's article, >> >> (log(nu_k)  k)/sqrt(k) ~ N(0,1) as k > oo >> >> "Theorie des elements saillants d'une suite d'observations", >> 1962. > [...] > > "Advances in Combinatorial Methods and > Applications to Probability and Statistics" > edited by N. Balakrishnan, > Google Books, > > > http://books.google.ca/books/about/Advances_in_Combinatorial_Methods_and_Ap.html?id=WdJSxINF7VIC > > > Part II, Applications to Probability Problems, > Chapter 13, "Stirling Numbers and Records", > pp. 189201 approximately. > > In the "classical scheme" of records, > alpha_1 = alpha_2 = ... alpha_n . > > Then Prob[xi_n = 1] = 1/n. > > xi_n is the indicator function for the n'th r.v. > (n = 1, 2, 3, ... ). > xi_n = 1 if X_n is a record, 0 otherwise. > > X_1, X_2, ... are i.i.d. uniform [0, 1] r.v.s (say). > > Formulas (13.19) and (13.19) give formulas > for probability distributions related to > record values. > > I think using Generalized Stirling numbers > makes it harder to understand. > > But "Stirling Numbers and Records" is memorable. > > David Bernier >
There is also the book "Records:mathematical theory" by Valery B. Nevzorov.
Probably th Nevzorov who wrote a chapter in "Advances in Combinatorial Methods and Applications to Probability and Statistics"
"Records:mathematical theory":
http://books.google.ca/books/about/Records.html?id=B7ZpJKouceoC&redir_esc=y
dave
 Jesus is an Anarchist.  J.R.

