On 25/05/2013 5:50, Rich Ulrich wrote:
>> Yes, I do that, but to be more precise:
>> 1) Draw 7; compute skewness;
>> 2) if skewness < 0 discard the value, else save.
>
> Depending on what you mean by "discard,"
> this might introduce some unknown bias. Do you keep the count?
> There will never be *exactly* 50% of the sample with
> skewness less than 0.
Sure, but where's the problem?
>> The reason to discard skewness < 0 is that I need to calculate only a
>> critical value for the skewness (the distribution must be exactly
>> symmetrical); if I get 5th percentile = -0.123 and 95th percentile =
>> .124, which critical value should I take?
>
> As you say, the distribution *ought* to be exactly symmetrical.
>
> The lower limit provides a second value based on 100,000
> replications. (1) Why ignore it? (2) If there were some bias
> in your RNG that these computations brought out, it would be
> important to know it.
The RNG I use doesn't have any bias. I checked that using properly designed tests and I check the simulation using a properly designed generator.
> (3) When you compute 10 or 20 cut-offs,
> you can compute a pragmatic standard error, to go along with
> the theoretical one (based on ranks around the 5% cutoff).
>
> Back when computers were 1000 times slower than today, I was
> reading some computer science literature. Cutting an eight-hour
> monte carlo job in half would have been a worth-while benefit of
> using both ends of the distribution, even without the cross-check
> on validity (from actually *looking* at both values). No reader
> would have complained.
I don't have any problem with using both tails, but does it make any sense? We already know that the critical values for the 5th and 95th percentiles *must* be exactly the same in absolute value. For example, using both tails I get:

  0.05   -.82306 +/- 2.75e-4
  0.95    .82311 +/- 2.73e-4

(+/- indicates the confidence interval)

The p-value has to come from a 2-sided test; there should be only one critical value. Where's the sense in using -.82306 and .82311?
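One way to get a single critical value while still using every replication is to fold the distribution: if the null distribution is exactly symmetric about zero, the 90th percentile of |skewness| estimates the same quantity as the 95th percentile of the signed values. The sketch below is only an illustration under assumptions: it uses NumPy, takes the skewness to be g1 = m3 / m2^1.5 with moments about the sample mean (the formula on the cited page may differ), and the seed and replication count are arbitrary.

```python
import numpy as np

def sample_skewness(x):
    """Moment skewness m3 / m2**1.5 along the last axis.

    Assumption: moments are taken about the sample mean; the
    formula used on the cited page may be different.
    """
    d = x - x.mean(axis=-1, keepdims=True)
    m2 = (d ** 2).mean(axis=-1)
    m3 = (d ** 3).mean(axis=-1)
    return m3 / m2 ** 1.5

rng = np.random.default_rng(12345)   # arbitrary seed
n, reps = 7, 100_000                 # "Draw 7", 100,000 replications
g1 = sample_skewness(rng.standard_normal((reps, n)))

lower = np.quantile(g1, 0.05)            # lower-tail cutoff
upper = np.quantile(g1, 0.95)            # upper-tail cutoff
folded = np.quantile(np.abs(g1), 0.90)   # single two-sided cutoff

print(lower, upper, folded)
```

Comparing `-lower` and `upper` still gives the cross-check on symmetry, while `folded` is the one value a 2-sided test would use.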
> That all being said -- I don't know why your results don't agree
> with the page you cite. Before I looked at the page, I
> wondered at potential differences in definitions of "skewness".
If it's not too much trouble, you just need to click the links to see that the pages also show the formulas.
> However, they seem very explicit in what is being computed.
> You *would* get slightly different results if you don't compute
> the moments around the observed means for each set (but
> assumed zero).
I calculated the above critical values using mean = 0; when I calculate them using the sample mean, I get:

  0.05   -.81661 +/- 2.79e-4
  0.95    .81637 +/- 2.74e-4

The differences between the two sets are significant, but both are still very far from the values tabulated on that site. How can that be possible? I'm not interested in using the values from the site, but I need to understand whether my simulation works correctly.
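To make the two variants explicit, here is a minimal sketch that computes the same percentile with the moments taken about the sample mean and about an assumed mean of zero. The skewness formula (m3 / m2^1.5), the helper name `skewness`, its `center` parameter, and the seed are assumptions for illustration only; the size of the gap between the two results depends on the exact formula, so this is not expected to reproduce the figures above.

```python
import numpy as np

def skewness(x, center=None):
    """m3 / m2**1.5 along the last axis.

    center=None  -> moments about each sample's own mean
    center=value -> moments about that fixed value (e.g. 0.0)
    Assumption: this moment-ratio formula is illustrative only.
    """
    c = x.mean(axis=-1, keepdims=True) if center is None else center
    d = x - c
    return (d ** 3).mean(axis=-1) / (d ** 2).mean(axis=-1) ** 1.5

rng = np.random.default_rng(2013)            # arbitrary seed
samples = rng.standard_normal((100_000, 7))  # same draws for both variants

q_mean = np.quantile(skewness(samples), 0.95)              # about sample mean
q_zero = np.quantile(skewness(samples, center=0.0), 0.95)  # about zero

print(q_mean, q_zero)
```

Using the same draws for both variants means any difference in the output comes from the centering alone, not from sampling noise between runs.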
If someone can confirm that the following procedure is good, I can stop asking and I can start the simulation:
1) Randomly draw N normally (or uniformly) distributed numbers
2) compute the skewness (or the kurtosis)
2a) [if skewness < 0 discard the value, else save]
3) Repeat many times
4) calculate the p-th percentile of the saved skewness (or kurtosis)
5) Repeat until the confidence interval for the p-th percentile is "good".
[I can calculate when "good" is good.]
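The steps above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the skewness formula (m3 / m2^1.5 about the sample mean), the function names, the batch size, the tolerance `tol` standing in for "good", and the rank-based confidence interval with a normal approximation are all assumptions. Note that after step 2a the saved values cover only the upper half, so p = 0.90 on the saved set corresponds to the 95th percentile of the full distribution if it is symmetric. The fraction of values kept is returned as well, so the count is not lost.

```python
import numpy as np

def sample_skewness(x):
    """Assumed formula: m3 / m2**1.5, moments about the sample mean."""
    d = x - x.mean(axis=-1, keepdims=True)
    return (d ** 3).mean(axis=-1) / (d ** 2).mean(axis=-1) ** 1.5

def quantile_with_ci(values, p, z=1.96):
    """Sample p-quantile plus an approximate 95% CI from order statistics.

    The CI endpoints are the order statistics at ranks
    m*p -/+ z*sqrt(m*p*(1-p)) (normal approximation to the binomial).
    """
    v = np.sort(values)
    m = v.size
    half = z * np.sqrt(m * p * (1.0 - p))
    lo_rank = max(int(np.floor(m * p - half)), 0)
    hi_rank = min(int(np.ceil(m * p + half)), m - 1)
    return float(np.quantile(v, p)), float(v[lo_rank]), float(v[hi_rank])

def simulate_critical_value(n=7, p=0.90, tol=0.01, batch=50_000,
                            max_batches=40, seed=1):
    """Hypothetical driver for steps 1-5; all defaults are illustrative."""
    rng = np.random.default_rng(seed)
    kept = np.empty(0)
    drawn = 0
    for _ in range(max_batches):                      # step 3: repeat
        g = sample_skewness(rng.standard_normal((batch, n)))  # steps 1-2
        drawn += batch
        kept = np.concatenate([kept, g[g >= 0]])      # step 2a: discard < 0
        est, lo, hi = quantile_with_ci(kept, p)       # step 4
        if (hi - lo) / 2 < tol:                       # step 5: CI is "good"
            break
    return est, lo, hi, kept.size / drawn

est, lo, hi, kept_fraction = simulate_critical_value()
print(est, lo, hi, kept_fraction)
```

A `kept_fraction` far from 0.5 would be a warning sign that the simulated distribution is not symmetric, which is the check that discarding values silently would otherwise hide.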
Step 2a: for the kurtosis I need 2 critical values, but for the skewness do I really need 2 critical values?