On Sat, 25 May 2013 00:21:25 +0200, Cristiano <cristiapi@NSgmail.com> wrote:
>On 24/05/2013 21:32, Rich Ulrich wrote:
>> On Fri, 24 May 2013 19:39:15 +0200, Cristiano <cristiapi@NSgmail.com>
>> wrote:
>>
>>> I calculate the skewness and the kurtosis from a set of real numbers
>>> (distribution unknown) using the formulas:
>>>
>>> http://mvpprograms.com/help/mvpstats/distributions/SkewnessCriticalValues
>>>
>>> http://mvpprograms.com/help/mvpstats/distributions/KurtosisCriticalValues
>>>
>>> I usually need to check whether the calculated skewness and kurtosis are
>>> in good agreement with the expected values for a normal or uniform
>>> distribution; I need a p-value.
>>>
>>> I'm trying to replicate (via simulation) the p-values (alpha) presented
>>> in that site, but I get different values. For example, for n= 7 and
>>> alpha= 0.1, for the skewness I get 1.169 instead of 1.307.
>>>
>>> For the skewness I do the following:
>>> 1) generate a random number x_i in N(0,1)
>>> 2) if x_i < 0 discard the number
>>> 3) for n= 7 I do the above steps until i = 1428571
>>> 4) calculate the 95th percentile (for alpha= 0.1) of the x's.
>>>
>>> Does anybody know where I could be wrong?
>>
>> My tentative guess is that you cut-and-paste'd your
>> steps from some wrong source.
>
>I wrote a C++ working program; I "extracted" the steps from there.
>
>> Discarding negative numbers has nothing to do with
>> computing skewness, so far as I can imagine.
>
>The steps are a bit inaccurate.
>I meant that I discard the skewness < 0.
>
>> Somewhere in the steps, you should "compute skewness."
>>
>> 1) Draw 7; compute skewness; save.
>> 2) Repeat 100,000 times.
>> 3) Show 5% and 95% points (should be nearly the same absolute values).
>> 4) Repeat 10 times.
>
>Yes, I do that, but to be more precise:
>1) Draw 7; compute skewness;
>2) if skewness < 0 discard the value, else save.
Depending on what you mean by "discard," this might introduce some unknown bias. Do you keep the count? There will never be *exactly* 50% of the sample with skewness less than 0.
>3) Repeat 100,000 times.
>4) Show 95% points.
>5) Repeat until the confidence limit is good.
"good"? Mostly, I haven't seen formal statements for how the precision was computed in similar MC studies. Often, people show enough parallel results that the technical error is apparent, but I like to look at the actual limits when I do the work.
>
>The reason to discard skewness < 0 is that I need to calculate only a
>critical value for the skewness (the distribution must be exactly
>symmetrical); if I get 5th percentile = -0.123 and 95th percentile =
>.124, which critical value should I take?
As you say, the distribution *ought* to be exactly symmetrical.
The lower limit provides a second value based on 100,000 replications. (1) Why ignore it? (2) If there were some bias in your RNG that these computations brought out, it would be important to know it. (3) When you compute 10 or 20 cut-offs, you can compute a pragmatic standard error, to go along with the theoretical one (based on ranks around the 5% cutoff).
Back when computers were 1000 times slower than today, I was reading some computer science literature. Cutting an eight-hour Monte Carlo job in half would have been a worthwhile benefit of using both ends of the distribution, even without the cross-check on validity (from actually *looking* at both values). No reader would have complained.
That all being said -- I don't know why your results don't agree with the page you cite. Before I looked at the page, I wondered about potential differences in definitions of "skewness".
However, they seem very explicit about what is being computed. You *would* get slightly different results if you don't compute the moments around the observed mean of each set, but instead assume a mean of zero.