Topic: Skewness and kurtosis pvalues
Replies: 11
Last Post: May 28, 2013 6:50 AM




Re: Skewness and kurtosis pvalues
Posted: May 24, 2013 11:50 PM


On Sat, 25 May 2013 00:21:25 +0200, Cristiano <cristiapi@NSgmail.com> wrote:
>On 24/05/2013 21:32, Rich Ulrich wrote:
>> On Fri, 24 May 2013 19:39:15 +0200, Cristiano <cristiapi@NSgmail.com>
>> wrote:
>>
>>> I calculate the skewness and the kurtosis from a set of real numbers
>>> (distribution unknown) using the formulas:
>>>
>>> http://mvpprograms.com/help/mvpstats/distributions/SkewnessCriticalValues
>>>
>>> http://mvpprograms.com/help/mvpstats/distributions/KurtosisCriticalValues
>>>
>>> I usually need to check whether the calculated skewness and kurtosis are
>>> in good agreement with the expected values for a normal or uniform
>>> distribution; I need a pvalue.
>>>
>>> I'm trying to replicate (via simulation) the pvalues (alpha) presented
>>> in that site, but I get different values. For example, for n= 7 and
>>> alpha= 0.1, for the skewness I get 1.169 instead of 1.307.
>>>
>>> For the skewness I do the following:
>>> 1) generate a random number x_i in N(0,1)
>>> 2) if x_i < 0 discard the number
>>> 3) for n= 7 I do the above steps until i = 1428571
>>> 4) calculate the 95th percentile (for alpha= 0.1) of the x's.
>>>
>>> Does anybody know where I could be wrong?
>>
>> My tentative guess is that you cut-and-paste'd your
>> steps from some wrong source.
>
>I wrote a C++ working program; I "extracted" the steps from there.
>
>> Discarding negative numbers has nothing to do with
>> computing skewness, so far as I can imagine.
>
>The steps are a bit inaccurate.
>I meant that I discard the skewness < 0.
>
>> Somewhere in the steps, you should "compute skewness."
>>
>> 1) Draw 7; compute skewness; save.
>> 2) Repeat 100,000 times.
>> 3) Show 5% and 95% points (should be nearly the same absolute values).
>> 3) Repeat 10 times.
>
>Yes, I do that, but to be more precise:
>1) Draw 7; compute skewness;
>2) if skewness < 0 discard the value, else save.
Depending on what you mean by "discard," this might introduce some unknown bias. Do you keep the count? There will never be *exactly* 50% of the sample with skewness less than 0.
>3) Repeat 100,000 times.
>4) Show 95% points.
>5) Repeat until the confidence limit is good.
"good"? Mostly, I haven't seen formal statements for how the precision was computed in similar MC studies. Often, people show enough parallel results that the technical error is apparent, but I like to look at the actual limits when I do the work.
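One way to look at actual limits for a simulated cutoff is to use ranks: the number of replicates falling below the true 95% point is binomial, so order statistics a couple of binomial standard deviations either side of rank N*p bracket it. The following is a minimal Python sketch of that idea, not a definitive implementation. It assumes the plain moment coefficient g1 = m3 / m2^1.5 as the skewness (the cited page may apply a small-sample adjustment), and the function names are made up for illustration.

```python
import math
import random


def skewness(xs):
    """Moment coefficient g1 = m3 / m2**1.5, with moments about the sample mean.

    Assumption: the page cited in the thread may use a small-sample
    adjustment of this quantity, so its table need not match g1.
    """
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


def simulate_skewness(n, reps, rng):
    """Skewness of `reps` independent N(0,1) samples, each of size `n`."""
    return [skewness([rng.gauss(0.0, 1.0) for _ in range(n)])
            for _ in range(reps)]


def quantile_with_rank_limits(values, p, z=1.96):
    """Return (estimate, lower, upper) for the p-th quantile of `values`.

    The limits are order statistics at ranks N*p -/+ z*sqrt(N*p*(1-p)),
    i.e. a normal approximation to the binomial count of replicates
    that fall below the true quantile.
    """
    s = sorted(values)
    count = len(s)
    centre = count * p
    half_width = z * math.sqrt(count * p * (1.0 - p))

    def at_rank(r):
        # Clamp the rank so it always indexes a real order statistic.
        return s[min(max(int(r), 0), count - 1)]

    return (at_rank(centre),
            at_rank(math.floor(centre - half_width)),
            at_rank(math.ceil(centre + half_width)))


if __name__ == "__main__":
    rng = random.Random(12345)  # fixed seed only so the run is repeatable
    skews = simulate_skewness(n=7, reps=100_000, rng=rng)
    est, lo, hi = quantile_with_rank_limits(skews, 0.95)
    print(f"95% point: {est:.4f}  (approx. 95% limits {lo:.4f} .. {hi:.4f})")
```

With limits like these in hand, "repeat until the confidence limit is good" can be given a concrete stopping rule, such as a maximum width for the interval.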
>
>The reason to discard skewness < 0 is that I need to calculate only a
>critical value for the skewness (the distribution must be exactly
>symmetrical); if I get 5th percentile = 0.123 and 95th percentile =
>.124, which critical value should I take?
As you say, the distribution *ought* to be exactly symmetrical.
The lower limit provides a second value based on 100,000 replications. (1) Why ignore it? (2) If there were some bias in your RNG that these computations brought out, it would be important to know it. (3) When you compute 10 or 20 cutoffs, you can compute a pragmatic standard error, to go along with the theoretical one (based on ranks around the 5% cutoff).
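The procedure described here (keep both tails, repeat the whole run several times, and take the spread of the cutoffs as a pragmatic standard error) could be sketched in Python roughly as follows. This is an illustration under assumptions: the skewness is the plain moment coefficient g1, which may differ from the formula on the cited page, the helper names are hypothetical, and the replicate count in the demo is cut from 100,000 to keep a pure-Python run short.

```python
import random
import statistics


def skewness(xs):
    """Moment coefficient g1 = m3 / m2**1.5, moments about the sample mean.
    Assumption: the cited page may use a different (adjusted) definition."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


def tail_points(n, reps, rng, tail=0.05):
    """Return the (lower, upper) cutoffs that each leave `tail` outside."""
    skews = sorted(skewness([rng.gauss(0.0, 1.0) for _ in range(n)])
                   for _ in range(reps))
    k = round(reps * tail)
    # Symmetric ranks: k-th from the bottom and k-th from the top.
    return skews[k], skews[reps - 1 - k]


def batch_summary(n, reps, batches, seed):
    """Repeat the whole simulation `batches` times and summarise both tails."""
    rng = random.Random(seed)
    points = [tail_points(n, reps, rng) for _ in range(batches)]
    lowers = [lo for lo, _ in points]
    uppers = [hi for _, hi in points]
    return {
        "lower_mean": statistics.mean(lowers),
        "upper_mean": statistics.mean(uppers),
        # Spread between batches: a pragmatic standard error for the
        # cutoff obtained from ONE batch of `reps` replicates.
        "lower_sd": statistics.stdev(lowers),
        "upper_sd": statistics.stdev(uppers),
    }


if __name__ == "__main__":
    # reps is reduced from 100,000 so the demo finishes quickly.
    result = batch_summary(n=7, reps=20_000, batches=10, seed=2013)
    for name, value in result.items():
        print(f"{name:>10}: {value: .4f}")
```

If the lower and upper means differ in absolute value by much more than the between-batch spread suggests, that is the kind of discrepancy (in the generator or in the program) that looking at only one tail would hide.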
Back when computers were 1000 times slower than today, I was reading some computer science literature. Cutting an eight-hour Monte Carlo job in half would have been a worthwhile benefit of using both ends of the distribution, even without the cross-check on validity (from actually *looking* at both values). No reader would have complained.
That all being said -- I don't know why your results don't agree with the page you cite. Before I looked at the page, I wondered at potential differences in definitions of "skewness".
However, they seem very explicit about what is being computed. You *would* get slightly different results if you did not compute the moments about the observed mean of each set (but assumed a mean of zero instead).
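The effect of centering can be checked directly by computing both versions on the same simulated samples. The sketch below is a rough Python illustration, assuming the moment coefficient g1 as the skewness (which may not be the exact formula on the cited page); the `assume_zero_mean` switch and the function names are invented for this example.

```python
import random


def skewness(xs, assume_zero_mean=False):
    """Moment coefficient m3 / m2**1.5.

    By default the moments are taken about the observed sample mean.
    With assume_zero_mean=True they are taken about 0 instead, which is
    the shortcut being warned against.
    """
    n = len(xs)
    center = 0.0 if assume_zero_mean else sum(xs) / n
    m2 = sum((x - center) ** 2 for x in xs) / n
    m3 = sum((x - center) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


def upper_point(values, p=0.95):
    """Simple order-statistic estimate of the p-th quantile."""
    s = sorted(values)
    return s[min(int(len(s) * p), len(s) - 1)]


def compare_cutoffs(n, reps, seed):
    """Return (centered, uncentered) 95% points from the same N(0,1) samples."""
    rng = random.Random(seed)
    centered, uncentered = [], []
    for _ in range(reps):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        centered.append(skewness(xs))
        uncentered.append(skewness(xs, assume_zero_mean=True))
    return upper_point(centered), upper_point(uncentered)


if __name__ == "__main__":
    c, u = compare_cutoffs(n=7, reps=100_000, seed=7)
    print(f"95% point, moments about the sample mean: {c:.4f}")
    print(f"95% point, mean assumed to be zero:       {u:.4f}")
```

Using the same draws for both versions means any gap between the two cutoffs comes from the centering alone, not from simulation noise between separate runs.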
--
Rich Ulrich



