Topic: Skewness and kurtosis p-values
Replies: 11   Last Post: May 28, 2013 6:50 AM

 Richard Ulrich Posts: 2,961 Registered: 12/13/04
Re: Skewness and kurtosis p-values
Posted: May 24, 2013 11:50 PM

On Sat, 25 May 2013 00:21:25 +0200, Cristiano <cristiapi@NSgmail.com>
wrote:

>On 24/05/2013 21:32, Rich Ulrich wrote:
>> On Fri, 24 May 2013 19:39:15 +0200, Cristiano <cristiapi@NSgmail.com>
>> wrote:
>>

>>> I calculate the skewness and the kurtosis from a set of real numbers
>>> (distribution unknown) using the formulas:
>>>
>>> http://mvpprograms.com/help/mvpstats/distributions/SkewnessCriticalValues
>>>
>>> http://mvpprograms.com/help/mvpstats/distributions/KurtosisCriticalValues
>>>
>>> I usually need to check whether the calculated skewness and kurtosis are
>>> in good agreement with the expected values for a normal or uniform
>>> distribution; I need a p-value.
>>>
>>> I'm trying to replicate (via simulation) the p-values (alpha) presented
>>> in that site, but I get different values. For example, for n= 7 and
>>> alpha= 0.1, for the skewness I get 1.169 instead of 1.307.
>>>
>>> For the skewness I do the following:
>>> 1) generate a random number x_i in N(0,1)
>>> 2) if x_i < 0 discard the number
>>> 3) for n= 7 I do the above steps until i = 1428571
>>> 4) calculate the 95th percentile (for alpha= 0.1) of the x's.
>>>
>>> Does anybody know where I could be wrong?

>>
>> My tentative guess is that you cut-and-paste'd your
>> steps from some wrong source.

>
>I wrote a C++ working program; I "extracted" the steps from there.
>

>> Discarding negative numbers has nothing to do with
>> computing skewness, so far as I can imagine.

>
>The steps are a bit inaccurate.
>I meant that I discard the skewness < 0.
>

>> Somewhere in the steps, you should "compute skewness."
>>
>> 1) Draw 7; compute skewness; save.
>> 2) Repeat 100,000 times.
>> 3) Show 5% and 95% points (should be nearly the same absolute values).
>> 4) Repeat 10 times.

>
>Yes, I do that, but to be more precise:
>1) Draw 7; compute skewness;
>2) if skewness < 0 discard the value, else save.

Depending on what you mean by "discard," this might
introduce some unknown bias. Do you keep the count?
There will never be *exactly* 50% of the sample with
skewness less than 0.
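
To make the "keep the count" point concrete, here is a minimal Python sketch. It assumes the moment-based skewness g1 = m3 / m2^1.5 (which may not be the formula on the cited page), and the names `simulate`, `quantile` and `summarize` are only illustrative. It reports the fraction of replications with negative skewness and compares the 95% point of the kept (non-negative) values with the 95% and 97.5% points of the full set. Which of those corresponds to the alpha convention of the cited table is not something this sketch can settle.

```python
import random


def skewness(xs):
    """Moment-based skewness g1, moments taken about the sample mean.

    Assumption: g1 = m3 / m2**1.5; the cited page may define it differently.
    """
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


def simulate(n=7, reps=20_000, seed=7):
    """Skewness of `reps` independent samples of size n from N(0,1)."""
    rng = random.Random(seed)
    return [skewness([rng.gauss(0.0, 1.0) for _ in range(n)])
            for _ in range(reps)]


def quantile(values, p):
    """Simple empirical quantile (order statistic, no interpolation)."""
    ordered = sorted(values)
    return ordered[min(len(ordered) - 1, int(len(ordered) * p))]


def summarize(skews):
    kept = [s for s in skews if s >= 0]  # the "discard skewness < 0" rule
    return {
        "fraction_negative": 1 - len(kept) / len(skews),
        "q95_kept": quantile(kept, 0.95),
        "q95_all": quantile(skews, 0.95),
        "q975_all": quantile(skews, 0.975),
    }


if __name__ == "__main__":
    for name, value in summarize(simulate()).items():
        print(f"{name:>18}: {value:.4f}")
```

Since roughly half of the replications are dropped, a percentile taken among the kept values is a different percentile of the whole distribution, so the count of discarded values matters when reading off a cutoff.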

>3) Repeat 100,000 times.
>4) Show 95% points.
>5) Repeat until the confidence limit is good.

"good"? Mostly, I haven't seen formal statements for how the
precision was computed in similar MC studies. Often, people
show enough parallel results that the technical error is apparent,
but I like to look at the actual limits when I do the work.

>
>The reason to discard skewness < 0 is that I need to calculate only a
>critical value for the skewness (the distribution must be exactly
>symmetrical); if I get 5th percentile = -0.123 and 95th percentile =
>.124, which critical value should I take?

As you say, the distribution *ought* to be exactly symmetrical.

The lower limit provides a second value based on 100,000
replications. (1) Why ignore it? (2) If there were some bias
in your RNG that these computations brought out, it would be
important to know it. (3) When you compute 10 or 20 cut-offs,
you can compute a pragmatic standard error, to go along with
the theoretical one (based on ranks around the 5% cutoff).
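
A rough Python sketch of that procedure, for illustration only: it draws samples of 7 from N(0,1), saves the skewness, reads off both the 5% and 95% points, and repeats the whole thing for 10 batches so that a pragmatic standard error can be computed from the batch-to-batch spread. The skewness formula (g1 = m3 / m2^1.5), the batch size of 20,000, the seed, and the function names are all assumptions made here, not taken from the cited page or from any program in this thread.

```python
import random
import statistics


def skewness(xs):
    """Moment-based skewness g1, moments taken about the sample mean.

    Assumption: g1 = m3 / m2**1.5; the cited page may define it differently.
    """
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


def tail_points(rng, n=7, reps=20_000, alpha=0.1):
    """Empirical (alpha/2, 1 - alpha/2) points of the simulated skewness."""
    values = sorted(
        skewness([rng.gauss(0.0, 1.0) for _ in range(n)]) for _ in range(reps)
    )
    lo = values[int(reps * alpha / 2)]
    hi = values[int(reps * (1 - alpha / 2)) - 1]
    return lo, hi


def run_batches(batches=10, seed=12345, **kwargs):
    """Repeat the whole simulation `batches` times with one seeded RNG."""
    rng = random.Random(seed)
    return [tail_points(rng, **kwargs) for _ in range(batches)]


def pooled_cutoff(results):
    """Average |lower| and upper point within each batch, then summarize."""
    cuts = [(hi - lo) / 2 for lo, hi in results]
    se = statistics.stdev(cuts) / len(cuts) ** 0.5  # batch-to-batch SE
    return statistics.mean(cuts), se


if __name__ == "__main__":
    results = run_batches()
    for lo, hi in results:
        print(f"5% point {lo:+.4f}   95% point {hi:+.4f}")
    cutoff, se = pooled_cutoff(results)
    print(f"pooled cutoff {cutoff:.4f}, pragmatic SE {se:.4f}")
```

Printing both tails for every batch is the cross-check: if the lower and upper points disagreed by much more than the standard error, that would point to a problem in the generator or in the code.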

Back when computers were 1000 times slower than today, I was
reading some computer science literature. Cutting an eight-hour
Monte Carlo job in half would have been a worthwhile benefit of
using both ends of the distribution, even without the cross-check
on validity (from actually *looking* at both values). No reader
would have complained.

That all being said -- I don't know why your results don't agree
with the page you cite. Before I looked at the page, I
wondered at potential differences in definitions of "skewness".

However, they seem very explicit in what is being computed.
You *would* get slightly different results if you didn't compute
the moments around the observed mean of each set (but
assumed a mean of zero instead).
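
A small Python sketch of that last point, under the same assumed skewness formula (g1 = m3 / m2^1.5, which may not be the one on the cited page). It computes the statistic twice for each simulated sample, once with moments about the sample mean and once with moments about an assumed mean of zero, and returns the upper 95% point of each. The helper names are hypothetical.

```python
import random


def skew_about(xs, center):
    """Third standardized moment of xs, with moments taken about `center`."""
    n = len(xs)
    m2 = sum((x - center) ** 2 for x in xs) / n
    m3 = sum((x - center) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


def upper_point(values, p=0.95):
    """Simple empirical upper point (order statistic, no interpolation)."""
    ordered = sorted(values)
    return ordered[int(len(ordered) * p) - 1]


def compare(n=7, reps=20_000, seed=2013):
    """95% points using the sample mean vs. an assumed mean of zero."""
    rng = random.Random(seed)
    about_mean, about_zero = [], []
    for _ in range(reps):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        about_mean.append(skew_about(xs, sum(xs) / n))
        about_zero.append(skew_about(xs, 0.0))
    return upper_point(about_mean), upper_point(about_zero)


if __name__ == "__main__":
    with_mean, with_zero = compare()
    print(f"95% point, moments about sample mean: {with_mean:.4f}")
    print(f"95% point, moments about zero:        {with_zero:.4f}")
```

Both versions see exactly the same samples, so any gap between the two printed cutoffs comes from the choice of center alone.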

--
Rich Ulrich
