Luis A. Afonso
Posts: 4,758
From: Lisbon (Portugal)
Registered: 2/16/05


Testing normality by Skewness and Kurtosis: a new focusing
Posted: Oct 24, 2013 12:51 PM


___0___Preliminaries. Aim.
We must agree that the so-called Jarque-Bera test is wrong: it fails its aim of checking normality through the estimation of two parameters, the skewness S and the (excess) kurtosis k. Although the reason is elementary, I have, surprisingly, not found any objection on this point. Because the two parameters are summed to obtain the test statistic, we immediately lose the ability to tell whether each of them, individually, is or is not well fitted to normality, the Null Hypothesis. The only complaint we found concerns the weak power of the test, but the main reason for that feature was not disclosed. The problem can be solved by keeping the two parameters at work but evaluating each one individually against the Null Hypothesis, so as not to allow a trade-off between them, as the additive procedure does.
The Jarque-Bera falsity can be expressed easily. Suppose that S and k are such that the sum of the two terms, U'' + V'', is exactly equal to the JB critical value, denoted JBcrit. Then, for any x that keeps both terms non-negative, we always have (U'' + x) + (V'' - x) = JBcrit, a non-rejection condition, no matter how odd the shifted parameters are with respect to normality.
___1___The new paradigm
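Before the procedure itself, the trade-off described in the Preliminaries can be checked numerically. The sketch below is in Python and assumes that U'' and V'' are the two terms of the textbook Jarque-Bera statistic, JB = n/6 * (S^2 + k^2/4). The post does not define them, so this reading, and the names jb_terms and shift_terms, are assumptions made for illustration.

```python
import math


def jb_terms(n, skew, excess_kurt):
    """Return (U, V), the skewness and kurtosis terms of the JB statistic.

    Assumes the textbook form JB = n/6 * (S^2 + k^2/4) = U + V.
    """
    u = n * skew ** 2 / 6.0
    v = n * excess_kurt ** 2 / 24.0
    return u, v


def shift_terms(n, skew, excess_kurt, x):
    """Return (S', k') whose terms are U + x and V - x, leaving JB unchanged.

    Both returned values are taken as positive square roots.
    """
    u, v = jb_terms(n, skew, excess_kurt)
    if u + x < 0 or v - x < 0:
        raise ValueError("x must keep both terms non-negative")
    new_skew = math.sqrt(6.0 * (u + x) / n)
    new_kurt = math.sqrt(24.0 * (v - x) / n)
    return new_skew, new_kurt


# Hypothetical starting point: n = 10, S = 0.5, k = 1.0, shifted by x = 0.4
n = 10
s, k = 0.5, 1.0
s2, k2 = shift_terms(n, s, k, 0.4)
jb_before = sum(jb_terms(n, s, k))
jb_after = sum(jb_terms(n, s2, k2))
print(f"S={s:.3f} k={k:.3f} JB={jb_before:.4f}")
print(f"S={s2:.3f} k={k2:.3f} JB={jb_after:.4f}")
```

With these inputs the pair moves from (0.5, 1.0) to about (0.7, 0.2) while JB stays at about 0.833, so the statistic alone cannot tell the two pairs apart.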
The procedure begins by obtaining, by simulation, a not-too-tight net of individual critical values for S and k, for example at alpha = 0.095, 0.090, ..., 0.055 (steps of 0.005). These values can also be supplied by sufficiently detailed Tables, in which case the <Skcrit> routine below is not needed. The second stage consists in evaluating how each simulated pair (S, k) is situated, inside or outside, relative to the Confidence Intervals given by the above bounds, and choosing the alpha whose intervals contain approximately 95% of the simulated pairs. These bounds are intrinsically a property of all normal samples of the chosen size.
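The first stage can be sketched as follows. This is a minimal Python illustration, not the <Skcrit> routine itself: it sorts the simulated values and reads off empirical quantiles instead of accumulating a histogram. The function names, the seed, and the number of replications are assumptions chosen for the example. S and k are computed with bias-corrected moment estimators of the kind used in the posted routines.

```python
import math
import random


def skew_kurt(x):
    """Bias-corrected sample skewness S and excess kurtosis k."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    cs = math.sqrt(n * (n - 1)) / (n - 2)
    cn = (n - 1) * (n + 1) / ((n - 2) * (n - 3))
    cnn = (n - 1) * (n - 1) / ((n - 2) * (n - 3))
    return cs * m3 / m2 ** 1.5, cn * m4 / (m2 * m2) - 3 * cnn


def simulate_pairs(n, reps, rng):
    """Draw `reps` normal samples of size n and return their (S, k) pairs."""
    return [skew_kurt([rng.gauss(0.0, 1.0) for _ in range(n)])
            for _ in range(reps)]


def critical_values(values, alpha):
    """Empirical quantiles at levels alpha and 1 - alpha."""
    ordered = sorted(values)
    lo = int(alpha * len(ordered))
    return ordered[lo], ordered[len(ordered) - 1 - lo]


rng = random.Random(2013)  # fixed seed, only so the run is repeatable
pairs = simulate_pairs(10, 20000, rng)
s_values = [p[0] for p in pairs]
k_values = [p[1] for p in pairs]
for alpha in (0.095, 0.070, 0.055):
    s_lo, s_hi = critical_values(s_values, alpha)
    k_lo, k_hi = critical_values(k_values, alpha)
    print(f"alpha={alpha:.3f}  S: [{s_lo:+.3f}, {s_hi:+.3f}]  "
          f"k: [{k_lo:+.3f}, {k_hi:+.3f}]")
```

The printed bounds depend on the random generator, the seed, and the number of replications, so they need not coincide with values obtained from another program.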
An example follows of how to find the optimal 95% confidence intervals for S and k.
Example, size 10 samples:
__n=10____________S____________k______________p____
Alpha = 0.095___+/-.65____-1.01, 1.29_______0.932
Alpha = 0.090___+/-.67____-1.02, 1.34_______0.937
Alpha = 0.085___+/-.69____-1.04, 1.39_______0.938
Alpha = 0.080___+/-.70____-1.05, 1.44_______0.944
Alpha = 0.075___+/-.72____-1.07, 1.50_______0.944
Alpha = 0.070___+/-.74____-1.08, 1.56_______0.9495
Alpha = 0.065___+/-.76____-1.10, 1.63_______0.9544
Alpha = 0.060___+/-.79____-1.11, 1.70_______0.957
Alpha = 0.055___+/-.81____-1.13, 1.78_______0.961
The acceptance intervals are [-0.742, +0.742] for S and [-1.081, 1.560] for k (the Alpha = 0.070 row, whose p = 0.9495 is closest to 0.95), for all n=10 normal samples, see <What>. Real-world samples whose estimated S and k both fall inside these intervals are likely normal. Note that classical NHST deals only with necessary results, never sufficient ones.
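The last two steps of the example can be sketched in Python: picking the row whose p is closest to 0.95, and checking whether the S and k of a given sample both fall inside the resulting intervals. The (alpha, p) pairs and the interval bounds are taken from the n = 10 example; the function names and the two test samples are made up for illustration.

```python
import math

# (alpha, p) rows of the n = 10 example
ROWS = [(0.095, 0.932), (0.090, 0.937), (0.085, 0.938), (0.080, 0.944),
        (0.075, 0.944), (0.070, 0.9495), (0.065, 0.9544), (0.060, 0.957),
        (0.055, 0.961)]

# Acceptance intervals quoted for n = 10
S_BOUNDS = (-0.742, 0.742)
K_BOUNDS = (-1.081, 1.560)


def best_alpha(rows, target=0.95):
    """Return the alpha whose p is closest to the target coverage."""
    return min(rows, key=lambda row: abs(row[1] - target))[0]


def skew_kurt(x):
    """Bias-corrected sample skewness S and excess kurtosis k."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    cs = math.sqrt(n * (n - 1)) / (n - 2)
    cn = (n - 1) * (n + 1) / ((n - 2) * (n - 3))
    cnn = (n - 1) * (n - 1) / ((n - 2) * (n - 3))
    return cs * m3 / m2 ** 1.5, cn * m4 / (m2 * m2) - 3 * cnn


def likely_normal(x):
    """True when both S and k fall inside the n = 10 acceptance intervals."""
    if len(x) != 10:
        raise ValueError("these bounds were obtained for n = 10 only")
    s, k = skew_kurt(x)
    return (S_BOUNDS[0] <= s <= S_BOUNDS[1]
            and K_BOUNDS[0] <= k <= K_BOUNDS[1])


# Two hypothetical samples: one symmetric, one with a single large outlier
balanced = [-1.5, -1.0, -0.6, -0.3, -0.1, 0.1, 0.3, 0.6, 1.0, 1.5]
one_outlier = [0.0] * 9 + [10.0]
print(best_alpha(ROWS))
print(likely_normal(balanced), likely_normal(one_outlier))
```

Here the alpha = 0.070 row is selected, the symmetric sample is accepted, and the sample with a single large outlier is rejected.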
Luis A. Afonso
REM "SKcrit": simulated critical values of S and k for normal samples
CLS
PRINT : PRINT "_______Skcrit (critical values)_______"
DEFDBL A-Z
RANDOMIZE TIMER
pi = 4 * ATN(1)
INPUT " size = "; n
INPUT " all = "; all
REM bias-correction factors for skewness (cs) and excess kurtosis (cn, cnn)
cs = 1 / (n - 2) * SQR(n * (n - 1))
cn = ((n - 1) * (n + 1)) / ((n - 2) * (n - 3))
cnn = ((n - 1) * (n - 1)) / ((n - 2) * (n - 3))
REM histograms of S and k on a 0.001 grid starting at -4
DIM scv(8001), kcv(8000)
DIM X(n)
FOR j = 1 TO all
  LOCATE 4, 50: PRINT USING "########"; all - j
  m = 0: m(2) = 0: m(3) = 0: m(4) = 0
  REM Box-Muller normal deviates; 1 - RND avoids LOG(0)
  FOR i = 1 TO n
    aa = SQR(-2 * LOG(1 - RND))
    X(i) = aa * COS(2 * pi * RND)
    m = m + X(i) / n
  NEXT i
  REM central moments of order 2 to 4
  FOR k = 2 TO 4
    FOR i = 1 TO n: d = X(i) - m
      m(k) = m(k) + d ^ k / n
  NEXT i: NEXT k
  S0 = m(3) / ((m(2) ^ 1.5))
  s = cs * S0
  k = cn * m(4) / (m(2) * m(2)) - 3 * cnn
  suv = INT(1000 * (s + 4) + .5)
  IF suv > 8000 THEN suv = 8000
  kuv = INT(1000 * (k + 4) + .5)
  IF kuv > 8000 THEN kuv = 8000
  scv(suv) = scv(suv) + 1 / all
  kcv(kuv) = kcv(kuv) + 1 / all
NEXT j
PRINT " ";
PRINT " skewness excess kurtosis "
REM cumulative levels: alpha for the lower bound, 1 - alpha for the upper bound
c(1, 1) = .095: c(1, 2) = 1 - c(1, 1)
c(2, 1) = .09: c(2, 2) = 1 - c(2, 1)
c(3, 1) = .085: c(3, 2) = 1 - c(3, 1)
c(4, 1) = .08: c(4, 2) = 1 - c(4, 1)
c(5, 1) = .075: c(5, 2) = 1 - c(5, 1)
c(6, 1) = .07: c(6, 2) = 1 - c(6, 1)
c(7, 1) = .065: c(7, 2) = 1 - c(7, 1)
c(8, 1) = .06: c(8, 2) = 1 - c(8, 1)
c(9, 1) = .055: c(9, 2) = 1 - c(9, 1)
FOR ju = 1 TO 9
  COLOR ju + 1
  REM skewness: accumulate the histogram until the level is passed
  FOR kk = 1 TO 2
    s = 0
    FOR t = 0 TO 8000
      s = s + scv(t)
      IF s > c(ju, kk) THEN EXIT FOR
    NEXT t
    LOCATE 3 + 2 * ju + kk, 10
    a(ju, kk) = t / 1000 - 4
    PRINT USING "##.### .### "; a(ju, kk); s;
  NEXT kk
  REM excess kurtosis: same search on its histogram
  FOR kk = 1 TO 2
    s = 0
    FOR t = 0 TO 8000
      s = s + kcv(t)
      IF s > c(ju, kk) THEN EXIT FOR
    NEXT t
    LOCATE 3 + 2 * ju + kk, 30
    PRINT USING "##.### .### "; t / 1000 - 4; s;
  NEXT kk
NEXT ju
COLOR 7
END
REM "WHAT": share of simulated normal samples kept by the S and k intervals
CLS
PRINT : PRINT "_______WHAT_______"
DEFDBL A-Z
RANDOMIZE TIMER
pi = 4 * ATN(1)
INPUT " size = "; n
INPUT " CI S "; bound1, bound2
INPUT " k "; bounty1, bounty2
INPUT " all = "; all
REM bias-correction factors for skewness (cs) and excess kurtosis (cn, cnn)
cs = 1 / (n - 2) * SQR(n * (n - 1))
cn = ((n - 1) * (n + 1)) / ((n - 2) * (n - 3))
cnn = ((n - 1) * (n - 1)) / ((n - 2) * (n - 3))
DIM X(n)
bothout = 0
FOR j = 1 TO all
  LOCATE 10, 50: PRINT USING "########"; all - j
  m = 0: m(2) = 0: m(3) = 0: m(4) = 0
  REM Box-Muller normal deviates; 1 - RND avoids LOG(0)
  FOR i = 1 TO n
    aa = SQR(-2 * LOG(1 - RND))
    X(i) = aa * COS(2 * pi * RND)
    m = m + X(i) / n
  NEXT i
  REM central moments of order 2 to 4
  FOR k = 2 TO 4
    FOR i = 1 TO n: d = X(i) - m
      m(k) = m(k) + d ^ k / n
  NEXT i: NEXT k
  S0 = m(3) / ((m(2) ^ 1.5))
  s = cs * S0
  k = cn * m(4) / (m(2) * m(2)) - 3 * cnn
  REM pinn counts how many of the two parameters fall outside their interval
  pinn = 0
  IF s < bound1 OR s > bound2 THEN pinn = pinn + 1
  IF k < bounty1 OR k > bounty2 THEN pinn = pinn + 1
  REM a sample is tallied as outside only when both S and k are outside
  IF pinn = 2 THEN bothout = bothout + 1 / all
NEXT j
PRINT USING "inside = #.####"; 1 - bothout
END



