Topic: Kolmogorov-Smirnov-Lilliefors Test statistics
Replies: 10   Last Post: Jun 8, 2013 7:39 PM

Luis A. Afonso Posts: 4,758 From: Lisbon (Portugal) Registered: 2/16/05
Re: Kolmogorov-Smirnov-Lilliefors Test statistics
Posted: Sep 18, 2012 2:58 PM

The subject is goodness-of-fit tests.

An intuitive rule: the larger the sample size, the more confident we can be that we identify the distribution correctly. It is not hard to see that a sample with a large number of items carries all, or nearly all, of the information about the features of the source distribution.

Power (%) against Fisk samples (1,000,000 samples)
Fisk(0, 1, 2): see p. A-37 of Regress+: A Compendium of Common Probability Distributions,
www.causascientia.org/math_stat/Dists/Compendium.pdf

          Lilliefors test      Second maximum
  n         5%       1%          5%       1%
  5        19.5      9.1        20.0      9.4
 10        43.4     26.3        46.7     30.0
 15        60.7     42.7        64.5     46.6
 20        74.0     57.4        76.8     60.5
 25        83.3     68.7        85.1     71.2
 30        89.1     77.5        90.5     79.6
 35        93.1     84.1        94.1     85.7
 40        95.8     89.0        96.4     90.3
 45        97.5     92.7        97.8     93.4
 50        98.5     94.9        98.7     95.7

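The main conclusion is that the second-largest difference is at least as powerful as the Lilliefors (Kolmogorov-Smirnov) test statistic: its power is the higher of the two at every sample size and at both significance levels in the table. It is even reasonable to suppose that it could be preferable when we suspect that an outlier is present.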
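For readers without a BASIC interpreter, here is a minimal Python sketch of the comparison, not the program that produced the table. It assumes the statistic is computed as in the listing in this post: standardize the sample with the n - 1 standard deviation, take at each ordered value the larger of the two gaps between the normal CDF and the empirical steps, then keep the largest and the second-largest gap. The function names are my own, math.erf stands in for the series used in the listing, and the n = 10, 5% critical values (.2616 and .2123) are copied from the DATA lines of the listing.

```python
import math
import random


def normal_cdf(z):
    # Standard normal CDF via the error function (swapped in for the series).
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def two_largest_differences(sample):
    """Return (largest, second largest) local Lilliefors differences."""
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in sample) / (n - 1))
    z = sorted((v - mean) / sd for v in sample)
    diffs = []
    for i, zi in enumerate(z, start=1):
        phi = normal_cdf(zi)
        # Gap to the empirical step just below and just above the i-th value.
        diffs.append(max(abs(phi - (i - 1) / n), abs(i / n - phi)))
    diffs.sort(reverse=True)
    return diffs[0], diffs[1]


def fisk_draw(rng):
    # Inverse-CDF draw for shape 2, scale 1: sqrt(h / (1 - h)),
    # which equals 1 / sqrt(1 / h - 1) for 0 < h < 1.
    h = rng.random()
    return math.sqrt(h / (1.0 - h))


def estimate_power(n, crit_largest, crit_second, reps, seed=12345):
    """Fraction of Fisk samples rejected by each statistic."""
    rng = random.Random(seed)
    hits_largest = 0
    hits_second = 0
    for _ in range(reps):
        sample = [fisk_draw(rng) for _ in range(n)]
        largest, second = two_largest_differences(sample)
        hits_largest += largest > crit_largest
        hits_second += second > crit_second
    return hits_largest / reps, hits_second / reps


p_largest, p_second = estimate_power(10, 0.2616, 0.2123, reps=2000)
print("n=10, 5%: Lilliefors", p_largest, " second maximum", p_second)
```

If the sketch matches the listing, the two frequencies should fall near the n = 10 row of the table, up to the simulation noise of a 2000-sample run.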

The Fisk distribution, also called the log-logistic, is also described at:
en.wikipedia.org/wiki/Log-logistic_distribution
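A short Python sketch of how such samples can be drawn by inverting the CDF. I read Fisk(0, 1, 2) as location 0, scale 1, shape 2, which is what the line x = 1 / SQR(1 / h - 1) in the listing implies. With F(x) = 1 / (1 + (x / scale)^(-shape)), solving F(x) = h gives x = scale * (h / (1 - h))^(1 / shape). The function names are illustrative only.

```python
import random


def fisk_cdf(x, scale=1.0, shape=2.0):
    # Log-logistic CDF with location 0; zero for non-positive x.
    if x <= 0.0:
        return 0.0
    return 1.0 / (1.0 + (x / scale) ** (-shape))


def fisk_quantile(h, scale=1.0, shape=2.0):
    # Inverse of fisk_cdf for 0 <= h < 1.
    return scale * (h / (1.0 - h)) ** (1.0 / shape)


def fisk_sample(n, rng, scale=1.0, shape=2.0):
    # rng.random() returns values in [0, 1), so 1 - h never reaches zero.
    return [fisk_quantile(rng.random(), scale, shape) for _ in range(n)]


rng = random.Random(2012)
print(fisk_sample(5, rng))
```

With shape 2 and scale 1 the median is 1, since fisk_quantile(0.5) = 1.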

Luis A. Afonso

REM "FIGO"
CLS
PRINT " ************** FIGO "
DEFDBL A-Z
PRINT " 1st maximum and 2nd KOLMOGOROV - SMIRNOV -";
PRINT "LILLIEFORS test statistics "

REM CRITICAL VALUES, ONE DATA LINE PER SAMPLE SIZE, IN THE ORDER:
REM 5% LARGEST, 5% 2nd, 1% LARGEST, 1% 2nd
DATA .3427,.2498,.3959,.2767 : REM n=5
DATA .2616,.2123,.3037,.2475 : REM n=10
DATA .2196,.1866,.2545,.2191 : REM n=15
DATA .1920,.1682,.2226,.1975 : REM n=20
DATA .1726,.1543,.2010,.1815 : REM n=25
DATA .1590,.1434,.1848,.1686 : REM n=30
DATA .1478,.1346,.1720,.1582 : REM n=35
DATA .1386,.1272,.1616,.1495 : REM n=40
DATA .1309,.1211,.1525,.1422 : REM n=45
DATA .1246,.1155,.1457,.1355 : REM n=50
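REM READ THE CRITICAL VALUES: X --> LARGEST, Y --> THE SUB-MAXIMUM (2nd)
DIM X95(10), Y95(10), X99(10), Y99(10)
FOR k = 1 TO 10
READ X95(k), Y95(k), X99(k), Y99(k)
NEXT k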
PRINT : PRINT
COLOR 12
pi = 4 * ATN(1): c = 1 / SQR(2 * pi)
INPUT " n (SAMPLE SIZE= 5,10,...,45, 50) = "; N
INPUT " HOW MANY SAMPLES = "; ali
DIM x(N), xx(N), F(N), Y(N), DIFF(N)
DIM max(8001), max2(8001)
DEF fng (z, j) = -.5 * z ^ 2 * (2 * j + 1) / ((j + 1) * (2 * j + 3))
F(0) = 0
FOR ji = 0 TO N: F(ji) = ji / N: NEXT ji
REM SEED THE GENERATOR ONCE, BEFORE THE SAMPLING LOOP
RANDOMIZE TIMER
REM SAMPLE :
FOR SAMPLE = 1 TO ali
mmajor = -1: second = mmajor
LOCATE 7, 50: PRINT USING "##########"; ali - SAMPLE
md = 0: sum2 = 0
REM
FOR i = 1 TO N
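REM NORMAL SAMPLES (BOX-MULLER), DISABLED:
REM a = SQR(-2 * LOG(RND))
REM X(i) = a * COS(2 * pi * RND)
REM FISK (0,1,2) SAMPLE BY INVERTING THE CDF F(x) = 1 / (1 + x ^ -2):
5 h = RND
IF h < 1E-20 THEN GOTO 5
u = 1 / h - 1
x = 1 / SQR(u)
x(i) = x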
REM
md = md + x(i) / N
sum2 = sum2 + x(i) * x(i)
NEXT i
sqd = sum2 - N * (md ^ 2): sd = SQR(sqd / (N - 1))
FOR ii = 1 TO N: x(ii) = (x(ii) - md) / sd: NEXT ii
REM ORDERING
FOR ii = 1 TO N: u = x(ii): W = 1
FOR jj = 1 TO N
IF x(jj) < u THEN W = W + 1
NEXT jj: xx(W) = u
NEXT ii
REM "******************"
REM PHI: NORMAL CDF AT EACH ORDERED SAMPLE VALUE, BY POWER SERIES
FOR tt = 1 TO N: z = xx(tt)
IF z >= 0 THEN kw = 0
IF z < 0 THEN kw = 1
zu = ABS(z): s = c * zu: antes = c * zu
FOR j = 0 TO 100000
xx = antes * fng(zu, j)
s = s + xx
antes = xx
IF ABS(xx) < .00005 THEN GOTO 20
NEXT j
20 IF kw = 0 THEN FF = .5 + s
IF kw = 1 THEN FF = .5 - s
b = ABS(FF - F(tt - 1))
BB = ABS(F(tt) - FF)
MAIOR = b
IF BB > b THEN MAIOR = BB
DIFF(tt) = MAIOR
REM
REM local difference= DIFF(tt)
REM
REM
NEXT tt
HIGHER = -1
FOR ii = 1 TO N
IF DIFF(ii) <= HIGHER THEN GOTO 22
HIGHER = DIFF(ii): llocal = ii
22 NEXT ii
REM
REM REMOVE THE MAXIMUM DIFFERENCE SO THAT THE NEXT SCAN FINDS THE 2nd:
DIFF(llocal) = -2
REM
HIGH = -1
FOR i2 = 1 TO N
IF DIFF(i2) <= HIGH THEN GOTO 33
HIGH = DIFF(i2)
33 NEXT i2
X95 = X95(N / 5): X99 = X99(N / 5): REM HIGHEST
Y95 = Y95(N / 5): Y99 = Y99(N / 5): REM SECOND
LOCATE 18, 45: PRINT " CRIT. VALUES "
LOCATE 19, 40: PRINT " 5% 1% "
LOCATE 20, 40: PRINT " LILLIEFORS ";
PRINT USING "##.#### "; X95; X99
LOCATE 21, 40: PRINT " SECOND ";
PRINT USING "##.#### "; Y95; Y99
REM
REM HIGHER=LARGEST HIGH= 2nd
REM
IF HIGHER > X95 THEN X1 = X1 + 1
IF HIGH > Y95 THEN X2 = X2 + 1
IF HIGHER > X99 THEN Y1 = Y1 + 1
IF HIGH > Y99 THEN Y2 = Y2 + 1
REM
NEXT SAMPLE
REM
LOCATE 9, 5: PRINT " FREQUENCIES "
LOCATE 10, 5: PRINT " LILLI 2nd "
LOCATE 11, 5: PRINT " 5% ";
PRINT USING "#.#### "; X1 / ali; X2 / ali
LOCATE 12, 5: PRINT " 1% ";
PRINT USING "#.#### "; Y1 / ali; Y2 / ali
COLOR 7
END
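The PHI step of the listing sums the power series of the normal CDF term by term, each term being the previous one multiplied by the ratio in fng. A Python sketch of that step follows, assuming the same stopping rule (stop once a term falls below .00005), so the result is good to about four decimals, in line with the four-decimal critical values. The function name phi_series is illustrative, and math.erf is used only as a check.

```python
import math


def phi_series(z, tol=5e-5, max_terms=100000):
    # Phi(z) = 1/2 + c * sum_j (-1)^j z^(2j+1) / (j! 2^j (2j+1)), c = 1/sqrt(2 pi).
    c = 1.0 / math.sqrt(2.0 * math.pi)
    zu = abs(z)
    term = c * zu  # the j = 0 term of the series
    total = term
    for j in range(max_terms):
        # Ratio of consecutive terms, as in fng(z, j) of the listing.
        term *= -0.5 * zu * zu * (2 * j + 1) / ((j + 1) * (2 * j + 3))
        total += term
        if abs(term) < tol:
            break
    # The series is odd in z, so negative arguments mirror around 1/2.
    return 0.5 + total if z >= 0 else 0.5 - total


for z in (-2.0, -0.5, 0.0, 1.0, 2.5):
    exact = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    print(z, phi_series(z), exact)
```

The series alternates in sign, so for large |z| the first terms grow before they shrink. For the standardized samples of size 50 or less used here, double precision copes with that.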
