-----Original Message----- From: email@example.com Sent: Wednesday, July 24, 2013 3:31 AM Newsgroups: sci.stat.math Subject: Kolmogorov–Smirnov / Lilliefors test, small samples
I've been reading up on the Kolmogorov-Smirnov (KS) and Lilliefors (LF) tests. I realize there are other tests, but I'm just trying to understand a subtlety of the KS/LF test from an academic perspective. The test statistic is the maximum difference between the CDFs, and in a typical usage scenario, one of the two CDFs being compared is a reference distribution, often a theoretical and/or hypothesized distribution, while the other is an empirical CDF from a sample (EDF). For small samples, the EDF is staircase shaped, with the left end of each step being the closed end of an interval and the right end being the open end. The thresholds for rejection are tabulated for various significance levels and sample sizes. The LF thresholds are generated from Monte Carlo simulation, and they take into account the fact that the test statistic is smaller when the parameters of the reference distribution are estimated from the data sample.
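To make the staircase shape concrete: the EDF F1(x) = #{x_i <= x} / n jumps by 1/n at each data point and is right-continuous, so each step is closed on the left and open on the right. A minimal sketch (the `edf` helper and the toy sample are my own illustration, not part of any standard library API):

```python
from bisect import bisect_right

def edf(sample):
    """Return the empirical CDF F1(x) = #{x_i <= x} / n as a function.
    The result is a right-continuous staircase: each step is closed
    at its left end (the data point) and open at its right end."""
    xs = sorted(sample)
    n = len(xs)
    return lambda x: bisect_right(xs, x) / n

f1 = edf([0.4, 0.9])
print(f1(0.39), f1(0.4), f1(0.89), f1(0.9))  # 0.0 0.5 0.5 1.0
```

Note how F1 already equals 0.5 at x = 0.4 itself (closed left end) and stays at 0.5 all the way up to, but not including, x = 0.9.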
Whew. OK, that's all I know.
Now for the question. Let's call F0(x) the reference CDF and F1(x) the EDF to be tested against F0(x). Let the difference be deltaCDF(x). Then the test statistic is the max of deltaCDF(x) over x. For small sample sizes, F1(x) has distinct steps. Many tests and visualizations evaluate a metric only at the data points in the sample. If that is done for the KS/LF tests, then deltaCDF(x) is evaluated only at x-values where the sample contains data. That corresponds to the closed (left) end of each staircase step. However, it is possible for deltaCDF(x) to increase toward the right end of each staircase step. So it is possible for the true max[deltaCDF(x)] to exceed a selected threshold without the analyst knowing about it.
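A small numeric example shows the gap. Take F0 = Uniform(0,1), so F0(x) = x, and a hypothetical sorted sample of two points (the function names here are mine, chosen for illustration). Evaluating deltaCDF only at the data points uses F1(x_i) = i/n; the supremum additionally needs the left limits F1(x_i-) = (i-1)/n, which capture the growth of F0(x) - F1(x) along each flat step:

```python
# Illustration with F0 = Uniform(0,1), i.e. F0(x) = x on [0, 1].

def naive_ks(xs):
    # evaluates |F1(x) - F0(x)| only at the observed data points
    # (the closed/left end of each staircase step); xs must be sorted
    n = len(xs)
    return max(abs((i + 1) / n - x) for i, x in enumerate(xs))

def true_ks(xs):
    # sup |F1(x) - F0(x)| over all x: uses both F1(x_i) = i/n
    # and the left limit F1(x_i-) = (i-1)/n at each data point
    n = len(xs)
    d_plus = max((i + 1) / n - x for i, x in enumerate(xs))
    d_minus = max(x - i / n for i, x in enumerate(xs))
    return max(d_plus, d_minus)

xs = sorted([0.4, 0.9])
print(round(naive_ks(xs), 6))  # 0.1  -- looks innocuous
print(round(true_ks(xs), 6))   # 0.4  -- the real sup, approached just left of x = 0.9
```

Here the naive value 0.1 is well under typical rejection thresholds, while the actual supremum 0.4 (approached as x rises toward 0.9 along the flat step at height 0.5) could exceed them.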
Is this actually a problem? I mean, theoretically it seems to be. However, if each tabulated threshold is arrived at by compiling countless cases in which max[deltaCDF(x)] is determined only at x-values in the data sample, then the theory becomes irrelevant.
Well, it is and it isn't a problem. All relevant theoretical and practical works take care of the problem by carefully defining the test statistic being used so that the problem does not arise. Notionally this just involves assessing both F1(x) and F1(x-) at each observed data point in comparison to F0(x), but it is often expressed in more computationally relevant terms. See, for example:
Biometrika Tables for Statisticians, Volume 2, p118
Empirical Processes with Applications to Statistics, by Shorack & Wellner (Wiley).
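In computationally relevant terms, assessing both F1(x) and F1(x-) at each observed point against a continuous F0 reduces to comparing F0 at the order statistics against both i/n and (i-1)/n. A sketch of that standard form (the function names and the erf-based normal CDF helper are my own, not taken from the works cited):

```python
import math

def norm_cdf(x, mu=0.0, sigma=1.0):
    # normal CDF via the error function (stdlib only)
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def ks_statistic(sample, cdf):
    """Two-sided KS statistic D = sup_x |F1(x) - F0(x)|.
    At each order statistic x_(i), compare F0 against both
    F1(x_(i)) = i/n (value at the closed left step end) and
    F1(x_(i)-) = (i-1)/n (limit from the left)."""
    xs = sorted(sample)
    n = len(xs)
    d_plus = max((i + 1) / n - cdf(x) for i, x in enumerate(xs))
    d_minus = max(cdf(x) - i / n for i, x in enumerate(xs))
    return max(d_plus, d_minus)

sample = [-1.2, -0.3, 0.1, 0.5, 1.8]
print(round(ks_statistic(sample, norm_cdf), 4))  # 0.1821
```

Because F0 is monotone, its difference from the flat step is extremal at the step's two ends, so checking the 2n values i/n - F0(x_(i)) and F0(x_(i)) - (i-1)/n recovers the full supremum.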
The point is an important one since, unless dealt with properly, one could end up with different results according to whether or not one chooses to multiply all data by -1 (and change the modelled distribution accordingly). If you look further, you will see that related problems in the definition of test statistics for multivariate distributions arise with respect to the ordering and orientation of the various data axes.
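The sign-flip sensitivity is easy to demonstrate numerically. In this illustration (function names and the toy uniform setup are mine), flipping the data to -x and using the reflected reference CDF G0(x) = 1 - F0(-x) changes the answer for the left-ends-only version but not for the properly defined two-sided statistic:

```python
def ks_left_ends_only(xs, cdf):
    # flawed: evaluates |F1 - F0| only at the closed (left) step ends
    xs = sorted(xs)
    n = len(xs)
    return max(abs((i + 1) / n - cdf(x)) for i, x in enumerate(xs))

def ks_two_sided(xs, cdf):
    # proper: uses both F1(x_i) = i/n and the left limit (i-1)/n
    xs = sorted(xs)
    n = len(xs)
    return max(max((i + 1) / n - cdf(x), cdf(x) - i / n)
               for i, x in enumerate(xs))

f0 = lambda x: min(max(x, 0.0), 1.0)  # Uniform(0,1) CDF
g0 = lambda x: 1.0 - f0(-x)           # CDF of the reflected (negated) distribution

data = [0.4, 0.9]
flipped = [-x for x in data]

print(round(ks_left_ends_only(data, f0), 6),
      round(ks_left_ends_only(flipped, g0), 6))  # 0.1 0.4  -- not invariant
print(round(ks_two_sided(data, f0), 6),
      round(ks_two_sided(flipped, g0), 6))       # 0.4 0.4  -- invariant
```

Negating the data turns each step's open right end into a closed left end, so any definition that privileges one end of the step cannot be invariant under the flip.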