The Math Forum

Topic: Kolmogorov-Smirnov, Wilcoxon and Kruskal tests
Replies: 14   Last Post: Dec 31, 2012 6:38 PM

Herman Rubin

Re: Kolmogorov-Smirnov, Wilcoxon and Kruskal tests
Posted: Dec 31, 2012 1:38 PM

On 2012-12-30, Rich Ulrich wrote:
> On Sun, 30 Dec 2012 13:12:36 -0800 (PST), wrote:
> [snip, before and after]

>>Making the story short, I am missing a detailed cookbook description of
>>the tests, saying clearly what the assumptions are and what the null and
>>alternative hypotheses are.

> It would be nice if such a thing exists. I don't remember
> ever seeing any collection like that. Maybe someone else
> knows of something.

> I do remember the comment from one of my first stat
> teachers, that you don't understand a test until you know
> why it rejects when its competitors do not, and vice-versa.
> I usually learned those details by looking at several forms of
> the computation formulas for the tests, and noting where
> the difference exists. - You probably need some background
> in statistical estimation theory to have it all make best sense.

> You do not mention the more subtle points that can arise with
> the tests you named.

> The Wilcoxon is exactly the same as the Kruskal-Wallis when the
> latter is applied to two groups: If you see any difference in their
> reported p-values, it will usually be because the two algorithms
> have not made precisely the same (approximate) adjustment for
> the correction for ties. Or else, the two are using different
> algorithms for either the small-sample, exact value, or for the
> large-sample approximation.
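
The two-group equivalence described above can be checked numerically. What follows is a minimal sketch in plain Python, not any package's implementation. It assumes the large-sample normal approximation without a continuity correction for the rank-sum test, and the usual tie correction for both statistics. The function names and the data are made up for illustration. Under those assumptions the Kruskal-Wallis H is the square of the rank-sum z.

```python
from collections import Counter

def midranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def tie_factor(values):
    """1 - sum(t^3 - t) / (N^3 - N), with t running over tie-group sizes."""
    n = len(values)
    ties = sum(t**3 - t for t in Counter(values).values())
    return 1 - ties / (n**3 - n)

def rank_sum_z(x, y):
    """Rank-sum z for the first group, normal approximation, no continuity correction."""
    pooled = list(x) + list(y)
    n1, n2, n = len(x), len(y), len(pooled)
    r1 = sum(midranks(pooled)[:n1])
    var = n1 * n2 * (n + 1) / 12 * tie_factor(pooled)
    return (r1 - n1 * (n + 1) / 2) / var**0.5

def kruskal_h(*groups):
    """Kruskal-Wallis H with the tie correction applied."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = midranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        r = sum(ranks[start:start + len(g)])
        total += r * r / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    return h / tie_factor(pooled)

# Illustrative data with a few ties, within and across groups.
x = [1.1, 2.0, 2.0, 3.5, 4.2, 5.0]
y = [2.0, 3.9, 4.8, 5.0, 6.1, 7.3, 8.0]
z = rank_sum_z(x, y)
h = kruskal_h(x, y)
print(z * z, h)  # the two numbers agree up to rounding error
```

A program that adds a continuity correction, or that switches to exact small-sample tables for one test but not the other, will report slightly different p-values, which is the discrepancy described in the quoted paragraph.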

> The KS test also starts with ranks, but it uses a *single* point
> of extreme difference for its test. And the usual tables do not
> apply exactly when the data features ties. So this test differs
> from the other two because it has a different criterion. With a
> bit more generality, I think we might say that it has a different
> "loss function" for measuring departure from the null.

If there are ties, the KS tables are conservative. That is, the
probability of rejection under the null is somewhat less than the
value for the continuous case. You can see this by "spreading out" the
points of positive probability, which means that there will be
additional points at which to calculate the difference.
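
The "spreading out" argument can be illustrated in the reverse direction: start from continuous data and create ties by grouping. The sketch below is only an illustration under stated assumptions. It uses the two-sample form of the KS statistic, made-up normal samples, and flooring to integers as the way of creating ties. After flooring, the empirical CDFs are compared only at the integer cut points, a subset of the places where the original samples are compared, so the statistic cannot increase.

```python
import math
import random

def ks_two_sample(x, y):
    """Largest absolute gap between the two empirical CDFs."""
    n, m = len(x), len(y)
    d = 0.0
    # The gap can only change at observed values, so checking those suffices.
    for t in sorted(set(x) | set(y)):
        fx = sum(v <= t for v in x) / n
        fy = sum(v <= t for v in y) / m
        d = max(d, abs(fx - fy))
    return d

rng = random.Random(0)  # fixed seed so the run is repeatable
x = [rng.gauss(0.0, 2.0) for _ in range(40)]
y = [rng.gauss(0.5, 2.0) for _ in range(50)]

d_continuous = ks_two_sample(x, y)

# Create ties by grouping every value into its integer bin.
x_tied = [math.floor(v) for v in x]
y_tied = [math.floor(v) for v in y]
d_tied = ks_two_sample(x_tied, y_tied)

print(d_continuous, d_tied)  # d_tied never exceeds d_continuous
```

Because the tied statistic is never larger, comparing it with a critical value tabulated for continuous data rejects no more often than the nominal level, which is the sense of "conservative" used above.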

> For some other tests:

> It is common to see the t-test presented with tests for pooled vs
> separate variances. Can you tell by looking at SDs and Ns which
> test will be "more powerful" for a given comparison? [assumption]

The more degrees of freedom, the more power. However, pooled
variances make the assumption that the variances are equal, which
can be a very strong assumption.
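
To make the degrees-of-freedom comparison concrete, here is a minimal sketch that computes both versions of the two-sample t statistic. It assumes that "separate variances" means the Welch test with the Welch-Satterthwaite approximation for the degrees of freedom, which is a common choice but is not stated in the post. The function names and the data are illustrative.

```python
from statistics import mean, variance

def pooled_t(x, y):
    """Equal-variance t statistic and its degrees of freedom."""
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / df
    t = (mean(x) - mean(y)) / (sp2 * (1 / n1 + 1 / n2)) ** 0.5
    return t, df

def welch_t(x, y):
    """Separate-variance t statistic with Welch-Satterthwaite degrees of freedom."""
    n1, n2 = len(x), len(y)
    a, b = variance(x) / n1, variance(y) / n2
    t = (mean(x) - mean(y)) / (a + b) ** 0.5
    df = (a + b) ** 2 / (a * a / (n1 - 1) + b * b / (n2 - 1))
    return t, df

# Illustrative data: a small tight group and a larger, much noisier group.
x = [4.9, 5.1, 5.0, 5.2, 4.8, 5.0]
y = [3.0, 7.5, 1.2, 9.8, 4.4, 6.1, 2.2, 8.8, 5.5, 0.9, 7.0, 3.6]

t_pooled, df_pooled = pooled_t(x, y)
t_welch, df_welch = welch_t(x, y)
print(t_pooled, df_pooled)
print(t_welch, df_welch)
```

The Welch degrees of freedom always lie between min(n1, n2) - 1 and n1 + n2 - 2, so the pooled test has at least as many degrees of freedom, at the price of the equal-variance assumption.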

There are ways of getting a randomized t-test for the case of
different populations which is valid no matter what the several
variances happen to be; however, it only has the smallest number
of degrees of freedom.
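
The post does not name a method. One classical construction that fits the description is Scheffé's randomized pairing for the two-sample problem with unequal variances, and the sketch below assumes that this is the kind of test meant. The observations of the larger sample are put in random order, each observation of the smaller sample is matched with a particular linear combination of them, and an ordinary one-sample t-test is applied to the resulting differences. Function names and data are illustrative.

```python
import random
from statistics import mean, stdev

def scheffe_differences(x, y, rng):
    """Scheffe-style differences; requires 2 <= len(x) <= len(y)."""
    n1, n2 = len(x), len(y)
    y = list(y)
    rng.shuffle(y)  # the randomization step: random order of the larger sample
    head_sum = sum(y[:n1])
    y_bar = mean(y)
    scale = (n1 / n2) ** 0.5
    return [
        x[i] - scale * y[i] + head_sum / (n1 * n2) ** 0.5 - y_bar
        for i in range(n1)
    ]

def scheffe_t(x, y, rng):
    """t statistic and degrees of freedom, min(n1, n2) - 1."""
    if len(x) > len(y):
        t, df = scheffe_t(y, x, rng)
        return -t, df  # swap roles and restore the sign of mean(x) - mean(y)
    d = scheffe_differences(x, y, rng)
    n1 = len(d)
    return mean(d) / (stdev(d) / n1 ** 0.5), n1 - 1

x = [5.1, 4.8, 5.6, 5.0, 4.7]
y = [3.9, 8.2, 1.5, 6.6, 4.1, 9.0, 2.7, 5.8]

t_stat, df = scheffe_t(x, y, random.Random(1))
print(t_stat, df)
```

The differences are constructed so that their mean is exactly mean(x) - mean(y) and so that, for normal data, they are uncorrelated with a common variance whatever the two population variances are. The cost is that only len(x) - 1 degrees of freedom remain, and the value of the statistic depends on the random ordering.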

> It is common (SPSS, say) to see a contingency table with both the
> Pearson chisquared test and the Likelihood test. Do you know which
> test is more sensitive to which kind of difference? [criterion]

The two tests are asymptotically the same. With reasonably
large samples, there should be little difference.
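
A short numerical check of how close the two statistics are. This is a sketch using the textbook formulas for the Pearson chi-squared statistic and the likelihood-ratio statistic (often written G) for a test of independence. The two tables are made up, one with large counts and one with small counts.

```python
import math

def expected_counts(table):
    """Expected cell counts under independence of rows and columns."""
    row = [sum(r) for r in table]
    col = [sum(c) for c in zip(*table)]
    n = sum(row)
    return [[ri * cj / n for cj in col] for ri in row]

def pearson_chi2(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    exp = expected_counts(table)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, exp)
               for o, e in zip(obs_row, exp_row))

def likelihood_ratio_g(table):
    """2 * sum of observed * log(observed / expected); empty cells contribute 0."""
    exp = expected_counts(table)
    return 2 * sum(o * math.log(o / e)
                   for obs_row, exp_row in zip(table, exp)
                   for o, e in zip(obs_row, exp_row) if o > 0)

large = [[200, 300], [250, 250]]
small = [[3, 7], [8, 2]]

for name, table in (("large", large), ("small", small)):
    x2 = pearson_chi2(table)
    g = likelihood_ratio_g(table)
    print(name, x2, g, abs(x2 - g) / x2)
```

For these two tables the relative gap between the statistics is well under one percent with the large counts and several percent with the small counts, in line with the asymptotic agreement noted above.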

> Can you construct a set of paired data for which the paired
> t-test is less powerful than the separate-groups t-test?
> [assumption]
> (Is it ever fair to ignore the knowledge that these data are
> correlated?)

If the correlation is negative, the paired test will be less powerful.
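
A small constructed example of the negative-correlation case. The data are made up so that the two measurements are perfectly negatively correlated, and the functions are the textbook paired and pooled two-sample t statistics, written out in plain Python as a sketch.

```python
from statistics import mean, variance

def paired_t(x, y):
    """One-sample t on the within-pair differences y - x."""
    d = [b - a for a, b in zip(x, y)]
    n = len(d)
    return mean(d) / (variance(d) / n) ** 0.5, n - 1

def pooled_t(x, y):
    """Equal-variance two-sample t for mean(y) - mean(x), ignoring the pairing."""
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / df
    return (mean(y) - mean(x)) / (sp2 * (1 / n1 + 1 / n2)) ** 0.5, df

# Pairs (x[i], y[i]): y falls as x rises, so the correlation is -1.
x = [1, 2, 3, 4, 5]
y = [7, 6, 5, 4, 3]

t_paired, df_paired = paired_t(x, y)
t_pooled, df_pooled = pooled_t(x, y)
print(t_paired, df_paired)  # about 1.414 on 4 degrees of freedom
print(t_pooled, df_pooled)  # 2.0 on 8 degrees of freedom
```

With negative correlation the differences are more variable than the separate samples would suggest, so the paired statistic is smaller and also has fewer degrees of freedom. The separate-groups test assumes independent samples, which these data violate, so its larger statistic should not be read as a valid gain.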

> The Spearman and Kendall coefficients for rank-correlation
> do not have the same rejection area. Do you have a reason
> for selecting one over the other? [criterion]

The Kendall coefficient is VERY close to normal for reasonable
sample sizes. The two coefficients are highly correlated; I
seem to recall that the correlation approaches one as the
sample size increases.
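
For reference, both coefficients can be computed in a few lines. This is a minimal sketch that assumes no ties in either variable, so the simple formulas apply; the function names and the data are illustrative.

```python
def ranks(values):
    """1-based ranks; assumes all values are distinct."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for pos, i in enumerate(order, start=1):
        r[i] = pos
    return r

def spearman_rho(x, y):
    """1 - 6 * sum(d^2) / (n (n^2 - 1)), with d the rank differences."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))

def kendall_tau(x, y):
    """(concordant pairs - discordant pairs) / (n (n - 1) / 2)."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant
    return 2 * s / (n * (n - 1))

x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
print(spearman_rho(x, y))  # 0.8
print(kendall_tau(x, y))   # 0.6
```

The two coefficients are on different scales, so their values are not directly comparable even when the tests based on them usually agree.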

This address is for information only. I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Department of Statistics, Purdue University Phone: (765)494-6054 FAX: (765)494-0558
