Date: Jan 28, 2013 2:28 PM
Author: Richard Ulrich
Subject: Re: two-sample nonparametric test on quantiles

On Mon, 28 Jan 2013 10:21:21 -0800 (PST), "Mickey M." <cz3728@gmail.com> wrote:

>I think I have to state my question more precisely:

>

>Suppose we have X1, ..., Xn i.i.d from the distribution F and

>Y1, ..., Ym i.i.d from the distribution G.

>

>[and also X a random variable from F and Y ~ G]

>

>The standard Mann-Whitney test tests the null hypothesis

>H0: F=G

>against

>H1: F different from G

>or against the one-sided hypothesis

>H1': F stochastically greater than G (i.e. P(X>Y) > 0.5)

>

>I would like to know, whether it is possible to prove that Mann-Whitney test

>can be used in fact to test:

>

>H0: P(X>Y) = 0.5

>against

>H1: P(X>Y) \neq 0.5

>or H1': P(X>Y) > 0.5

>

>(i.e. whether it is possible to relax the H0 hypothesis, F and G not necessarily the same under H0)

>

>When I looked at the proof of the M-W test in my textbook, it seemed to me that the assumption F=G (H0) is essential for deriving the distribution of the Mann-Whitney U statistic. But I have seen the second variant of the test somewhere on the internet (without proof)...

>

I think this article by Morten W. Fagerland has an answer to your

question. Bruce Weaver posted this reference not long ago

on the SPSS list.

http://www.biomedcentral.com/1471-2288/12/78

The article shows that, for large samples, the MW test

is highly sensitive to differences in Shape or Variance,

not (only) to the desired difference in Location.

"t-tests, non-parametric tests, and large studies - a paradox of statistical practice?"

[from the discussion]

"Furthermore, if the results from the WMW test are interpreted

strictly according to the test's null hypothesis, Prob(X<Y)=0.5, the

WMW test is an efficient and useful test. For large studies,

however, where the purpose is to compare the means of continuous

variables, the choice of test is easy: the t-test is robust even to

severely skewed data and should be used almost exclusively."
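To make the quoted null hypothesis concrete: the U statistic is a count of (x, y) pairs, so U/(nm) is the sample estimate of Prob(X<Y), with tied pairs counted as one half. Here is a minimal Python sketch of that count. The function names and the toy data are illustrative assumptions, not taken from the article, and the sketch computes only the estimate, not a p-value.

```python
def mann_whitney_u(xs, ys):
    """Count pairs with x < y; a tied pair contributes one half."""
    u = 0.0
    for x in xs:
        for y in ys:
            if x < y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


def prob_x_less_than_y(xs, ys):
    """Sample estimate of Prob(X<Y) + 0.5*Prob(X=Y), i.e. U / (n*m)."""
    return mann_whitney_u(xs, ys) / (len(xs) * len(ys))


# Hypothetical toy data, for illustration only.
xs = [1.1, 2.3, 2.9, 4.0]
ys = [2.0, 3.5, 4.2, 5.1, 6.3]

u = mann_whitney_u(xs, ys)
p_hat = prob_x_less_than_y(xs, ys)
print(u, p_hat)
```

An estimate near 0.5 is what the null hypothesis Prob(X<Y)=0.5 describes. Whether the usual null distribution of U still applies when F and G differ in shape is exactly the point under discussion.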

I will point out that Conover showed that the MW is asymptotically

equivalent to an ANOVA test on the rank-transformed data.

That has the implication that deviations are measured in squared-

distances, rather than interchange-distances. This is the essential

difference between the Kendall rank-order correlation and the

Spearman rank-order correlation, where the Spearman is an ANOVA-

type statistic, since it can be computed as a Pearson r on the

rank-transformed variables. I have posted before about this

distinction between the Spearman and the Kendall, and why they

do have different rejection regions.
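A small Python sketch of the two computations may help show where they part ways. Spearman is computed here as a Pearson r on midranks, so it works with squared rank distances, while Kendall is computed by counting concordant and discordant pairs. The helper names and the example data are assumptions for illustration, and the Kendall version shown is the plain tau-a with no tie correction.

```python
import math


def midranks(values):
    """Rank from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(a, b):
    """Ordinary product-moment correlation."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    sab = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    saa = sum((p - mean_a) ** 2 for p in a)
    sbb = sum((q - mean_b) ** 2 for q in b)
    return sab / math.sqrt(saa * sbb)


def spearman_rho(a, b):
    """Spearman: Pearson r on the rank-transformed variables."""
    return pearson_r(midranks(a), midranks(b))


def kendall_tau_a(a, b):
    """Kendall tau-a: (concordant - discordant) / number of pairs."""
    n = len(a)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (a[i] - a[j]) * (b[i] - b[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical data: two adjacent swaps relative to perfect agreement.
a = [1, 2, 3, 4, 5]
b = [2, 1, 4, 3, 5]
rho = spearman_rho(a, b)
tau = kendall_tau_a(a, b)
print(rho, tau)
```

For these data the two coefficients differ (0.8 for Spearman, 0.6 for Kendall), because one weights rank displacements by their squares and the other only counts inverted pairs.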

Thus, while Fagerland says that the WMW is "efficient and useful"

for the hypothesis you state, it is not the "least" parametric test. That

would possibly be Kendall's tau-c, which corrects for ties (using

group membership as a 0/1 variable).
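As a rough sketch of that suggestion, here is how tau-c could be computed in Python with group membership coded 1 for the X sample and 0 for the Y sample. It assumes Stuart's formula tau-c = 2m(C - D) / (N^2 (m - 1)), with m = 2 because the group variable has two levels, and it assumes the measured variable takes more than two distinct values. The function name and data are hypothetical.

```python
def kendall_tau_c_two_groups(xs, ys):
    """Stuart's tau-c between a 0/1 group indicator and the pooled values.

    Pairs from the same group are tied on the indicator and contribute
    nothing, so only cross-group pairs are counted.
    """
    n_total = len(xs) + len(ys)
    # X is coded 1 and Y is coded 0, so a pair is concordant when x > y.
    concordant = sum(1 for x in xs for y in ys if x > y)
    discordant = sum(1 for x in xs for y in ys if x < y)
    m = 2  # smaller dimension of the 2 x k table (assumes k >= 2)
    return 2.0 * m * (concordant - discordant) / (n_total ** 2 * (m - 1))


# Hypothetical toy data, for illustration only.
xs = [3, 5, 6]
ys = [1, 2, 4]
tau_c = kendall_tau_c_two_groups(xs, ys)
print(tau_c)
```

This gives only the coefficient. A test would also need its sampling variance, which is not sketched here.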

--

Rich Ulrich