On 30 Oct 2006 21:26:03 -0800, "Reef Fish" <email@example.com> wrote:
>
> Richard Ulrich wrote:
> > On 30 Oct 2006 06:50:28 -0800, "Reef Fish"
> > <firstname.lastname@example.org> wrote:
> >
> > >
> > > stats newbie wrote:
> > > > Hi, I was hoping someone would be able to explain the assumption of
> > > > homogeneity of variance. What is it and why should it be addressed?
> > > > What are the consequences of not having homogeneity of variance? I hope
> > > > I have posted this in the correct group. Thanks,
> >
RF
> > > That is an ASSUMPTION behind many different statistical methods.
> > >
> > > In order for the results of each method to apply, one must make sure
> > > that the ASSUMPTION(s) are valid, else the statistical results based
> > > on the method will be all wrong.
RU>
> > I would prefer to say, "the method *may* be all wrong," and I think
> > that RF expresses that more relaxed idea in his closing comments,
> > where some violations are more serious than others ....
>
> > [snip, some detail]
RF
> But those are TWO DIFFERENT sets of statements.
>
> In the above, it means If the ASSUMPTION(s) are NOT valid, then
> the statistical results based on the method WILL be all wrong.
> There is no "may be" about it. If you have two binary variables
> X and Y and you test its correlation with the test statistic T for
> the Pearson correlation coefficient (which would be phi for the
> two binary variables), the result WILL be wrong because the
> assumption is violated 100%, without question.
It is true that some violations are unmistakable.
I think it is untrue that this makes them inevitably more serious. For larger N in all cells, the test on the 2x2 phi becomes increasingly close to the test on the Pearson r. If the result is the test, what result is "wrong"?
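A minimal sketch of that point, in Python with the standard library only. The function names and the 2x2 counts are made up for illustration. It relies on two standard identities: phi is the Pearson r computed on the 0/1 scores, and the 1-df Pearson chi-square equals N * phi^2, while the squared t test on r is (N - 2) * r^2 / (1 - r^2). For an association sitting near the 5% cutoff, the ratio of the two statistics works out to (N - 2) / (N - 3.84), which tends to 1 as N grows.

```python
import math

def phi_tests(a, b, c, d):
    """Return (phi, chi2, t2) for a 2x2 table of counts.

    a, b are the counts in row X=1 (Y=1, Y=0); c, d are the counts in row X=0.
    """
    n = a + b + c + d
    phi = (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    chi2 = n * phi ** 2                        # Pearson chi-square, 1 df
    t2 = (n - 2) * phi ** 2 / (1 - phi ** 2)   # square of the t test on Pearson r
    return phi, chi2, t2

def pearson_r(xs, ys):
    """Plain Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical table, expanded to 0/1 scores to show that phi is Pearson r.
a, b, c, d = 30, 20, 20, 30
xs = [1] * (a + b) + [0] * (c + d)
ys = [1] * a + [0] * b + [1] * c + [0] * d
phi, chi2, t2 = phi_tests(a, b, c, d)
print("phi:", phi, "Pearson r:", pearson_r(xs, ys))
print("chi2:", chi2, "t^2:", t2)

# Hold chi2 at the 5% cutoff (3.84) and let N grow: t^2 / chi2 heads to 1.
for n in (20, 100, 1000, 10000):
    r2 = 3.84 / n
    print(n, ((n - 2) * r2 / (1 - r2)) / 3.84)
```

For a fixed, non-trivial phi the two statistics keep differing by the factor 1/(1 - phi^2), but by then both tests reject so strongly that the difference does not change the conclusion.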
On the other hand, for larger N and small associations, the assumption of independence -- which may be hard to determine -- becomes increasingly important.
RF>
> In the situation below, it's about the VALIDATION of the assumption.
> If Normality is required of a variable, and it is not known 100% to be
> nonnormal, then there is leeway in deciding what is a serious
> violation and what is not,
- I'd judge, in the situation *above*, that definite non-normality can be definitely not-serious. So there is often leeway.
> because in that case (unlike the case where it
> does not require any thinking to know that the (0,1) variable is
> NOT normal) the DATA can never prove with 100% certainty
> whether it came from a Normal population or not.
>
> There is a BIG difference in the above two situations.
? What is the big difference? Formal reliance on the "right test"?
> >
RF
> > > That is WHY before one runs any particular statistical procedure, one
> > > should VALIDATE that the underlying assumptions are not SERIOUSLY
> > > violated. One can tolerate small deviations and that's the property
> > > that is called "robustness" to certain types of violations.
> >
RU>
> > An apparent violation of assumptions gives you a *warning* that
> > some other method might be more appropriate.
RF
> Or a different assumption may be appropriate, or both.
I'm not sure what the "both" should mean.
RU>
> > The violation gives you the immediate problem that the
> > p-values may be wrong, in the sense that a "more appropriate"
> > analysis would give something rather different. If you have a
> > choice of two analyses, the easy cross-check is to see if they differ.
RF
> And how do you conclude (if they differ) in your "cross-check"
> what is correct and what isn't? And what do you mean by
> cross-check?
If you think that Logistic Regression might fit better than Normal (Probit), you can test both ways. Same result? - no big problem.
Does an optional, debatable transformation give a different result? If there is no difference, you can report that while reporting the detailed result on either metric. Showing that "It makes no difference" is a way of finessing an overly-conservative demand for "non-parametric" tests, in my experience.
When there *is* a different result for different analyses, then you know that the assumptions *do* matter in a way -- and to an extent -- that needs to be explained.
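As a rough sketch of such a cross-check, here is the same two-group comparison run on the raw scores and on their logs, in Python with the standard library only. The helper name `welch_t` and both data sets are made up for illustration, and the statistic used is Welch's t (no p-value is computed), which is only one of many pairs of analyses one could compare.

```python
import math
from statistics import mean, variance

def welch_t(x, y):
    """Welch's t statistic for two independent samples (statistic only)."""
    se2 = variance(x) / len(x) + variance(y) / len(y)
    return (mean(x) - mean(y)) / math.sqrt(se2)

# Made-up, right-skewed scores for two groups (illustration only).
g1 = [1.2, 1.5, 2.0, 2.4, 3.1, 4.0, 5.5, 9.0]
g2 = [2.0, 2.6, 3.3, 4.1, 5.0, 7.2, 11.0, 30.0]

t_raw = welch_t(g1, g2)
t_log = welch_t([math.log(v) for v in g1], [math.log(v) for v in g2])
print("t on raw scores:", round(t_raw, 2))
print("t on log scores:", round(t_log, 2))
```

If the two statistics point to the same conclusion, that can simply be reported; if they do not, the choice of metric matters and needs to be explained.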
>
> -- Reef Fish Bob.
>
RU
> > The "neat" solution to "failed assumptions" occurs when one solution
> > fixes all the apparent violations -- such as, when one transformation
> > provides linearity, homogeneity of variance, normality (of the
> > variable, or especially, of the residuals), and an "interval" scale of
> > measurement. - Otherwise, you might have to invent a new
> > analysis, or try to weigh the importance of different violations.
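For the homogeneity-of-variance part of that "neat" case, a minimal sketch in Python (standard library only). The helper `variance_ratio` and the three groups are invented for illustration: the groups are built so that spread grows in proportion to level, which is the situation where a log transformation equalizes the variances. It says nothing about linearity or normality, which would need their own checks.

```python
import math
from statistics import variance

def variance_ratio(groups):
    """Largest sample variance divided by the smallest (1.0 means equal)."""
    vs = [variance(g) for g in groups]
    return max(vs) / min(vs)

# Made-up groups whose spread grows with their level (illustration only).
groups = [
    [8, 10, 12, 9, 11],
    [80, 100, 120, 90, 110],
    [800, 1000, 1200, 900, 1100],
]

raw = variance_ratio(groups)
logged = variance_ratio([[math.log(v) for v in g] for g in groups])
print("variance ratio, raw scale:", raw)
print("variance ratio, log scale:", logged)
```

A ratio near 1 on the transformed scale and a large ratio on the raw scale is the pattern described above; with real data the improvement is rarely this clean.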