Topic: Upside down Beginner Statistics Priority
Replies: 14   Last Post: Jul 29, 2012 10:12 AM

 Douglas 73 Posts: 52 From: Pacific northwest, USA Registered: 1/15/11
Upside down Beginner Statistics Priority
Posted: Jul 11, 2012 9:41 AM

In the beginning (of Statistics 101) there was a cursory explanation of coin
flipping outcomes, and the Chi-square table. Then it was quickly on to
two-dice outcomes, and one was to notice the triangular peak as those
outcomes accumulated. By the end of the second week (as I recall) we were
introduced to the wonders of the normal curve and the central limit theorem
as the acme of statistical sophistication. Beyond that we were presented
with normal curve method variations such as Student's t.

Since those days (1958), everything I have read indicates that one seeks to
measure the truly important issues with normal curve methods, or methods as
close to normal as the circumstances allow, as first priority. Failing
that, one looks to refined nonparametric methods like the K-S Test. If none
of those methods suffice for whatever reason, you are left with the lowly
Chi-square method varieties.

For instance, in Pearson's goodness-of-fit Test (which appears to be
disappearing in the newer beginner texts), one degree of freedom (1 d.f.,
the minimum provided for) is considered to provide the crudest p-value
approximation result for any randomly varying frequency data. More degrees
of freedom are said to provide an increasingly refined p-value
approximation. And there is usually advice given that if one needs to use
more than 30 d.f., one should look to normal curve methods. Thus, the
statistical methods decision circle is complete. (There does not seem to be
provision for the possibility of an infinite decision loop.)
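
For readers who have not met Pearson's goodness-of-fit statistic, here is a
minimal sketch of how it is formed for the two-dice sums mentioned earlier.
The simulated rolls, the seed, and the function names are my own illustrative
assumptions, not anything taken from a textbook. The sketch computes only the
statistic and its degrees of freedom; turning that into a p-value requires the
Chi-square table or a statistics library.

```python
import random
from collections import Counter
from fractions import Fraction


def two_dice_probabilities():
    """Exact probabilities of the sums 2..12 for two fair dice."""
    counts = Counter(a + b for a in range(1, 7) for b in range(1, 7))
    return {s: Fraction(c, 36) for s, c in sorted(counts.items())}


def pearson_statistic(observed, probabilities):
    """Pearson's X^2 = sum of (O - E)^2 / E, with d.f. = categories - 1."""
    n = sum(observed.values())
    x2 = 0.0
    for category, p in probabilities.items():
        expected = n * float(p)
        x2 += (observed.get(category, 0) - expected) ** 2 / expected
    return x2, len(probabilities) - 1


# Hypothetical data: 360 simulated rolls of two fair dice.
rng = random.Random(2012)  # arbitrary seed, for repeatability only
rolls = Counter(rng.randint(1, 6) + rng.randint(1, 6) for _ in range(360))

x2, df = pearson_statistic(rolls, two_dice_probabilities())
print(f"X^2 = {x2:.3f} on {df} d.f.")
```

With eleven possible sums there are 10 d.f., and the expected counts trace out
the triangular peak described above.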

It is my experience that virtually all published statistically based studies
have one or more probability values which usually represent the
primary evidence supporting whatever conclusion(s) each study makes. That
would seem to make the accuracy of these probability values of prime
importance.

Imagine my surprise on discovering that the lowly 1 d.f. Chi-square
approximation provides exactly the same p value (and Z value) as the standard
normal curve approximation at every binomial point.
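
The agreement can be checked numerically. The sketch below is my own
illustration, with assumed function names and an assumed example of n = 20
trials at p0 = 0.5. It compares the two-sided normal approximation, without
continuity correction, against the 1 d.f. Pearson statistic computed from the
success and failure counts. The match follows from the algebra: the two
Pearson terms add up to (k - n*p0)^2 / (n*p0*(1 - p0)), which is Z squared,
and the upper tail of a 1 d.f. Chi-square beyond Z^2 equals the two-sided
normal tail beyond |Z|.

```python
import math


def normal_two_sided_p(z):
    """Two-sided tail area of the standard normal beyond |z|."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def chi2_1df_upper_p(x2):
    """Upper tail area of the Chi-square distribution with 1 d.f."""
    return math.erfc(math.sqrt(x2 / 2.0))


def compare(k, n, p0=0.5):
    """Return (z, X^2, normal p, Chi-square p) for k successes in n trials."""
    expected_success = n * p0
    expected_failure = n * (1.0 - p0)
    # Normal approximation to the binomial, no continuity correction.
    z = (k - expected_success) / math.sqrt(n * p0 * (1.0 - p0))
    # Pearson statistic over the two cells (successes, failures).
    x2 = ((k - expected_success) ** 2 / expected_success
          + ((n - k) - expected_failure) ** 2 / expected_failure)
    return z, x2, normal_two_sided_p(z), chi2_1df_upper_p(x2)


for k in range(0, 21):
    z, x2, p_norm, p_chi = compare(k, 20)
    print(f"k={k:2d}  z={z:+.4f}  X^2={x2:.4f}  "
          f"p_normal={p_norm:.6f}  p_chi2={p_chi:.6f}")
```

Note that the equality is with the two-sided normal p-value; a one-sided
normal test, or one using a continuity correction, would give a different
figure.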

From 2 d.f. onward, there is no contest. The normal curve estimates
immediately become variably disassociated from their underlying
probabilities. Since the Chi-square distribution is the underlying
probabilities themselves, appropriate Chi-square evaluation forever remains
connected to underlying probabilities. It is the most direct interface to
underlying probabilities currently available.

I suggest the traditional evaluation decision loop is fundamentally sounder
with the priorities reversed. Only if Chi-square evaluation is not
appropriate to your study needs should you consider other evaluation
methods, because no matter how stylish or sophisticated the others might
seem, they will in all cases have probability values more
remote from provable underlying probability values than an appropriate
Chi-square evaluation. And, I suspect, in many past cases the study
probabilities reported are in fact disassociated from actual underlying
probabilities. I attribute this mainly to inadequate instruction in
the seemingly simple basics, and subsequent incomplete data evaluation
practice. The latter is a subject in its own right for another day.

Douglas
