


Re: Theorical question about inference
Posted: Jun 22, 2006 7:01 PM


crossposted to sci.stat.edu, for general interest.
On 20 Jun 2006 17:32:09 -0700, "Eric" <eyergeau@hotmail.com> wrote:
> Hi all,
>
> I know it's a basic question, but I couldn't get an answer from the www
> resources.
>
> Since inference is about estimating population parameters from a
> sample, are correlation and khi2 inference techniques or not ?
It took me Googling to get it - khi2 is chi-squared (or X^2).
I would say that inference, as I have used it in research projects, is more often about "testing" rather than about "estimation." We want to know whether there is evidence that some effect *exists*, even though our data may be too weak. Zero versus non-zero. If it is non-zero, then the mean-estimate is large enough to be interesting, given small samples.
IF the sample size is large, then it becomes important to consider whether the size of the effect is larger than trivial sources of causation not controlled-for, and large enough to be interesting. For large N, inference might be more a matter of estimation.
So, I am denying your assumption, "about estimating population parameters" as the root of inference.
The chi-squared distribution is used for testing in various situations. The statistic typically can be increased by increasing the N, as in the tests on contingency tables. Thus, it is not a good measure of "effect size" except for comparing tables (or other circumstances) that are highly similar.
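To make the point about N concrete, here is a small sketch in plain Python. The function names and the two tables are invented for illustration; nothing here comes from real data. It computes the Pearson chi-squared for a 2x2 table, then for the same table with every count multiplied by 10. The phi coefficient, sqrt(X^2 / N), is brought in only as one common N-free summary for comparison.

```python
import math

def chi_squared(table):
    """Pearson chi-squared statistic for a contingency table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

def phi_coefficient(table):
    """sqrt(X^2 / N): an association measure that does not grow with N (2x2 case)."""
    n = sum(sum(row) for row in table)
    return math.sqrt(chi_squared(table) / n)

# Hypothetical tables: same proportions, ten times the sample size.
small = [[30, 20], [20, 30]]
large = [[10 * count for count in row] for row in small]

for name, table in (("small", small), ("large", large)):
    print(name, chi_squared(table), phi_coefficient(table))
```

The pattern of association is identical in both tables, yet the test statistic is ten times larger for the second one, while phi does not move.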
The Pearson correlation needs to be referred to a sample N to get a test, so it comes closer (than the X^2 does) to measuring an effect size of association. Any correlation is determined by the variables, *and* by the choice of sample with its variances on measures, but not by the N. Correlations *can* be compared in tests between two samples (for example), but it is generally preferable to test (for that example) the regression coefficients, using a test based on pooling the samples.
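As a rough illustration of "referring r to a sample N", the usual conversion is t = r * sqrt(N - 2) / sqrt(1 - r^2), on N - 2 degrees of freedom. The sketch below uses a hypothetical helper name and made-up values of r and N; it only shows that the same r yields very different test statistics at different sample sizes.

```python
import math

def t_from_r(r, n):
    """t statistic for testing rho = 0, given a sample Pearson r and sample size n.

    Refer the result to a t distribution with n - 2 degrees of freedom.
    """
    if n <= 2:
        raise ValueError("need n > 2")
    if not -1.0 < r < 1.0:
        raise ValueError("need -1 < r < 1")
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

# Same (made-up) correlation, two sample sizes.
r = 0.3
t_small = t_from_r(r, 20)    # 18 degrees of freedom
t_large = t_from_r(r, 200)   # 198 degrees of freedom
print(t_small, t_large)
```

The r itself is unchanged between the two calls; only the evidence against zero changes.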
>
> Since they're not estimating parameters, they shouldn't be considered
> as "inferential" techniques ?
>
> If so, how could one name them ? Please enlighten this troubled soul...
Chi-squared is a "test statistic". It does require reference to the "degrees of freedom", which is often fixed for a design.
Pearson's r is a test-statistic that requires the N. Since it needs N, it functions as an estimate of effect size for familiar situations. Correlations are used "familiarly" in test-retest reliability, just to mention one circumstance.
Cohen's d is used as an estimate for power analyses for another sort of "familiar" situation, and it seems rather analogous to the r, except that there is a separate test (t-test).
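For the two-group case with a pooled standard deviation, d and the t-test are tied together by t = d * sqrt(n1 * n2 / (n1 + n2)). A minimal sketch, with a hypothetical function name and invented group sizes:

```python
import math

def t_from_d(d, n1, n2):
    """Pooled two-sample t statistic implied by Cohen's d and the two group sizes.

    Assumes d was computed with the pooled standard deviation; the t has
    n1 + n2 - 2 degrees of freedom.
    """
    if n1 < 1 or n2 < 1:
        raise ValueError("group sizes must be positive")
    return d * math.sqrt(n1 * n2 / (n1 + n2))

# The same "medium" d = 0.5 at two (made-up) group sizes.
t_16 = t_from_d(0.5, 16, 16)
t_64 = t_from_d(0.5, 64, 64)
print(t_16, t_64)
```

As with r, the effect size stays put while the test statistic depends on the Ns.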
The Odds Ratio, at the other extreme, is more like a pure effect-size "estimate", since it requires not just the total-N but the marginal-Ns in order to generate a test. I started to say that it is a "good" effect-size estimate - it does emerge rather naturally from logistic and log-linear modeling, but I'm not sure what the standard for "good" ought to be.
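A sketch of why the marginal Ns matter for the Odds Ratio: the two invented 2x2 tables below have the same OR and the same total N, but different row margins, and they give different test statistics. The standard error used here is the common large-sample one for log(OR), sqrt(1/a + 1/b + 1/c + 1/d) (Woolf's method); that choice, like the function names, is an assumption made for the illustration, not the only way to test an OR.

```python
import math

def odds_ratio(table):
    """Odds ratio (a*d)/(b*c) for a 2x2 table [[a, b], [c, d]] of counts."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)

def log_or_se(table):
    """Large-sample standard error of log(OR): sqrt(1/a + 1/b + 1/c + 1/d)."""
    (a, b), (c, d) = table
    return math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

def wald_z(table):
    """z statistic for testing OR = 1, i.e. log(OR) = 0."""
    return math.log(odds_ratio(table)) / log_or_se(table)

# Hypothetical tables: both have N = 100 and OR = 16.
balanced = [[40, 10], [10, 40]]   # row margins 50 / 50
lopsided = [[16, 4], [16, 64]]    # row margins 20 / 80

for name, table in (("balanced", balanced), ("lopsided", lopsided)):
    print(name, odds_ratio(table), log_or_se(table), wald_z(table))
```

Knowing the OR and the total N alone would not be enough to tell these two tests apart.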
--
Rich Ulrich, wpilib@pitt.edu
http://www.pitt.edu/~wpilib/index.html



