


Which sample variance should I choose?
Posted: Aug 31, 2011 1:06 AM


The population variance is given by:
σ^2 = Σ(x - µ)^2 / n
with µ = population mean, the summation being over all the x in the population.
(For brevity, I haven't attempted to show the subscript i on the x.)
If you don't have the entire population, you can estimate the variance with the sample variance:
s^2 = Σ(x - m)^2 / n (Eq. 1)
where m = sample mean (usually written as x bar), and the sum is over all the x in the sample. A second estimator is:
s^2 = Σ(x - m)^2 / (n - 1) (Eq. 2)
which some people prefer because it is unbiased (that is, the average of all the possible sample variances equals the true population variance if you use the (n - 1) version).
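The unbiasedness claim can be checked by brute force on a toy case. The sketch below is only an illustration under stated assumptions: the four-value population, the sample size n = 2, and sampling with replacement are all made up for the example, and the function names are hypothetical. It enumerates every possible sample and averages each estimator.

```python
from itertools import product
from statistics import mean

def var_eq1(sample):
    """Eq. 1: squared deviations from the sample mean, divided by n."""
    m = mean(sample)
    return sum((x - m) ** 2 for x in sample) / len(sample)

def var_eq2(sample):
    """Eq. 2: squared deviations from the sample mean, divided by n - 1."""
    m = mean(sample)
    return sum((x - m) ** 2 for x in sample) / (len(sample) - 1)

# Assumed toy population (not from the post), small enough to enumerate.
population = [1, 2, 3, 4]
mu = mean(population)
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)

# Every ordered sample of size n drawn with replacement, each equally likely.
n = 2
samples = list(product(population, repeat=n))

avg_eq1 = mean(var_eq1(s) for s in samples)
avg_eq2 = mean(var_eq2(s) for s in samples)

print("population variance:", sigma2)
print("average of Eq. 1 over all samples:", avg_eq1)
print("average of Eq. 2 over all samples:", avg_eq2)
```

Under these assumptions the Eq. 2 average matches σ^2, while the Eq. 1 average comes out smaller by the factor (n - 1)/n.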
See also http://mathworld.wolfram.com/SampleVariance.html
I have a set of data with an (allegedly) known population mean µ but an unknown σ^2. I wish to estimate σ^2. Under what circumstances should I prefer Eq. 1 over Eq. 2?
Or should I ignore the sample mean altogether, and plug the known population mean into one of the two equations? I.e.:
s^2 = Σ(x - µ)^2 / n (Eq. 3)
s^2 = Σ(x - µ)^2 / (n - 1) (Eq. 4)
Under what circumstances should I prefer each of these four estimators of σ^2 and what are the pros and cons of each?
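One way to start comparing the four candidates is to look at their average values over all possible samples. The sketch below does that by enumeration. It is a rough illustration only: the toy population, the sample size n = 3, sampling with replacement, and the helper name `four_estimates` are assumptions made up for the example, and it speaks only to bias, not to other criteria such as mean squared error.

```python
from itertools import product

def four_estimates(sample, mu):
    """Return Eq. 1 to Eq. 4 for one sample, given the known population mean mu."""
    n = len(sample)
    m = sum(sample) / n                           # sample mean
    ss_m = sum((x - m) ** 2 for x in sample)      # deviations from sample mean
    ss_mu = sum((x - mu) ** 2 for x in sample)    # deviations from known mean
    return {
        "eq1": ss_m / n,
        "eq2": ss_m / (n - 1),
        "eq3": ss_mu / n,
        "eq4": ss_mu / (n - 1),
    }

# Assumed toy population (not from the post).
population = [1, 2, 3, 4]
mu = sum(population) / len(population)
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)

# All ordered samples of size n drawn with replacement, each equally likely.
n = 3
samples = list(product(population, repeat=n))

averages = {
    key: sum(four_estimates(s, mu)[key] for s in samples) / len(samples)
    for key in ("eq1", "eq2", "eq3", "eq4")
}

print("population variance:", sigma2)
for key, value in averages.items():
    print(key, "average:", value, "ratio to sigma^2:", value / sigma2)
```

In this setup Eq. 2 and Eq. 3 both average to σ^2, Eq. 1 averages to (n - 1)/n times σ^2, and Eq. 4 averages to n/(n - 1) times σ^2. So when µ really is known, dividing by n (Eq. 3) is the unbiased choice, and the (n - 1) divisor is only needed when the sample mean stands in for µ.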
Thanks in advance,
- Steven



