On Wed, 31 Aug 2011 06:40:22 -0700 (PDT), leoldv <firstname.lastname@example.org> wrote:
>On Aug 31, 1:06 am, Steven D'Aprano <steve+comp.lang.pyt...@pearwood.info> wrote:
>> The population variance is given by:
>>
>> σ^2 = Σ(x - µ)^2 / n
>>
>> with µ = population mean, the summation being over all the x in the
>> population.
>>
>> (for brevity, I haven't attempted to show subscript-i on the x).
>>
>> If you don't have the entire population, you can estimate the variance with
>> the sample variance:
>>
>> s^2 = Σ(x - m)^2 / n        (Eq. 1)
>>
>> where m = sample mean (usually written as x bar), and the sum is over all
>> the x in the sample. A second estimator is:
>>
>> s^2 = Σ(x - m)^2 / (n-1)    (Eq. 2)
>>
>> which some people prefer because it is unbiased (that is, the average of all
>> the possible sample variances equals the true population variance if you
>> use the (n-1) version).
>>
>> See also http://mathworld.wolfram.com/SampleVariance.html
>>
>> I have a set of data with an (allegedly) known population mean µ but an
>> unknown σ^2. I wish to estimate σ^2. Under what circumstances should I
>> prefer Eq.1 over Eq.2?
>>
>> Or should I ignore the sample mean altogether, and plug the known population
>> mean into one of the two equations? I.e.:
>>
>> s^2 = Σ(x - µ)^2 / n        (Eq. 3)
>> s^2 = Σ(x - µ)^2 / (n-1)    (Eq. 4)
>>
>> Under what circumstances should I prefer each of these four estimators of
>> σ^2 and what are the pros and cons of each?
>>
>> Thanks in advance,
>>
>> --
>> Steven
>
>If you know mu, the population mean, divide by n.
That's a pretty complete answer.
But if you are skeptical that the stated population mean really is the population mean, then I suppose you could offer two answers, one of them using the observed mean instead. If you want tests using ANOVA, you use an unbiased estimate of the variance.
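As a rough sketch of what "two answers" could look like in Python: the function names, the data, and the value of mu below are all made up for illustration, and pairing the observed mean with (n-1) is my reading of Eq. 2 in the quoted question, not something spelled out above.

```python
from statistics import fmean

def var_known_mean(xs, mu):
    """Eq. 3: squared deviations from the stated population mean, divided by n."""
    return sum((x - mu) ** 2 for x in xs) / len(xs)

def var_observed_mean(xs):
    """Eq. 2: squared deviations from the sample mean, divided by n - 1."""
    m = fmean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# Hypothetical sample and hypothetical "known" population mean.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mu = 4.5

print("using stated mean, /n:       ", var_known_mean(data, mu))
print("using observed mean, /(n-1): ", var_observed_mean(data))
```

If the two numbers differ a lot, that may itself be a hint that the stated mean deserves a second look.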
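- To a question posted later: No, I don't see any place or justification for using the population mean in a computation with (n-1) as denominator. The (n-1) only makes up for the degree of freedom spent on estimating the mean from the sample; with the true mean plugged in, dividing by n is already unbiased.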
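A quick simulation is one way to check this numerically. This is only a sketch under assumed conditions: normally distributed data, a true variance of 4.0, samples of size 5, and a fixed seed, none of which come from the original question. The function name `simulate` is likewise just a label for this example.

```python
import random

def simulate(n=5, trials=50_000, mu=0.0, sigma=2.0, seed=12345):
    """Average Eq. 3 (/n) and Eq. 4 (/(n-1)) over many samples,
    both computed around the true mean mu."""
    rng = random.Random(seed)
    total_n = 0.0
    total_n1 = 0.0
    for _ in range(trials):
        # Sum of squared deviations from the TRUE mean, not the sample mean.
        ss = sum((rng.gauss(mu, sigma) - mu) ** 2 for _ in range(n))
        total_n += ss / n
        total_n1 += ss / (n - 1)
    return total_n / trials, total_n1 / trials

avg_n, avg_n1 = simulate()
print("true variance:         4.0")
print("average of Eq. 3 (/n):    ", round(avg_n, 3))
print("average of Eq. 4 (/(n-1)):", round(avg_n1, 3))
```

With these settings the /n average should land close to the true variance, while the /(n-1) average should come out too large by roughly a factor of n/(n-1).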