On Aug 31, 1:06 am, Steven D'Aprano <steve +comp.lang.pyt...@pearwood.info> wrote:
> The population variance is given by:
>
> σ^2 = Σ(x - µ)^2 / n
>
> with µ = population mean, the summation being over all the x in the
> population.
>
> (for brevity, I haven't attempted to show subscript-i on the x).
>
> If you don't have the entire population, you can estimate the variance with
> the sample variance:
>
> s^2 = Σ(x - m)^2 / n (Eq. 1)
>
> where m = sample mean (usually written as x bar), and the sum is over all
> the x in the sample. A second estimator is:
>
> s^2 = Σ(x - m)^2 / (n-1) (Eq. 2)
>
> which some people prefer because it is unbiased (that is, the average of all
> the possible sample variances equals the true population variance if you
> use the (n-1) version).
>
> See also http://mathworld.wolfram.com/SampleVariance.html
>
> I have a set of data with an (allegedly) known population mean µ but an
> unknown σ^2. I wish to estimate σ^2. Under what circumstances should I
> prefer Eq.1 over Eq.2?
>
> Or should I ignore the sample mean altogether, and plug the known population
> mean into one of the two equations? I.e.:
>
> s^2 = Σ(x - µ)^2 / n (Eq. 3)
> s^2 = Σ(x - µ)^2 / (n-1) (Eq. 4)
>
> Under what circumstances should I prefer each of these four estimators of
> σ^2 and what are the pros and cons of each?

There is no answer until you tell us what you are planning to use the variance for.
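
For concreteness, here is a minimal sketch of the four candidates from the quoted post, computed side by side on one sample. The function name variance_estimates, the sample values, and the "known" mean of 4.5 are all made up for illustration; none of them come from the original question.

```python
def variance_estimates(xs, mu):
    """Return the four estimates (Eq. 1 to Eq. 4) for a sample xs.

    mu is the (allegedly) known population mean; it is an input
    assumption here, not something estimated from the data.
    """
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")

    m = sum(xs) / n                          # sample mean (x bar)
    ss_m = sum((x - m) ** 2 for x in xs)     # squared deviations about m
    ss_mu = sum((x - mu) ** 2 for x in xs)   # squared deviations about mu

    return {
        "eq1": ss_m / n,         # sample mean, divide by n
        "eq2": ss_m / (n - 1),   # sample mean, divide by n-1
        "eq3": ss_mu / n,        # known mean, divide by n
        "eq4": ss_mu / (n - 1),  # known mean, divide by n-1
    }


# Hypothetical data: the sample mean is 5.0, the assumed known mean is 4.5.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
estimates = variance_estimates(sample, mu=4.5)

for name, value in estimates.items():
    print(name, round(value, 4))
```

The sum of squares about µ can never be smaller than the sum of squares about the sample mean m, so Eq. 3 and Eq. 4 always come out at least as large as Eq. 1 and Eq. 2 respectively. Which of the four numbers you want still depends on what you intend to do with it.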

--
Paige Miller
paige\dot\miller \at\ kodak\dot\com