Date: Mar 23, 2011 5:16 PM
Author: Steven D'Aprano
Subject: Expectation of the variance
I'm trying to demonstrate numerically (rather than algebraically) that
the expectation of the sample variance is the population variance, but
it's not working for me.
Some quick(?) background... please correct me if I'm wrong about anything.
The variance of a population is:
σ^2 = 1/n * Σ(x-μ)^2 over all x in the population
where ^2 means superscript 2 (i.e. squared). In case you can't read the
symbols, here it is again in ASCII-only text:
sigma^2 = 1/n * SUM( (x-mu)^2 )
If you don't have the entire population as your data, you can estimate
the population variance by calculating a sample variance:
s'^2 = 1/n * Σ(x-μ)^2 over all x in the sample
where s' is being used instead of s subscript n.
This is unbiased, provided you know the population mean μ (mu). Normally
you don't though, and you're reduced to estimating it from your sample:
s'^2 = 1/n * Σ(x-m)^2
where m is being used as the symbol for the sample mean, x bar = Σx/n
Unfortunately this sample variance is biased, so the "unbiased sample
variance" is used instead:
s^2 = 1/(n-1) * Σ(x-m)^2
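For anyone who prefers code to formulas, here is a minimal Python sketch of the three quantities defined so far. The function names are made up for illustration, and it assumes integer data so that Fraction keeps every result exact:

```python
from fractions import Fraction


def population_variance(data):
    """1/n * SUM((x - mu)^2), with mu the true mean of the whole population."""
    n = len(data)
    mu = Fraction(sum(data), n)  # assumes integer data, so this stays exact
    return sum((x - mu) ** 2 for x in data) / n


def biased_sample_variance(sample):
    """s'^2 = 1/n * SUM((x - m)^2), with m the sample mean."""
    n = len(sample)
    m = Fraction(sum(sample), n)
    return sum((x - m) ** 2 for x in sample) / n


def unbiased_sample_variance(sample):
    """s^2 = 1/(n-1) * SUM((x - m)^2); needs at least two data points."""
    n = len(sample)
    m = Fraction(sum(sample), n)
    return sum((x - m) ** 2 for x in sample) / (n - 1)
```

The only difference between the last two functions is the divisor, n versus n-1.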
What makes this unbiased is that the expected value of the sample
variances equals the true population variance. E.g. see
The algebra convinces me -- I'm sure it's correct. I'd like an easy
example I can show people, but it's not working for me!
Let's start with a population of: [1, 2, 3, 4]. The true mean is 2.5 and
the true (population) variance is 1.25.
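These two population figures can be double-checked with Python's standard statistics module. This is only a quick sanity check, and the variable names are arbitrary:

```python
import statistics

population = [1, 2, 3, 4]

# Mean and population variance (divisor n) of the full population.
pop_mean = statistics.mean(population)
pop_var = statistics.pvariance(population)

print(pop_mean, pop_var)
```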
All possible samples for each sample size > 1, and their exact sample
variances:
n = 2
1,2 : 1/2
1,3 : 2
1,4 : 9/2
2,3 : 1/2
2,4 : 2
3,4 : 1/2
Expectation for n=2: 5/3
n = 3
1,2,3 : 1
1,3,4 : 7/3
2,3,4 : 1
Expectation for n=3: 13/9
n = 4
1,2,3,4 : 5/3
Expectation for n=4: 5/3
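A brute-force check of this kind of table can be sketched in Python. The sketch assumes that "all possible samples" means every subset of the population drawn without replacement, which is what itertools.combinations enumerates; the helper name expected_sample_variance is made up for illustration. Under that assumption, combinations yields four samples of size 3, including (1, 2, 4).

```python
from fractions import Fraction
from itertools import combinations
from statistics import variance

population = [1, 2, 3, 4]


def expected_sample_variance(population, n):
    """Average the unbiased (n-1 divisor) variance over every size-n subset.

    Assumes sampling WITHOUT replacement: each subset counts exactly once.
    """
    samples = list(combinations(population, n))
    # statistics.variance uses the n-1 divisor; Fractions keep results exact.
    variances = [variance([Fraction(x) for x in s]) for s in samples]
    return samples, sum(variances) / len(variances)


for n in range(2, len(population) + 1):
    samples, expectation = expected_sample_variance(population, n)
    print(f"n={n}: {len(samples)} samples, expectation {expectation}")
```

Printing the samples list for each n makes it easy to compare the enumeration against a hand-written table line by line.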
As you can see, none of the expectations for a particular sample size are
equal to the population variance. If I instead add up all ten possible
sample variances, and divide by ten, I get 1.6 which is still not equal
to the population variance.
What am I misunderstanding?
Thanks in advance,