Re: Sampling From Finite Population with Replacement
Posted: Sep 28, 2010 10:31 AM
On Sep 28, 7:50 am, Cagdas Ozgenc <cagdas.ozg...@gmail.com> wrote:
> On Sep 26, 3:25 am, Rich Ulrich <rich.ulr...@comcast.net> wrote:
>
> > > On Fri, 24 Sep 2010 03:20:50 -0700 (PDT), Cagdas Ozgenc
> > > <cagdas.ozg...@gmail.com> wrote:
> > >In statistics text books it is proposed that sampling from a finite
> > >population with replacement is equivalent to sampling from an infinite
> > >population. I find this somewhat misleading.
>
> > >Suppose that we have a population of size N generated by random
> > >variable Normal(MeanM, StdDevM). Then take samples of size n < N from
> > >this population and calculate average (let's call it MeanS).
>
> > >MeanS = (1/n)*sum of samples
>
> > >There is no way you can estimate MeanM in an unbiased fashion.
>
> > Where do you see "bias"? I think you need to check on that word.
>
> > > You can
> > >only estimate population mean (let's call it MeanP) which is not
> > >equal to MeanM, the mean of random variable that generated the
> > >population.
>
> > This population mean is the best "unbiased estimate" of the
> > generating mean that you can have here.
>
> > Where do you get the notion that an unbiased estimator
> > has zero error? It is supposed to be zero "on the average".
>
> > It is convenient for us that in many cases, the easiest unbiased
> > estimate of something in particular is smaller than any of
> > the biased estimates, as well as being generally convenient.
>
> > On the other hand, you can divide either by N, (N-1) or
> > (N+1) to get three different estimates of the variance
> > of the normal, each of which has its uses. (N-1) gives
> > unbiased. I think it is (N+1) that gives minimum variance
> > for the estimate.
>
> > >Is my thinking flawed? Or do we always infer about an hypothetical
> > >infinite population?
>
> > If we are doing experimental science that is intended as
> > inferential, there is a future that we point to. For those
> > cases, there is an infinite population. That's the only case
> > that most of us ever need to worry about.
>
> > When we are predicting the final election returns from
> > the 10 p.m. returns that include 50% of the precincts,
> > and using previously known patterns, the N is not infinite.
>
> > --
> > Rich Ulrich
>
> Here is what I am trying to say. Take any statistics book you will
> find a statement that starts something like the following:
>
> "You have a population of size N with elements normally distributed
> with Mu and Sigma. If we sample from this population with
> replacement..." Then they continue calculating population mean and
> variance, and then claim that Expected Value of Sample mean is equal
> to Mu.
>
> The point I read that it gives me the creeps. First of all normal
> distribution is a model. Yes sample mean will give an unbiased
> estimate of the population mean (which is a population parameter not a
> model parameter). But on average it will not be Mu. Sampling with
> replacement from a finite population will not give an unbiased
> estimation of the model parameters. Either I am not reading my books
> carefully or this issue is somehow swept under the rug.
>
> In the infinite population case my understanding is that population
> parameter and model parameter will converge. But when we talk about
> inference do we ever care about the model parameter?

I don't understand your objection. Have you ever tried it with a population small enough so that you can enumerate all possible samples of a given size? E.g., try the following:
1. Let the population consist of 5 scores: 2, 3, 4, 5, 6
2. Compute the population mean and SD (with N, not N-1, in the denominator).
3. Draw all possible samples of n=2 (with replacement) from the population--there are 25 of them. For each one, compute the sample mean.
4. Compute the mean and SD of the 25 sample means. For the SD, use N=25 in the denominator, because you have the entire population of sample means.
Notice that the mean of the sample means equals the population mean, and the SD of the sample means equals the population SD divided by the square root of the sample size.
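For anyone who would rather not enumerate the 25 samples by hand, here is a minimal Python sketch of steps 1 to 4. It assumes nothing beyond the standard library; the names (population, pop_sd, sample_means, and so on) are my own choices for illustration and do not come from any particular text.

```python
from itertools import product
from math import sqrt

# Step 1: the finite population of 5 scores.
population = [2, 3, 4, 5, 6]
n = 2  # sample size

def mean(xs):
    """Arithmetic mean of a list of numbers."""
    return sum(xs) / len(xs)

def pop_sd(xs):
    """Standard deviation with N (not N-1) in the denominator."""
    m = mean(xs)
    return sqrt(sum((x - m) ** 2 for x in xs) / len(xs))

# Step 2: population mean and SD.
pop_mean = mean(population)
pop_sigma = pop_sd(population)

# Step 3: all ordered samples of size n drawn with replacement
# (5 ** 2 = 25 of them), and the mean of each one.
samples = list(product(population, repeat=n))
sample_means = [mean(s) for s in samples]

# Step 4: mean and SD of the sample means, again dividing by N,
# since this is the entire population of sample means.
mean_of_means = mean(sample_means)
sd_of_means = pop_sd(sample_means)

print("number of samples:", len(samples))
print("population mean:", pop_mean, "| mean of sample means:", mean_of_means)
print("population SD / sqrt(n):", pop_sigma / sqrt(n),
      "| SD of sample means:", sd_of_means)
```

Changing population or n lets you repeat the check with other small cases, though the number of samples grows as len(population) ** n, so keep both small.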
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/Home
"When all else fails, RTFM."