Topic: Sampling From Finite Population with Replacement
Replies: 28   Last Post: Sep 30, 2010 6:30 AM

 cagdas.ozgenc@gmail.com Posts: 58 Registered: 3/29/06
Re: Sampling From Finite Population with Replacement
Posted: Sep 28, 2010 11:49 PM

On 29 September, 01:49, Bruce Weaver <bwea...@lakeheadu.ca> wrote:
> On Sep 28, 2:48 pm, Cagdas Ozgenc <cagdas.ozg...@gmail.com> wrote:

> > Let me try to explain one more time. My questions are usually too deep
> > down there for me to explain properly.

>
> > First of all I am talking about 3 different things here: model
> > parameters, population parameters, sample statistics.

>
> > I have no objection to the fact that the population mean can be
> > estimated by the sample mean in an unbiased way. However, you will
> > commonly find in textbooks and real-life research that what is being
> > inferred is not the population mean but the model mean (the mean of
> > the generating process).

>
> > For example take a look at the lecture notes of a stats class in
> > UCDavis that I just found on the internet (page 5):

>
> >http://www.stat.ucdavis.edu/~jie/stat13.winter2010/lec20.pdf
>
> > It is trying to show the difference between sampling with replacement
> > vs sampling without replacement. But that's not the issue here.
> > There's something else wrong about it.

>
> > Starts with the following:
>
> > "Suppose the heights of female students entering UC Davis in year 2005
> > follows a normal distribution, with mean mu and standard deviation
> > sigma"

>
> > First of all, the number of female students entering UC Davis in
> > year 2005 is finite. If this is really our population then there is
> > no way it can be normally distributed.

>
> I agree. But nothing else is truly normal either, at least if you're
> working with real (rather than simulated) data. George Box summed it
> up pretty nicely as follows:
>
> "...the statistician knows...that in nature there never was a normal
> distribution, there never was a straight line, yet with normal and
> linear assumptions, known to be false, he can often derive results
> which match, to a useful approximation, those found in the real
> world." (JASA, 1976, Vol. 71, 791-799)
>

> > The normal distribution is a model for
> > infinite populations.

>
> This strikes me as too restrictive. I think the normal distribution
> can also serve as a fairly decent model for finite populations,
> provided they are large enough. According to the website given below,
> UC Davis had a little over 5,000 freshman students in 2005. If even
> half were female, the normal distribution described might not be too
> bad as a *model* for the height distribution of incoming female
> students.

> > This means that actually we are not looking at a
> > population but we are looking at a sampling from normal distribution.

>
> > Now the rest of the problem:
>
> > "A random sample of 100 students are taken" from the above so called
> > population.

>
> > Now we are looking at a sample of a sample. This means that no matter
> > what you do, you will never find the mean of the normal distribution
> > (Mu) by repeated sampling. It doesn't matter whether you do it with
> > replacement or without replacement.

>
> > You will end up calculating the mean of the population, which will be
> > slightly or significantly different from Mu depending on how many
> > students entered UC Davis in year 2005. This means that our samples of
> > 100 students will be an unbiased estimate of the population mean but a
> > biased estimate of Mu.

>
> > This is the difference between sampling with replacement from a finite
> > population and sampling from an infinite population. It seems to me
> > that this is a chronic problem in stat texts.

>
> > I understand that when you have 1000 elements in your population the
> > difference in the result will be minuscule. Or some could say that it
> > is just a model and a model is not a 100% reflection of real life
> > (that's why it is called a model).

>
> Or as Box said,
>
> "All models are wrong. Some are useful."
>

> > However, members of a finite
> > population can well be generated by a normal random variable, there is
> > nothing wrong with that. The problem arises when we start calling this
> > a population and start calculating mean, variance, and confidence
> > intervals. Here we were trying to capture the essence of the
> > generating stochastic process (mu, sigma), but we actually ended up
> > with something else.

>
> > To put the final nail in the coffin, let's look at the last sentence
> > on that page of lecture notes:

>
> > "Then X1,...,X100 are i.i.d. N(mu, sigma) random variables". This is
> > just NOT TRUE!

>
> No, it's not. But I think the question is whether the approximation
> is close enough to be useful under the circumstances.
>
> HTH.
> --
> Bruce Weaver
> "When all else fails, RTFM."

Of course normal is a good model for finite populations as well. I
agree with you on that.

The point I was trying to make is that once you adopt a model (any
probability distribution with infinitely many possible values, such as
the normal, the uniform, or even a discrete distribution with
infinitely many values, or a density estimate), there is indeed a
difference between sampling from an infinite population and sampling
from a finite population with replacement.
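
To make the distinction concrete, here is a rough simulation sketch in
Python. It is only an illustration under assumed numbers: mu = 165 and
sigma = 7 (heights in cm) are made up and are not from the lecture
notes, the population size of 2500 is taken from Bruce's "a little over
5,000, if even half were female", and simulate is a name I invented for
this example. The finite population is generated once from the normal
model, and then samples of 100 are drawn from it with replacement many
times.

```python
import random
import statistics


def simulate(mu=165.0, sigma=7.0, pop_size=2500,
             sample_size=100, n_draws=2000, seed=0):
    """Compare the model mean, the finite-population mean, and the
    long-run average of sample means drawn with replacement.

    All default values are illustrative assumptions.
    """
    rng = random.Random(seed)

    # Step 1: the generating process produces the finite population ONCE.
    population = [rng.gauss(mu, sigma) for _ in range(pop_size)]
    population_mean = statistics.fmean(population)

    # Step 2: repeatedly sample WITH replacement from that fixed population.
    sample_means = [
        statistics.fmean(rng.choices(population, k=sample_size))
        for _ in range(n_draws)
    ]

    return {
        "mu": mu,
        "population_mean": population_mean,
        "average_of_sample_means": statistics.fmean(sample_means),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.4f}")
```

With the population held fixed, the average of the sample means should
settle near population_mean, not near mu, however many samples are
drawn. The gap between population_mean and mu has standard deviation
sigma / sqrt(pop_size), so it shrinks as the population grows; try
pop_size=50 to make it more visible.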