Difference of Two Proportions in Statistics
Date: 08/15/2005 at 10:02:38
From: Roberta
Subject: Difference of two proportions

In stats class we are studying the binomial distribution and we tackled the difference of two proportions. Why is it that we assume np > 5 and n(1-p) > 5? Where does the 5 come from?
Date: 08/17/2005 at 13:07:29
From: Doctor George
Subject: Re: Difference of two proportions

Hi Roberta,

I think I understand your question. When np > 5 and n(1-p) > 5 it is common to use the normal distribution as an approximation to the binomial distribution. Doing this allows us to use the z-chart to find probabilities, and that is a convenient thing to do.

So the question is why do np > 5 and n(1-p) > 5 make the approximation reasonable? I also have never seen the reason in writing, but after thinking about it for a while I can make some sense out of it.

Remember that the binomial distribution converges to the normal distribution for large n. This is what makes the issue in your question even worth considering.

The normal distribution has infinite tails, which the binomial distribution does not. In order for the approximation to make sense the binomial distribution needs to have fairly long tails. In other words, the distance from the mean to each endpoint must be a sufficiently large number of standard deviations.

I should mention that I use q = 1-p with the binomial distribution. Your textbook may not be using q in this way.

Let's call L the distance from the mean to the left endpoint in standard deviations. Then:

   L = np / sqrt(npq)

   np = L^2 * q

   if np > 5 then L > sqrt(5/q)

Since q cannot be greater than 1, L is always greater than sqrt(5), which is about 2.23. On a z-chart you can see that less than 1.3% of the area is to the left of -2.23, so the left tail of the normal distribution has very little area beyond the endpoint of the binomial distribution. For smaller values of q the area to the left of -L is even less.

So with np > 5 the left tail of the binomial distribution starts becoming long enough to look similar to the left tail of the normal distribution.

Now let's call R the distance from the mean to the right endpoint in standard deviations.
Then:

   R = (n-np) / sqrt(npq)

   R = n(1-p) / sqrt(np(1-p))

   n(1-p) = R^2 * p

   if n(1-p) > 5 then R > sqrt(5/p)

Now apply the same logic to the right tail that we did to the left tail.

Rules of thumb go wrong now and then, but we can see that this is a pretty good one.

Does that make sense? Write again if you need more help.

- Doctor George, The Math Forum
  http://mathforum.org/dr.math/
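
As a quick numerical check of the tail argument above, here is a minimal Python sketch. It is not part of Doctor George's reply: the function names normal_cdf and tail_distances and the sample values of n and p are assumptions chosen only for illustration. For a given n and p (with 0 < p < 1) it computes L and R as defined above, along with the area of each normal tail that lies beyond the binomial endpoints 0 and n.

```python
import math


def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tail_distances(n, p):
    """Return (L, R): distances from the mean np to the endpoints 0 and n,
    measured in standard deviations sqrt(npq), with q = 1 - p.
    Assumes 0 < p < 1 so the standard deviation is nonzero."""
    q = 1.0 - p
    sd = math.sqrt(n * p * q)
    return n * p / sd, n * q / sd


if __name__ == "__main__":
    # Illustrative (n, p) pairs; these values are arbitrary choices.
    for n, p in [(10, 0.5), (20, 0.3), (100, 0.06), (20, 0.1)]:
        L, R = tail_distances(n, p)
        # By symmetry, the area to the right of R equals normal_cdf(-R).
        print(f"n={n}, p={p}: np={n * p:.1f}, nq={n * (1 - p):.1f}, "
              f"L={L:.2f}, R={R:.2f}, "
              f"area left of -L={normal_cdf(-L):.4f}, "
              f"area right of R={normal_cdf(-R):.4f}")
```

The last pair has np = 2, so it fails the rule of thumb, and its L falls below sqrt(5). That leaves a noticeably larger share of the normal curve to the left of the binomial's left endpoint.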
Date: 08/19/2005 at 06:09:21
From: Roberta
Subject: Thank you (Difference of two proportions)

Thank you soooooo much!!!! You've been a great help! This is the first time I actually "asked Dr. Math" and I didn't think I would get a reply but I got one--wow! It's really cool! Thank you once again and hope to get some help in the future! :)
© 1994- The Math Forum at NCTM. All rights reserved.