Central Limit Theorem
Date: 03/08/2002 at 11:04:34
From: Stephanie Sparks
Subject: Probability of "at least"

Here's the problem: The probability that a drug is effective on any one patient is 62%. Find the probability that, of the next 200 patients, at least half (or more) will survive.

I've figured that the probability of the patient dying is 38%, and out of the next 200 patients, we want to know if at least 100 or more people will survive, exactly 100 or more successes in 200 trials, but I don't know what to do from there.

Thanks,
Stephanie
Date: 03/08/2002 at 17:47:27
From: Doctor Jubal
Subject: Re: Probability of "at least"

Hi Stephanie,

Thanks for writing to Dr. Math.

The most straightforward way to do this, though one that would take a lot of time, is to recognize that the number of successes in N trials has a binomial distribution. The probability of exactly k successes is

    P(k) = [N! / (k! (N-k)!)] * p^k * (1-p)^(N-k)

where p is the probability of success on a single trial. You could use this to calculate P(100), P(101), P(102), and so on to P(200), and then add all these probabilities up, and you'd have the exact answer. It would also take you far more time than you probably want to devote to this problem.

One of the most useful theorems of statistics is the Central Limit Theorem, which says that if you take a large number of independent random variables and add them together, then no matter what the form of the probability distributions of the individual variables is (as long as each has a finite variance), the sum will have a distribution that is approximately Gaussian.

This is nice because it lets us approximate a binomial distribution (with a "large enough" value of N) as a Gaussian one. The number of successes in N trials is really the sum of N independent Bernoulli random variables (each trial has a Bernoulli distribution), so as the number of trials gets large, the binomial distribution becomes more or less Gaussian.

Because the Gaussian distribution is continuous (whereas the binomial distribution is discrete), it lets us do an integral instead of the sum P(100) + P(101) + P(102) + ... + P(200). Best of all, the values of the integral we need have already been calculated and put in the back of almost any mathematical handbook or statistics text.

The Gaussian distribution that best approximates a binomial distribution has mean Np and variance Np(1-p), which makes a lot of sense because Np and Np(1-p) are the mean and variance of the binomial distribution itself.
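The brute-force sum P(100) + P(101) + ... + P(200) described above is tedious by hand but quick for a computer. Here is a minimal sketch in Python, using the values from the question (N = 200, p = 0.62). The function names binomial_pmf and prob_at_least are illustrative and not part of the original reply.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials,
    each with success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def prob_at_least(k_min, n, p):
    """Probability of at least k_min successes: add P(k) for k = k_min..n."""
    return sum(binomial_pmf(k, n, p) for k in range(k_min, n + 1))

# Values taken from the question: 200 patients, 62% chance the drug works.
exact = prob_at_least(100, 200, 0.62)
print(exact)
```

This gives the exact binomial answer, which can be compared against the Gaussian approximation.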
So, since 62% of the patients survive, the mean number of survivors in 200 patients is (0.62)(200) = 124. Also, we can expect a variance in this value of (0.62)(0.38)(200) = 47.12, for a standard deviation of about 6.86.

As you said, we want to know the probability of at least 100 patients surviving, or in other words the probability of doing no worse than 24 fewer survivors than the mean. 24 is (24)/(6.86) = about 3.5 standard deviations.

Now, for a Gaussian distribution, what is the probability of falling no more than 3.5 standard deviations below the mean?

Does this help? Write back if you'd like to talk about this some more, or if you have any other questions.

- Doctor Jubal, The Math Forum
  http://mathforum.org/dr.math/
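The arithmetic in the reply, and the table lookup it leaves as the last step, can be checked numerically. The following is a minimal sketch in Python, assuming the standard library's math.erf is used to evaluate the Gaussian cumulative distribution in place of a printed table. The names normal_cdf, mean, std, z, and approx are illustrative.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Probability that a standard Gaussian variable is at most z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

n, p = 200, 0.62
mean = n * p                    # Np = 124
std = sqrt(n * p * (1 - p))     # sqrt(Np(1-p)) = sqrt(47.12), about 6.86

# How many standard deviations 100 survivors lies from the mean
# (negative, because 100 is below 124).
z = (100 - mean) / std

# Probability of landing at or above that point under the Gaussian approximation.
approx = 1.0 - normal_cdf(z)
print(mean, std, z, approx)
```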
© 1994- The Math Forum at NCTM. All rights reserved.