Discrete and Normal Distribution

Date: 01/10/2001 at 23:54:54
From: David Jordan
Subject: Birthday problem in a normal distribution

The traditional "birthday problem" of finding the probability that two or more people share a birthday assumes a uniform probability. How would one solve the problem if the distribution were a normal one, as in trying to find the probability that any two years had the same number of inches of rainfall?

Date: 01/11/2001 at 15:27:48
From: Doctor Schwa
Subject: Re: Birthday problem in a normal distribution

Hi David,

If it's really a normal distribution, it's a CONTINUOUS variable, so the probability of getting the same number (say, the 14.3687294... inches of rain we had here last year) is zero.

Of course, what you mean is "what do you do when the variable is discrete but approximately normal?"

Unfortunately the normality doesn't help much. What you need to do is, for each DISCRETE value of X, compute P(X)^2 and add them up. That is, P(1)*P(1) is the probability that both years had 1 inch of rainfall, P(2)*P(2) for 2 inches, and so on. So adding them gives the probability that both years had the same.

You need a DISCRETE probability distribution, not a continuous one, for this problem to make sense.

- Doctor Schwa, The Math Forum
  http://mathforum.org/dr.math/

Date: 01/12/2001 at 05:57:16
From: Doctor Mitteldorf
Subject: Re: Birthday problem in a Normal distribution

David,

Here are some more thoughts on the problem of two years with the same number of inches of rainfall.

The obvious way to formulate a discrete-distribution problem with this continuous distribution is to ask for the probability that the rainfall for two years is within some tolerance. Let's assume that you want to look for pairs of years in which the rainfall is within 1" of being the same. We also assume that the number of inches of rain is normally distributed.
(Footnote: normal distributions go from -infinity to +infinity; clearly, it's meaningless to talk about rainfall that is less than 0"; but often the left tail of the normal distribution decreases fast enough that the probability is already very close to zero before the number of inches becomes negative. So I'll talk about integrals from -infinity to +infinity, and you'll know what I mean.)

Let's review the shared birthday problem: For any two people, the probability that they DON'T share a birthday is 364/365. Add a third person, and the probability that he doesn't share a birthday with either of them is 363/365. Continuing in this way, you can see that with N people, the probability that no 2 of them share a birthday is given by:

$$P(N) = \frac{364 \cdot 363 \cdot 362 \cdots (366-N)}{365^{N-1}}$$

Returning to the rainfall problem: What is the probability that two years don't have the same number of inches, to within the 1" tolerance? You can find this by integrating over the two Normal distributions, linking the limits of the integrals to ensure that the two amounts are at least 1" apart. Let's assume that year 2 has a greater rainfall than year 1. We'll write N(x) for the density of the particular normal distribution that applies to this region, with the correct mean and standard deviation, with x the rainfall in year 1 and y the rainfall in year 2. Then the double integral is:

$$\int_{-\infty}^{\infty} \int_{x+1}^{\infty} N(x)\,N(y)\,dy\,dx$$

Take this double integral and multiply by 2 to correct for the assumption that year 2 has greater rainfall than year 1 (since it's just as likely that year 1 has the larger rainfall). This double integral has no closed form in elementary functions, but it's an easy task numerically.

Now let's look at a third year. We'll assume that the amount of rain is ordered: year 3 > year 2 > year 1.
Then the probability that all three years differ by more than an inch is a triple integral:

$$\int_{-\infty}^{\infty} \int_{x+1}^{\infty} \int_{y+1}^{\infty} N(x)\,N(y)\,N(z)\,dz\,dy\,dx$$

Take this triple integral and multiply by 6 to correct for the assumption that the years are ordered year 3 > year 2 > year 1. There are 6 such orderings, all equally likely. Again, this triple integral can be evaluated numerically with no difficulty.

And it is not difficult to generalize to the case of N years: we'll have N nested integrals, and the result will be multiplied by N!, just as the two-year integral was multiplied by 2 and the three-year integral by 6.

There is a practical problem, however, in the fact that high-dimensional integrals quickly become intractable. A 6-dimensional integral is a real challenge, and numerical evaluation of a 10-D integral is not conceivable. I am confident that there are methods to surmount this problem, which offer good approximations to these multiple integrals over the normal function, but for these you'll need a more experienced statistician than myself.

- Doctor Mitteldorf, The Math Forum
  http://mathforum.org/dr.math/
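
One way around the high-dimensional integrals is simulation rather than quadrature. The sketch below is a Monte Carlo estimate, a swapped-in technique rather than the nested integration described above: it draws the yearly totals from one normal distribution, sorts them, and counts how often every neighbouring gap exceeds the tolerance. For two years the estimate can be checked against an exact value, because the difference of two independent normal totals is itself normal with standard deviation sigma*sqrt(2). The function names, the mean of 20 inches, and the standard deviation of 5 inches are illustrative assumptions, not values from the discussion, and negative draws are tolerated in the spirit of the footnote.

```python
import math
import random


def normal_cdf(z):
    """Standard normal cumulative distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_two_years_apart(sigma, tol=1.0):
    """P(|X - Y| > tol) for independent X, Y drawn from Normal(mu, sigma).

    X - Y is Normal(0, sigma * sqrt(2)), so the mean mu drops out.
    """
    return 2.0 * (1.0 - normal_cdf(tol / (sigma * math.sqrt(2.0))))


def prob_all_apart_mc(n_years, mu, sigma, tol=1.0, trials=100_000, seed=0):
    """Monte Carlo estimate that all n_years totals differ pairwise by more than tol.

    mu and sigma are assumed (hypothetical) rainfall parameters in inches.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    hits = 0
    for _ in range(trials):
        totals = sorted(rng.gauss(mu, sigma) for _ in range(n_years))
        # Once sorted, it is enough to check the gaps between neighbours.
        if all(b - a > tol for a, b in zip(totals, totals[1:])):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # Hypothetical climate: mean 20 inches, standard deviation 5 inches.
    print("2 years, exact:     ", prob_two_years_apart(sigma=5.0))
    print("2 years, simulated: ", prob_all_apart_mc(2, mu=20.0, sigma=5.0))
    print("10 years, simulated:", prob_all_apart_mc(10, mu=20.0, sigma=5.0))
```

The ten-year call is the case that the nested integrals make impractical; here it costs no more than a loop, and the statistical error of the estimate shrinks roughly like 1/sqrt(trials).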
© 1994- The Math Forum at NCTM. All rights reserved.
http://mathforum.org/dr.math/