Probability of Random Samples by Complementarity, Inclusion, and ExclusionDate: 08/09/2012 at 08:09:12 From: Abe Subject: Probability of drawing random sample containing ... Suppose you have a set S with N >= 2 different elements: S = {A_1, A_2, A_3, ..., A_N}. From this set, you draw a random sample of size k, with replacement. What is the probability that you draw a sample which at least contains A_1 and A_2? Other elements may also be in the sample, but these two should at least be in there. I distinguish two cases: Case (1) Only A_1 and A_2 occur in the sample. In this case all the samples should be either A_1 or A_2, and each of these elements should occur at least once. We therefore have to "distribute" the samples over two bins -- A_1 and A_2 -- such that each bin is non-empty. This requires Stirling numbers of the second kind. Since the labels of the bins can also be switched, we have to multiply by 2! to find the number of possible ways you can obtain a sample containing only these two. This number is given by ... S(k, 2) * 2! ... where S(k, 2) are Stirling numbers of the second kind. Case (2) Now I would tackle it the same way but with 3 bins: 1. A_1 2. A_2 3. other elements Again, Stirling numbers can be used: S(k, 3). From here on, I am stuck, as I am not able to figure out what to do with the "other elements" bin. Once you find the number for case (2), you only need to add the number of case (1) and divide by the total number of possibilities (N^k). But how do you calculate the probability if your sample does not contain just A_1 and A_2? Any help would be very much appreciated! Date: 08/09/2012 at 12:12:59 From: Doctor Schwa Subject: Re: Probability of drawing random sample containing ... Hi Abe, When you get stuck in some complicated casework like this, it's always a good idea to look at the idea of complementary probability. "Contains one or more A_1 and one or more A_2" gets tricky because of the "or more" part. "Contains no A_1, or contains no A_2" is a lot simpler, because the probability of containing none is much more straightforward. Is that enough of a hint to get you started? There's still a little bit of complication here that you'll have to deal with, because of potentially overcounting the things that contain no A_1 and also no A_2. Feel free to write back if you still have questions! - Doctor Schwa, The Math Forum http://mathforum.org/dr.math/ Date: 08/09/2012 at 18:18:40 From: Abe Subject: Probability of drawing random sample containing ... Dear Doctor Schwa, Thank you for this valuable suggestion. In fact, it was also the first thing that came to my mind! In the case of a regular "at least" type of question, this would be easy: say you want to know the probability that you draw A_1 at least once. This would be equal to 1 - probability of not drawing A_1 at all (which is the complement) This is equal to 1 - (1 - (1/N))^k. However, when you have 2 "at least" conditions, the probability of the complement is not so easy to calculate. In my case, the complement of drawing at least one A_1 AND at least one A_2, is given by drawing (a) neither A_1 nor A_2 OR (b) at least one A_1 but not A_2 OR (c) at least one A_2 but not A_1. The probability of drawing neither A_1 or A_2 is again easy and is given by 1 - (1 - (2/N))^k. However, the other two probabilities (which should be the same), I cannot find. How do you find the probability of drawing at least one A_1 but not A_2? I hope I did not misinterpret your explanation or make any errors in mine. Kind regards, Abe Date: 08/09/2012 at 22:28:32 From: Doctor Schwa Subject: Re: Probability of drawing random sample containing ... Hi Abe, Sounds like you're on the right track here! One technique here is to find the probability you describe as (b) by finding the probability of at least one A_1 and subtracting the probability of (at least one A_1 and at least one A_2), which you conveniently found in part (a). Once you do that, you'll see the Venn diagram shortcut -- also known as the principle of inclusion and exclusion -- when you generalize it to more than two sets. Your final answer is (probability of at least one A_1) + (probability of at least one A_2) - (probability of both) - Doctor Schwa, The Math Forum http://mathforum.org/dr.math/ Date: 08/10/2012 at 03:57:17 From: Abe Subject: Probability of drawing random sample containing ... Dear Doctor Schwa, I'm afraid there might be a mix up: Remember that I'm searching for the probability of at least one A_1 and at least one A_2. What I found in (a) was the probability of no A_1 and no A_2. Consequently, I cannot use it here to calculate (b), so it looks like I'm back to square one. There does not appear to be an easy way out. I guess I will need the Stirling numbers (second kind) after all. Kind regards, Abe Date: 08/10/2012 at 04:43:48 From: Doctor Schwa Subject: Re: Probability of drawing random sample containing ... Hi Abe, You can calculate P(no A_1) and P(no A_2). You can also calculate P(no A_1 and no A_2). My claim was that P(at least one A_1 and at least one A_2) = 1 - P(no A_1 and/or no A_2) = 1 - [P(no A_1) + P(no A_2) - P(no A_1 and no A_2)] Here, the subtraction accounts for overcounting. Am I making any sense? - Doctor Schwa, The Math Forum http://mathforum.org/dr.math/ Date: 08/10/2012 at 05:59:08 From: Abe Subject: Thank you (Probability of drawing random sample containing ...) Dear Doctor Schwa, You are making complete sense! This was indeed what I was looking for. The probability is therefore given by: 1 - 2(1 - (1/N))^k + (1 - (2/N))^k Thank you very much for helping me out! |
Search the Dr. Math Library: |
[Privacy Policy] [Terms of Use]
Ask Dr. Math^{TM}
© 1994-2015 The Math Forum
http://mathforum.org/dr.math/