Combinations with Duplicate Objects
Date: 06/22/99 at 13:02:10 From: Michael Black Subject: Combinations: n_C_k when items are duplicated We all know the combination formula for choosing k objects from n items: n! n_C_k = ----------- k!(n - k)! Here is what's confounding our group: What if some of the n items are duplicated? For instance, 6_C_3 from this list (ABBCCC). We know the value should be lower than 6_C_3 = 20 (actually, it's 5), but can't figure out how to modify the formula correctly. Is it even appropriate to be using n_C_k for this problem? Please help! Thank you for your time.
Date: 06/22/99 at 16:55:28 From: Doctor Anthony Subject: Re: Combinations: n_C_k when items are duplicated This is a type of problem that is considerably more difficult than one in which all objects are different. We require two bits of theory that you may not have met before. The first concerns the number of ways of distributing objects into distinct cells, and the second is the inclusion-exclusion principle. Suppose we have an equation of the form x1 + x2 + x3 = 10 and we wish to find the number of integer solutions of this equation with each of x1, x2, x3 having values that can range from 0 to 10. We can model this problem by considering the number of ways that say 10 oranges could be distributed at random to 3 children (zero oranges being allowed to one or two of the children). We can illustrate the situation with * representing oranges and | | representing a container. Thus |*****|**|***| is a possible solution with x1 = 5, x2 = 2, x3 = 3 The number of solutions is the same as the number of ways we could arrange 10 *'s and 2 |'s in every possible way. 12! This is clearly seen to be ------ = C(12,10) 10! 2! The two |'s represent the containers, and the number of | in any such line-up is always 1 less than the number of containers. So with r oranges and k children the result would be (r+k-1)! ----------- = C(r+k-1,r) r! (k-1)! and similarly the number of solutions to the equation x1 + x2 + x3 + ..... + xk = r is simply C(r+k-1,r) So far so good, and this is okay provided there are no restrictions on the x1, x2, values except that they must be less than r. When there are restrictions on the individual x values then we must subtract from C(r+k-1,r) the number of solutions where these x values are exceeded. This is where the inclusion-exclusion principle comes in. Inclusion-Exclusion Principle ------------------------------ It is easy to illustrate the principle with two or three probabilities using a Venn diagram. For example with two overlapping probabilities P(A) and P(B), we have P(A or B) = P(A) + P(B) - P(A and B) The overlapping region is added with P(A) and added again with P(B), so to get it added once only we subtract it, giving rise to the expression shown above. With three overlapping probabilities P(A), P(B) and P(C) we get P(A or B or C) = P(A) + P(B) + P(C) - P(A and B) - P(B and C) - P(C and A) + P(A and B and C) now the overlap of all 3 probabilities is added 3 times by the first line of this equation, subtracted 3 times by the second line of the equation and so must be added again by the third line. The area corresponding to A and B BUT NOT C is added twice in the first line and subtracted once in the second line, so it will appear once as is required. Similarly for area B and C but not A, and the area corresponding to C and A but not B, each will eventually appear once only. With more probabilities the pattern continues and the proof by induction is not difficult but a little tedious. The name 'inclusion-exclusion' describes exactly what is happening. Regions are being added a number of times and subtracted a number of times in such a way that each region occurs exactly once. The same ideas apply when we are talking about the NUMBER of elements in a set. So n(A or B or C) = n(A) + n(B) + n(C) - n(A and B) - n(B and C) - n(C and A) + n(A and B and C) and finally the number such that none of A, or B or C occur is |~A and ~B and ~C| = |U| - n(A or B or C) (where |U| is the universal set) = |U| - SUM[n(A)] + SUM[n(A and B)] - n(A and B and C) I shall now work through an example in which both of the above ideas are put into action. Determine the number of combinations of 10 letters that can be formed from 3.A's, 4.B's, 5.C's If there are no restrictions on any of the letters, this is simply our 10 oranges and 3 children problem again, and we saw there that the number of possible solutions is C(12,10) = 66 This also corresponds to the universal set |U| Let P1 be the property that a 10-combination has more than 3 A's P2 " " " 4 B's P3 " " " 5 C's If Ai is the NUMBER of 10-combinations which have property Pi (i=1,2,3) Then |~A and ~B and ~C| = 66 - SUM[A1] + SUM[A1 and A2] - A1 and A2 and A3 The set A1 consists of all 10-combinations in which A occurs 4 or more times. If we take any one of these 10-combinations in A1 and remove 4 A's we are left with a 6-combination, so the number of unwanted 10-combinations in which P1 occurs is equal to the 6-combination without a limit on the numbers of letters. So |A1| = C(6+3-1,6) = C(8,6) = 28 and similarly |A2| = C(7,5) = 21 |A3| = C(6,4) = 15 The set A1 and A2 consists of all 10-combinationsin which A occurs at least 4 times and B occurs at least 5 times. If from any of these 10-combinations we remove 4 A's and 5 B's we are left with a 1-combination of the unrestricted set. So |A1 and A2| = C(1+3-1,1) = C(3,1) = 3 Similarly |A2 and A3| = 0 and |A3 and A1| = C(0+3-1,0) = C(2,0) = 1 also |A1 and A2 and A3| = 0 Putting all these results together we get |~A and ~B and ~C| = 66 - (28+21+15) + (3+0+1) - 0 = 6 So there are just 6 of the 10-combinations that can be chosen with the restrictions on the numbers of letters. This type of question is often very conveniently answered using generating functions. To answer the problem that I worked through at some length, the number of 10-combinations that can be made from 3 A's, 4 B's, 5 C's is found from the coefficient of x^10 in the expansion of: (1+x+x^2+x^3)(1+x+x^2+x^3+x^4)(1+x+x^2+x^3+x^4+x^5) With the aid of Mathcad or some such computer assistance we get the coefficient of x^10 is 6 in a few seconds. In effect the computer does the thinking for you. To get the coefficient of x^10 without computer assistance we can ease the algebra a little by expressing the various brackets as sums of geometric series. 1+x+x^2+x^3 = (1-x^4)/(1-x) 1+x+x^2+x^3+x^4 = (1-x^5)/(1-x) 1+x+x^2+x^3+x^4+x^5 = (1-x^6)/(1-x) If we multiply these all together we get (1-x^4)(1-x^5)(1-x^6)[1-x]^(-3) [1-x^4 -x^5 -x^6 +x^9 +x^10 +x^11 -x^15] [1+3x+6x^2+10x^3+15x^4+21x^5+28x^6+36x^7+45x^8+55x^9+66x^10+... Picking out the coefficient of x^10 we get 66-28-21-15+3+1 = 6 Finally to answer the question you asked about, we need the number of 3-combinations taken from 1 A, 2 B's, 3 C's. The generating function is (1+x)(1+x+x^2)(1+x+x^3) = 1 + 3x + 5x^2 + 6x^3 + 5x^4 + + 3x^5 + x^6 and since the coefficient of x^3 is 6 we know that there will be 6 combinations of 3 letters. This differs from the answer you quoted. The actual combinations are ABC, ABB, ACC, BBC, BCC, CCC - Doctor Anthony, The Math Forum http://mathforum.org/dr.math/
Search the Dr. Math Library:
Ask Dr. MathTM
© 1994-2013 The Math Forum