Associated Topics || Dr. Math Home || Search Dr. Math

### Collecting a Set of Coupons

```
Date: 02/03/2000 at 09:32:18
From: Amal
Subject: Probability

Suppose that there are N different types of coupons and each time one
obtains a coupon it is equally likely to be any one of the N types.

a) Find the expected number of different types of coupons that are
contained in a set of (n) coupons

b) Find the variance and the expected number of coupons one needs to
collect before obtaining a complete set of at least one of each type.

c) Find the variance of the random variables in both case (a) and (b).
```

```
Date: 02/04/2000 at 08:37:59
From: Doctor Anthony
Subject: Re: Probability

Hi Amal,

a) Expected Number of New Numbers in a Sample
-------------------------------------------
A simple model will be that of throwing a die 6 times (n = 6) and
finding the probabilities of N = 1, 2, ..., 6 different numbers.

This can further be modeled by thinking of distributing 6 balls into
N = 1, 2, 3, 4, 5 and 6 urns.

To deal with equi-probable outcomes we must use the T(n,m) function.

Suppose we have n numbered balls and m numbered cells, n > m. Then the
function T(n,m) is the number of ways we can distribute the balls into
the cells such that no cell is empty.

The required result is given by

n!
T(n,m) = SUM[-----------------]   ...........................(1)
n1!n2!n3!...n(m)!

where n1 = number of objects in cell(1), n2 = number of objects in
cell(2) etc.

The sum is taken over all solutions of:

n1 + n2 + n3 + ... + n(m) = n   [and n(i) NOT= 0]

With no cell empty expression (1) is denoted by T(n,m)

For example:

3!      3!
T(3,2) = ----- + ----- = 3 + 3 = 6
1! 2!   2! 1!

4!         4!         4!
T(4,3) = -------- + -------- + -------- = 12 + 12 + 12 = 36
1! 1! 2!   1! 2! 1!   1! 1! 2!

We have the following recurrence relation

T(n,m) = m[T(n-1,m-1) + T(n-1,m)]

For example:

T(5,3) = 3[T(4,2) + T(4,3)]

Think of this as the number of ways of distributing 5 objects into 3
cells so that no cell is empty.

If we have already distributed 4 objects, the 5th object can be placed
in one of the cells in 3 ways. Then there are 2 situations to
consider.

The chosen cell was empty and the other 4 had been distributed in
T(4,2) ways. (This is okay because we are concerned that there should
be no empty cell AFTER the 5th ball has been placed.)

The chosen cell was occupied and the other 4 must then have been
distributed in T(4,3) ways.

It follows that

T(5,3) = 3[T(4,2) + T(4,3)]

The formula for T(n,m) is also given by the summation

m
T(n,m) = SUM[(-1)^k*C(m,k)*(m-k)^n]
k=0

Example:

T(5,3) = 3^5 - 3(2^5) + 3(1^5) = 243 - 96 + 3 = 150

The proof of this form of the expression follows from the
inclusion-exclusion principle and I have shown it at the end so as not
to get in the way of our problem with the number of different numbers
showing after we throw the die 6 times.

We can work through the values T(6,1), T(6,2) ... T(6,6) fairly
quickly.

Having calculated T(6,r) we must also multiply by C(6,r) to take
account of the r cells from 6 that we can choose.

T(6,1) = C(1,0) 1^6
= 1

T(6,2) = C(2,0) 2^6 - C(2,1) 1^6
= 62

T(6,3) = C(3,0) 3^6 - C(3,1) 2^6 + C(3,2) 1^6
= 540

T(6,4) = C(4,0) 4^6 - C(4,1) 3^6 + C(4,2) 2^6 - C(4,3) 1^6
= 1560

T(6,5) = C(5,0) 5^6 - C(5,1) 4^6 + C(5,2) 3^6 - C(5,3) 2^6
+ C(5,4) 1^6
=  1800

T(6,6) = C(6,0) 6^6 - C(6,1) 5^6 + C(6,2) 4^6 - C(6,3) 3^6
+ C(6,4) 2^6 - C(6,5) 1^6
= 720

So we have the following table:

| T(6,r)   C(6,r)   T(6,r) x C(6,r)
------|----------------------------------
r = 1 |     1        6            6
r = 2 |    62       15          930
r = 3 |   540       20        10800
r = 4 |  1560       15        23400
r = 5 |  1800        6        10800
r = 6 |   720        1          720
-------|----------------------------------
Total = 46656

and 6^6 = 46656 so this checks the arithmetic.

The probabilities are then each of the numbers divided by 6^6.

The expected number of new numbers is then:

1*6 + 2*930 + 3*10800 + 4*23400 + 5*10800 + 6*720
-------------------------------------------------
46656

= 3.9906

The calculation of E(x^2) is then

1^2*6 + 2^2*930 + 3^2*10800 + 4^2*23400 + 5^2*10800 +6^2*720
------------------------------------------------------------
46656

= 16.5304

And Var(x) = 16.5304 - 3.9906^2

=  0.60559

Formula for T(n,m) using Inclusion-Exclusion
--------------------------------------------
Suppose we let |A| be the number of ways that cell(1) is empty, |B| is
the number of ways that cell(2) is empty and |C| is the number of ways
that cell(3) is empty. |U| is the universal set and is the total
number of ways of distributing 5 different objects into 3 cells, =
3^5.

If

S(0) = |U|
S(1) = |A| + |B| + |C|
S(2) = |AB| + |BC| + |CA|
s(3) = |ABC|

Then

|A'B'|   = S(0) - S(1) + S(2)
|A'B'C'| = S(0) - S(1) + S(2) - S(3)

which means that none of cell(1), cell(2) or cell(3) is empty, and
clearly this idea can be extended to any number of cells.

The general formula is:

m
|A'B'C'D'E'...M'| = SUM[(-1)^k*S(k)]
k=0

Which is the formula for T(n,m) quoted above.

Example:

Find T(5,3)

We have

S(0) = 3^5             = 243
S(1) = 2^5 + 2^5 + 2^5 =  96
S(2) = 1 + 1 + 1       =   3
S(3) = 0               =   0

and applying the formula for inclusion exclusion we get

|A'B'C'| = 243 - 96 + 3 - 0 = 150

as we found before.

The general formula to use for n objects and m cells is:

m
T(n,m) = SUM[(-1)^k*C(m,k)*(m-k)^n]
k=0

b) Find the variance and the expected number of coupons one needs to
collect before obtaining a complete set of at least one of each type.

Here you will want to refer to my answer to "The Probable Pen in the
Cereal Box" in the Dr. Math archives:

http://mathforum.org/dr.math/problems/kuropatwa.11.25.99.html

c) Find the variance of the random variables in both case (a) and (b)

To find the variance you would need to find the value of E(x^2) for
each new number, e.g.

E(x^2) = 1^2(5/6) + 2^2(1/6)(5/6) + 3^2(1/6)^2*(5/6) + ...

= (5/6)[1 + 4(1/6) + 9(1/6)^2 + 16(1/6)^3 + ...]

The method for finding the sum of such a series is as follows:

S = 1 + 2^2.x + 3^2.x^2 + 4^2.x^3 + ...

INT[S.dx} = x + 2x^2 + 3x^3 + 4x^4 + ...

= x[1 + 2x + 3x^2 + 4x^3 + ...]

= x/(1-x)^2     from earlier part of the question

and so to find the series S we differentiate this expression:

(1-x)^2 + 2x(1-x)   1-x + 2x     1+x
S = ----------------- = -------- = -------
(1-x)^4        (1-x)^3    (1-x)^3

putting x = 1/6 we get

S = (7/6)/(5/6)^3
and
E(x^2) = (5/6)(7/6)/(5/6)^3

= (7/6)/(5/6)^2

= 42/5
and
var(x^2) = 42/5 - (6/5)^2

= 174/25

and similar calculations for other variances.

- Doctor Anthony, The Math Forum
http://mathforum.org/dr.math/
```
Associated Topics:
High School Probability

Search the Dr. Math Library:

 Find items containing (put spaces between keywords):   Click only once for faster results: [ Choose "whole words" when searching for a word like age.] all keywords, in any order at least one, that exact phrase parts of words whole words

Submit your own question to Dr. Math
Math Forum Home || Math Library || Quick Reference || Math Forum Search