The Birthday Problem

From Math Images

Jump to: navigation, search
Image:inprogress.png

The Birthday Problem

How many people do you need in a room before it is more than likely that at least two of them have the same birthday? This question is the original Birthday Problem, a common probability problem that stumps many people. In 1970, Johnny Carson tried, and failed, to solve the birthday problem on The Tonight Show.

This page investigates (and solves!) the birthday problem, in addition to some similar puzzles.


Contents

Basic Description

When asked to solve the birthday problem, many people make initial guesses that are much higher than the actual solution. A common answer to the birthday problem is 183 people, which is simply 365 (the number of days in a year), divided by 2, rounded up to the nearest whole number. However, we will later show that the actual solution is a much smaller number.


Figure 1. Roy Murphy's graph of predicted v. actual birthdays
Figure 1. Roy Murphy's graph of predicted v. actual birthdays

When solving this problem we must make two assumptions:

  1. There are always 365 days in a year (disregard leap years)
  2. All birthdays are equally likely to occur

Wait! What if all birthdays are not equally likely to occur? Surely some birthday months must be more common than others.

Roy Murphy ran an analysis of the distribution of birthdays by collecting data from 480,040 insurance policy applications made between 1981 and 1994.His results are shown to the right in figure 1. [1] *****Roy Murphy has not responded to my e-mail
http://www.panix.com/~murphy/bday.html

Is this problem still solvable even though birthdays are not uniformly distributed?
In their study Majorization and the Birthday Inequality, M. Lawrence Clevenson and William Watkins state, "If the distribution of birthday is not uniform, then the probability of a match is at least as large as it is when the birthdays are uniformly distributed throughout the year. We call this fact the 'birthday inequality." If we agree with Clevenson and Watkind, assuming that all birthdays are equally likely will give us accurate results. For more on the birthday inequality and to read the full article, click here!

Combinations v. Permutations

It is also important to understand the difference between combinations and permutations.

Combinations

How many ways can you select k items from a group of n items? When selecting a combination of items, order does not matter. For example, if you pulling three marbles out of a bag that contains 10 marbles, 3 blue, 3 red, and 4 green, the order in which you choose the marbles is not important. Pulling a red, red, green is the same as pulling a red, green, red.
The formula for a combination is as follows
 \binom nk = \frac{n!}{k!(n-k)!} whenever k\leq n, and which is zero when k>n.[2]

Permutations


When order matters, the combination becomes a permutation. For example, suppose 5 students run a race. How many ways can the 5 students finish 1st, 2nd and 3rd? In this case, result Jenna, Mike, Emily is different from result Mike, Emily Jenna.
Because we do not have to consider identical results (the outcome of a red, blue and green marble v. the outcome of a red, green, and blue marble), the formula for a permutation is simpler:
 \binom nk =\frac{n!}{(n-k)!}


A More Mathematical Explanation

Note: understanding of this explanation requires: *Algebra, Probablity

In probability, the term more than likely means that the probability of the event happening is greater than 1/2. So we can say,

P_n \geq \frac{1}{2}

If the initial question is, "In a room with n people, what is the probability that at least 2 people share the same birthday?, then we set up the probability like this:

First, we will make the initial problem simpler by choosing a set n value. Say n=10. This probability problem is actually quite easy to solve if we first solve the complementary probability. The complementary probability is the probability that the original probability does not happen. For example, if the original probability is the chance that we get tails when we throw a coin the complementary probability is the probability that we do not get tails. The complementary probability is always equal to 1 - original probability.


P_{10}=the probability that, in a room of 10 people, at least two people share the same birthday

The complementary probability to the birthday problem investigates the question, "In a room with 10 people, what is the probability that no two people share the same birthday?"We can set up this probability like this:

P'_{10}=the probability that, in a room of 10 people, NO two people share the same birthday

To solve the complementary probability (P'_{10}), we will first assume that n=10

How many birthday outcomes are possible if there are 10 people in the room?
C_{10} = 365\times 365 \times 365 \times 365 \times 365 \times 365 \times 365 \times 365 \times 365 \times 365 = 365^{10}
Note: the general solution to this problem is just C_{n} = 365^{n}
How many birthday combinations are possible if there are 10 people in the room AND no one has the same birthday?
C'_{10} = 365\times 364 \times 363 \times 362 \times 361 \times 360 \times 359 \times 358 \times 357 \times 356 = 3.706 \times 10^{25}

So, The probability that no two people have the same birthday:

P'_{10} =\frac{\text{outcomes where no one shares a birthday}}{\text{all possible outcomes}}=\frac{3.706 \times 10^{25}}{365^{10}}=0.883

Now that we have calculated the complementary probability, it is quite easy to calculate the actual probability. Recall that:

P_{10} = 1 - P'_{10}, so
P_{10} = 1 - 0.883 = 0.117

With ten people in the room the probability that at least two people in the room share the same birthday is 0.117.

If we test out other values for n, we end up with this table:

Image:Bettergraphtable1.jpg

We see that 23 is the first whole number where P_n \geq \frac{1}{2}


Another Similar Problem

Many people who are shocked by the answer to the birthday problem were actually thinking about this question: In a room with n people, what is the probability that someone in the room shares my birthday?

This problem is also easy to solve if we think about the complementary probability. Say,

P_n= the probability that, in a room with n people, someone shares your birthday, and the complementary probability
P'_n= the probability that, in a room with n people, no one shares your birthday


Once again, we want to first calculate the complementary probability. say n=23

P'_{23} =\frac{\text{possible outcomes where no one shares your birthday}}{\text{all possible birthday outcomes}}
Since it is still possible for two other people in the room to share the same birthday, the numerator in this problem is constant (unlike the first example). In this case,
P'_{23}= \frac{364}{365} \times \frac{364}{365} \times \frac{364}{365} \times \frac{364}{365} \text{...}= \left(\frac{364}{365}\right)^{23}= 0.9388

Note: the general solution to this problem is just P'_n=\left(\frac{364}{365}\right)^n

So the probability that, in a room of 23, someone shares your birthday:

P_{23} = 1 - P'_{23} = 1 - 0.9388 =0.0612

In a room of 23, the probability that someone shares your birthday is quite low. We can solve for n to see how many people need to be in the room before it is more than likely that someone shares your birthday.

Recall that in probability, the term more than likely means that the probability of the event happening is greater than 1/2. So we can say,

P_n \geq\frac{1}{2}
If P_n = 1 - P'_n , then P'_n \leq \frac{1}{2}
P'_n= \frac{364}{365} \times \frac{364}{365} \times \frac{364}{365} \times \frac{364}{365} \text{...}= \left(\frac{364}{365}\right)^{n}, so
\left(\frac{364}{365}\right)^{n} \leq \frac{1}{2} , and take the  \ln of both sides so
 n \times \ln \frac{364}{365} \leq \ln \frac{1}{2}
Since  \ln \frac{364}{365} is negative, we must switch the sign when we divide:
n \geq \frac{\ln \frac{1}{2}}{\ln \frac{364}{365}}
 n = \frac{\ln \frac{1}{2}}{\ln \frac{364}{365}}= 252.651989 \approx 253 people
There needs to be 253 people in the room before it is more than likely that someone in the room shares your birthday.


The Almost Birthday Problem

What is the probability that, in a room with n people, there are two people who have birthdays within r days of each other?

The formula to find the probability that, in a room with n people, there are two people who have birthdays within r days of each other is

 P_n(r)= 1 - \frac{(365-1-nr)!}{365^{n-1}(365-(r+1)n)!}

The proof for this formula is more difficult. In his book, Understanding Probability: Chance Rules in Everyday Life, author Henk Tijms directs his readers to P. Diaconis and F. Msteller's Methods for Studying Coincidences for a detailed explanation. To learn more, click here!


Cool Applications

Below are some links to interactive programs involving the birthday problem. You will need to have the Wolfram Alpha Demonstration Project installed on your computer to view these applications!


Why It's Interesting

The Birthday Problem is one of many problems in probability that demonstrates that certain "rare" events really are not all that rare. People are often fascinated by certain events in nature, casinos, and sporting events. When you can understand the math behind certain scenarios like the birthday problem and The Monty Hall Problem you can better understand the events that occur around you on a daily basis.


Teaching Materials

There are currently no teaching materials for this page. Add teaching materials.




References

  1. [ http://www.panix.com/~murphy/bday.html "An Analysis of the Distribution of Birthdays in a Calendar Year"], Retrieved on 16 July 2012.
  2. [ http://en.wikipedia.org/wiki/Combination "Combination"], Retrieved on 16 July 2012.

Future Directions for this Page

Finishing this Section:

More than 2 people

What is the probability that at least 3 people in the room are born on the same day?

What is the probability that, in a room of n people, at least three people share a birthday? We are going to call this Sn where n is the number of people in the room.

Let's first assume that n=3 Let

S_3= Probability that, in a room with 3 people, at least 3 people share the same birthday. In this specific case, it is also the probability that everyone in the room shares the same birthday.
S'_3= probability that, in a room with 3 people, no three people share a birthday

To start, we must consider all possibilities.

1. That no two people are born on the same day. Let’s call this “1 1 1”
2. That two people share a birthday. Let’s call this “2 1”
3. That all three people share a birthday. Let’s call this “3”

For these probabilities we are going to use the notation P(n,k) where n is the number of people in the room and k is the number of pairs of people who share a birthday. Let’s calculate the probability of possibility 3 (which is equal to P (3,3)). This is simply

 \frac{365}{365} \times \frac{1}{365} \times \frac{1}{365} = \frac{1}{365 \times 365} =  0.0000075 So the probability that, in a room with 3 people, 3 people share the same birthday is practically zero.


General Formula?

In number theory, a partition of n is a way of writing n as a sum of positive integers. Order is not important in partitions. For example: 2 1 1 is considered the same as 1 2 1. A composition is a partition in which the order is important. [3]

When we calculated P (3,3) above, we were looking at partitions. To find any probability, we must sum the probabilities of the relevant partitions. To solve this problem (that, in a room with n people, at least three people share the same birthday), any partition that involves integers greater than or equal to 3 would be summed to get the direct probability, P(3,3). All other partitions (those only involving integers 1 and 2) are summed to get the complementary probability, P'(3,3) the probability that, in a room with n people, no 3 people share a birthday

If we look at the partitions for n=3 in the chart above we see that there is 1 partition that corresponds to the direct probability (in red), and 2 partitions that correspond to the indirect probability (blue). For n=3 it is easier to just find the probability of the 1 partition. However, the chart shows that, as n increases, the number of direct partitions (red) increases much faster than the number of indirect partitions. This means that computing the direct probability will be much more time consuming for greater n values. For this reason, we will compute ‘’n’’ values greater than 3 by using the complementary probability.

Let’s find a recursive formula to find the probability of at least 3 birthdays that works for any n value! A recursive formula is based on previous calculations. In this case, the recursive formula will be based on 2 previous probabilities.

Let’s start with complementary probabilities from the beginning: We will continue using the notation P (n,k) where n represents the number of people in the room and k represents the number of pairs of people with the same birthday.

What if n=1?

  • What is the probability that, in a room with 1 person, no pair of people share a birthday ? 1 is the only partition of 1.
The answer is obviously 1. So we can say P(1,0)=1
  • What is the probability that, in a room with 1 person, that one pair of people share a birthday?
The answer to this question is obviously 0. There is only one person in the room so how can we have a pair?

So we can say that P(1,1)=0.

Next, let’s look at n=2.

  • What is the probability that, in a room with 2 people, no pair of people share a birthday? There are two partitions of 2: 1 1 and 2. Here we are calculating P(2,0)
This probability only concerns partition 1 1.
This is equal to
 \frac{365}{365} \times \frac{364}{365} = 0.997
So P(2,0)= 0.997
  • What is the probability that, in a room with 2 people, one pair of people share a birthday? Here we are calculating P(2,1)
This probability only concerns partition 2.
 \frac{365}{365} \times \frac{1}{365} = 0.0027
So P(2,1)=0.0027

Note: P(2,2)=0, P(2,3)=0, etc. so we do not have to worry about these probabilities.

Now let's jump to n=5: Let's look at the partitions of 5.

(a) 1 1 1 1 1
(b) 2 1 1 1
(c) 2 2 1
(d) 3 1 1
(e) 3 2
(f) 4 1
(g) 5
  • Here we have 3 partitions that involve integers less than 3 (a, b, and c), and 4 partitions that involve integers greater than or equal to 3 (d, e, f, and g).

Let’s compute the probabilities of partitions a, b, and c.

(a) 1 1 1 1 1

  • In this case everyone has a different birthday. It can be represented P(5,0). This is computed by
\frac{365}{365} \times \frac{364}{365} \times \frac{363}{365} \times \frac{362}{365} \times \frac{361}{365} = 0.973

(b) 2 1 1 1

  • In this case there is one pair of birthdays. It can be represented P(5,1). We can think of this as the sum of two probabilities:
  1. The probability that, among the first 4 people there is already one pair of birthdays (and therefore the fifth person has a different birthday from everyone before them)
Which can be represented as P_{(4,1)} \times C_1
  1. The probability that, among the first 4 people there are no pairs of birthdays (and therefore the fifth person must share a birthday with one of the first 4 people)
Which can be represented as P_{(4,0)} \times C_2 . So we can represent P(5,1) as
P_{(5,1)}= P_{(4,1)} \times C_1 + P_{(4,0)} \times C_2
C_1 = \frac{365-\text{the birthdays of the first 4 people}}{365} = \frac{365-3}{365} = 0.9917808
C_2 = \frac{\text{the birthdays of the first 4 people}}{365} = \frac{4}{365} = 0.0109589
Once we get P(4,1) and P(4,0) from the chart we are going to make we can say that.
P_{(5,1)}= P_{(4,1)} \times 0.9917808 + P_{(4,0)} \times 0.0109589
P_{(5,1)}= 0.01630349 \times 0.9917808 + 0.98364409 \times 0.0109589
P_{(5,1)}= 0.02694915


(c) 2 2 1

  • In this case there are two pairs of birthdays. It can be represented P(5,2). We can think of this as the sum of two probabilities:
  1. The probability that, among the first 4 people there are already two pairs of birthdays (and therefore the fifth person has a different birthday from everyone before them)
Which can be represented as P_{(4,2)} \times C_a
  1. The probability that, among the first 4 people there is one pair of birthdays (and therefore the fifth person must share a birthday with one of the other two people)
Which can be represented as P_{(4,1)} \times C_b . So we can represent P(5,2) as
P_{(5,2)}= P_{(4,2)} \times C_a + P_{(4,1)} \times C_b
C_a = \frac{365-\text{the two birthday pairs in the first 4 people}}{365} = \frac{365-2}{365} = 0.9945205479
C_b = \frac{\text{the two birthdays in the first 4 that have not been paired}}{365} =\frac{2}{365} = 0.0054794521
Once we get P(4,2) and P(4,1) from the chart we are going to make we can say that.
P_{(5,2)}= P_{(4,2)} \times 0.9945205479 + P_{(4,1)} \times 0.0054794521
P_{(5,2)}= 0.01630349 \times 0.9945205479 + 0.98364409 \times 0.0054794521
P_{(5,2)}= 0.02694915




Now we have all parts of the complementary probability (S'(5)) for n=5

S'_5= P_(5,0) + P_(5,1) + P_(5,2)
S'_5= 0.973 + 0.0027 + 0.0000074 = 0.976
S_5= 1 - S'_5
S_5 = 1- CHECK = PROBABILITY

^I'm not getting the right answer here...





If you are able, please consider adding to or editing this page!

Have questions about the image or the explanations on this page?
Leave a message on the discussion page by clicking the 'discussion' tab at the top of this image page.






Personal tools