Interesting Counting Probability Problem
Date: 11/19/2004 at 23:12:25 From: john Subject: Probability of an election In an election two candidates, Atif and Bryan, have in a ballot box a and b votes repectively with a > b, for example 3 and 2. If ballots are randomly drawn and tallied, what is the chance that at least once after the first tally the candidates have the same number of tallies? The lack of given values makes this very difficult. I think that this is a hyper-geometric distribution, but I'm probably wrong.
Date: 11/20/2004 at 10:01:13 From: Doctor Jacques Subject: Re: Probability of an election Hi John, There is a nice trick for solving this problem. It is attributed to D. André (1840 - 1917). The number of possible tallying processes is the number of ways of choosing (a) positions out of (a+b), i.e.: N = C(a+b,a) = (a+b)!/(a!*b!)  We can represent the process of counting the votes as a graph, where the x-axis is the number of votes counted so far, and the y-axis is the difference between the votes for A and those for B. The graph is a "sawtooth" line: at each step (when one vote is counted), the difference (y) changes by +1 or -1. We want to count the number of possible lines where the difference reaches 0 at some point for x >= 1, i.e., the number of lines that cross the x-axis at least once for x > 0. Every line starts from (0,0) and ends at (a+b, a-b), where a - b > 0. There are three possible types of lines: (1) The first vote is for A, and the line never crosses the x-axis. Let us call n1 the number of such lines. (2) The first vote is for A, and the line crosses the x-axis at least once. Let us call n2 the number of such lines. (3) The first vote is for B; in this case, the line always crosses the x-axis, since it starts below it and ends above it. Let us call n3 the number of such lines. As these cases are mutually exclusive and exhaust all possibilities, we have: N = n1 + n2 + n3  In cases (1) and (2), the line passes through (1,1), and, in case (3), the line passes through (1,-1). The cases where A and B have the same number of votes at some point are (2) and (3), and the probability we want is: (n2 + n3)/N  Consider a line of type (2). Such a line starts with (1,1) and reaches the x-axis for the first time at some point (k,0). If we reflect the portion of that line between x = 0 and x = k across the x-axis, we get a line of type (3). Conversely, given a line of type (3), such a line always crosses the x-axis, and, if (k,0) is the first crossing point, by reflecting the portion of the line between 0 and k, we get a line of type (2). We can therefore establish a bijection between lines of type (2) and (3), and this means that: n2 = n3  Now, n3 is simply the number of possible lines from (1,-1) to (a+b,a-b). This is the number of lines we would get by ignoring the first vote (for B); this corresponds to all the possible outcomes (in any order) where A gets (a) votes and B gets (b-1) votes, for a total of a+b-1. The number of such lines is therefore: n3 = C(a+b-1,a) = (a+b-1)! / (a!*(b-1)!) Can you continue from here? Please feel free to write back if you require further assistance. - Doctor Jacques, The Math Forum http://mathforum.org/dr.math/
Date: 11/20/2004 at 18:32:22 From: john Subject: Probability of an election Thanks for the quick reply, but I need some clarification. I understood most of your help and have concluded that the solution should be 2(C(a+b-1,a)) / C(a+b,a). However, I'm not sure that I understand the part of your explanation about reflecting the graph on the x-axis and why doing so proves that n2 = n3. Thanks for your help.
Date: 11/21/2004 at 03:23:05 From: Doctor Jacques Subject: Re: Probability of an election Hi again John, This may be easier to see on an example. Assume that a = 5 and b = 3. The total number of possible tallying orders is therefore: N = 8! / (5!*3!) = 56 All lines will start at (0,0) and end at (8,2), in increments of +1 or -1. One possible order could be: a a b a a b a b  and this would be represented by the line: x | 0 1 2 3 4 5 6 7 8 --+------------------ y | 0 1 2 1 2 3 2 3 2 a a b a a b a b where you notice that y increases for each "a" and decreases for each "b" (y is the number of "a" votes minus the number of "b" votes tallied so far). It is not that easy to draw curves in this kind of mail message, but let us give it a try: 3 | * * | 2 | * * * * | 1 | * * | 0 +---1---2---3---4---5---6---7---8 | a a b a a b a b In the terminology of my previous message, this is a "type 1" path: it starts at (1,1) and never crosses the x-axis, which means that A always leads. The number of such paths is n1. Another possible order could be: a a b b b a a a  which gives the line: 2 | * * | 1 | * * * | 0 +---1---2---3---*---5---*---7---8 | a a b b b b a a -1 | * This line starts at (1,1) and crosses the x-axis the first time at (4,0)--this is a "type 2" pattern, counted in n2. Note that, after 4 votes, the difference is 0, which means that there were the same number of votes for a and b in these 4 votes. If we interchange a and b in these 4 votes, the result is therefore the same, and this gives the order: b b a a b a a a  which we can represent as: 2 | * | 1 | * | 0 +---1---2---3---*---5---*---7---8 | b b a a b b a a -1 | * * * | -2 | * The part of this line between 0 and 4 is the mirror image (with respect to the x-axis) of the corresponding part of the previous line: in that section, we replaced y by -y. In particular, the line starts at (1,-1)--this is a type 3 line, counted in n3. The point is that we can do that operation for any type 2 sequence and obtain a uniquely defined type 3 sequence; conversely, as any type 3 sequence must cross the x-axis at least once, we can obtain a uniquely defined type 2 sequence by the same process. For example, starting with the type 3 sequence: b a a a b a b a  we notice that the sequence crosses the x-axis the first (and only) time at x = 2. If we interchange a and b in the first two votes, we get a corresponding uniquely defined type 2 sequence: a b a a b a b a  (I'll let you do the drawing this time). This shows that every type 2 sequence corresponds to a unique type 3 sequence, and conversely; the corresponding number of possible sequences (n2 and n3) are therefore the same. The interesting point is that the number of type 3 sequences (n3) is easy to compute: we simply have to count the number of _all_ possible paths from (1,-1) to (8,2), which corresponds to all the possible sequences in an election where A gets 5 votes and B gets 2 votes, since we discarded the first vote for B. The number of such sequences is: n3 = 7!/(5!*2!) = 21 and the probability of crossing the x-axis at least once is: (n2 + n3)/N = 2*n3/N = 42/56 If you write the binomial coefficients explicitly using factorials, you can reduce the above expression to a very simple expression in terms of a and b. Does this clarify things? - Doctor Jacques, The Math Forum http://mathforum.org/dr.math/
Date: 11/21/2004 at 08:30:33 From: john Subject: Probability of an election Wow! It all makes sense now. Thanks a lot for your help.
Search the Dr. Math Library:
Ask Dr. MathTM
© 1994- The Math Forum at NCTM. All rights reserved.