Views expressed in these public forums are not endorsed by
NCTM or The Math Forum.



SUMMARY *at least* Y *in a row* probability question
Posted:
Aug 13, 1996 9:46 AM


Thanks to all who helped me with my probability question. I know much more about the problem now. Special thanks to Tim Firman, who spent much time on the problem and on answering my questions. I also thank Jim Landis, whose solution appears first below. His is the one I implemented both in C (recursive and nonrecursive forms) and MATHCAD (nonrecursive only). If anyone would like a copy of either of these, please let me know. And finally, Ken Butler went at the problem from a different point of view. I have only just received his solution, but it looks interesting, and I post it as it approaches the problem from another perspective. I received many more replies besides these. Thanks to all who responded. You are the ones who make the internet such a valuable resource.
Regards, Rich Lay
***************************************

Original question: If the probability of an individual event occurring (heads on a coin, 3 on a die, an error in a received ALE word) is P, then within a sequence of S events (S coin flips/die rolls/ALE words received), what is the probability that *at least* Y of these events will occur *in a row*?

***************************************

Jim Landis's solution:

I have a solution that is a little heavy in computation, but it does solve the problem exactly, without double counting or miscounting any cases.
I don't know of any simple solution to that problem in "closed" notation, but it yields very easily to a recursive formula, i.e. a formula defined in terms of itself.
Define f(n, Y, S, p) as follows:

    f(0, Y, S, p) = 1
    f(n, Y, S, p) = 0 for all n > S >= 0
    f(n, Y, S, p) = p*f(n-1, Y, S-1, p) + (1-p)*f(Y, Y, S-1, p) otherwise
n represents the number of successes needed to complete the current "streak", and should initially be set to Y
Y represents the total number of successes needed in a row
S is the number of remaining trials
p is the probability of success on any trial
f() could be interpreted as "the probability that you will have a streak of n successes immediately, or a streak of Y successes later on". The two terms evaluate the remaining probability after each roll of the die. First term corresponds to success on the current roll, the second term to failure.
So the probability of an unbroken string of four 3's in 20 rolls of a fair die is f(4,4,20, 1/6) or approximately 0.011026 .
You can verify that Y=1, S=1 yields a probability p, as you would expect. You can also verify that Y=S yields p^^S.
This can be implemented in a simple computer program. However, because of the double recursion, the run time of the program is O( 2^^(S-Y) ). The formula is essentially called once for each of the 2 to the power of S outcomes. For S=20, that is more than a million times. Beyond S = 30 or so, that approach is so slow as to be impractical.
A nonrecursive program could be written based on an S+1 by Y+1 array of values, running in time O( S*Y ). The approach would be to create an array; initialize one column of the array to the value 1, based on the first rule; initialize one half of the array to 0, as per the second rule. Every remaining cell of the array can then be calculated from 2 other cells by the function definition.

***********************************************

Ken Butler's solution:

Anyway, assuming that independence and constant probability are OK here, my line of thinking is the following: let N be the total number of words sent when Y consecutive words are received incorrectly for the first time. For example, if Y=3, in the following sequence of words sent (C = correctly received, I = incorrect):
I C I I C C I I I
N = 9, since 9 words have been sent by the time the 3rd consecutive I appears. Relating this to your problem, if N > S, then the radios link up, whereas if N <= S, the link is disestablished. So prob(N > S) is the probability you want. (This takes care of the "at least" problem, because once Y consecutive I's show up, we stop looking.) You also see that the last four (= Y+1) words sent are a C followed by Y I's; this must be the way the process stops, because if we'd gotten any more consecutive I's, we would have stopped earlier, and if we'd gotten any fewer, we'd keep looking.
To show what happens with Y=2:
N cannot be 0 or 1;
N = 2 if we get I I, whose probability is P^2.
N = 3 if we get C I I, with prob. (1-P)*P^2.
N = 4 if we get I C I I or C C I I, i.e. the last 3 are C I I no matter what the first is; probability is (1-P)*P^2, as for N=3.
Then, a pattern:
N = 5 if we survive the first 2 words and then get C I I;
N = 6 if we survive the first 3 words and then get C I I;
. . .
N = m if we survive the first m-3 words and then get C I I,
so that prob(N = m) = prob(N > m-3) * (1-P) * P^2.
Overall, the probability is 0 for a bit, P^2 once, (1-P)*P^2 for a bit, and then the above relationship holds. This was for Y=2, but the same idea generalizes to other Y:
Prob(N = m) =
    0                                 if m < Y
    P^Y                               if m = Y
    (1-P) * P^Y * prob(N > m-Y-1)     if m > Y.
Because each probability depends on previous ones, you'll need to calculate these probabilities one at a time, starting from 0 (or Y) and continuing up to S. It's probably easiest to keep track of prob(N > m), since this is the quantity of interest anyway, noting that prob(N > m) = prob(N > m-1) - prob(N = m), and prob(N > 0) = 1 (since N is always at least Y).

*****************************************************************



