
Topic: SUMMARY- *at least* Y *in a row* probability question
Replies: 0

 Richard Lay Posts: 9 Registered: 12/12/04
SUMMARY- *at least* Y *in a row* probability question
Posted: Aug 13, 1996 9:46 AM

Thanks to all who helped me with my probability question. I know much
more about the problem now. Special thanks to Tim Firman, who spent much
time on the problem and on answering my questions. I also thank Jim
Landis, whose solution appears first below. His is the one I implemented
both in C (recursive and non-recursive forms) and MATHCAD (non-recursive
only). If anyone would like a copy of either of these, please let me
know. And finally, Ken Butler went at the problem from a different point
of view. I have only just received his solution, but it looks
interesting, and I post it as it approaches the problem from another
perspective. I received many more replies besides these. Thanks to all
who responded. You are the ones who make the internet such a valuable
resource.

Regards,
Rich Lay

***************************************
Original question:
If the probability of an individual event occurring (heads on a coin, 3
on a die, an error in a received ALE word) is P, then within a sequence
of S events (S coin flips/die rolls/ALE words received), what is the
probability that *at least* Y of these events will occur *in a row* ?
***************************************
Jim Landis solution:
I have a solution that is a little heavy in computation, but it does
solve the problem exactly, without double counting or miscounting
any cases.

I don't know of any simple solution to that problem in "closed"
notation, but it yields very easily to a recursive formula, i.e. a
formula defined in terms of itself.

Define f(n, Y, S, p) as follows:
f(0, Y, S, p) = 1
f(n, Y, S, p) = 0 for all n > S >= 0
f(n, Y, S, p) = p*f(n-1, Y, S-1, p) + (1-p)*f(Y, Y, S-1, p)

n represents the number of successes needed to complete the current
"streak", and should initially be set to Y
Y represents the total number of successes needed in a row
S is the number of remaining trials
p is the probability of success on any trial

f() could be interpreted as "the probability that you will have a
streak of n successes immediately, or a streak of Y successes later
on". The two terms evaluate the remaining probability after each roll
of the die: the first term corresponds to success on the current roll,
the second term to failure.

So the probability of an unbroken string of four 3's in 20 rolls of a
fair die is f(4, 4, 20, 1/6), or approximately 0.011026.

You can verify that Y=1, S=1 yields probability p, as you would
expect. You can also verify that Y=S yields p^S.

This can be implemented in a simple computer program. However, because
of the double recursion, the run time of the program is O(2^(S-Y)).
The formula is essentially called once for each of the 2^S outcomes.
For S=20, that is more than a million times. Beyond S = 30 or so, that
approach is so slow as to be impractical.

A non-recursive program could be written based on an (S+1) by (Y+1)
array of values, running in time O(S*Y), since filling each cell takes
constant work. The approach would be to create the array, initialize
one column to the value 1 per the first rule, and initialize the cells
with n > S to 0 per the second rule. Every remaining cell of the array
can then be calculated from 2 other cells by the function definition.
***********************************************
Ken Butler solution:
Anyway, assuming that independence and constant probability are OK here,
my line of thinking is the following: let N be the total number of words
sent when Y consecutive words are received incorrectly for the first
time. For example, if Y=3, in the following sequence of words sent
(C = correctly received, I = incorrectly received):

I C I I C C I I I

N = 9, since 9 words have been sent by the time the 3rd consecutive I
appears. If N > S, no run of Y incorrect words occurs among the S words
sent, whereas if N <= S, the link is disestablished. So
prob(N <= S) = 1 - prob(N > S) is the probability you want. (This takes
care of the "at least" problem, because once Y consecutive I's show up,
we stop looking.) You also see that the last four (= Y+1) words sent
are a C followed by Y I's; this must be the way the process stops,
because if we'd gotten any more consecutive I's, we would have stopped
earlier, and if we'd gotten any fewer, we'd keep looking.

To show what happens with Y=2:

N cannot be 0 or 1;
N = 2 if we get I I, whose probability is P^2.
N = 3 if we get C I I, with prob. (1-P)*P^2.
N = 4 if we get I C I I or C C I I, i.e. the last 3 are C I I no matter
what the first is. Probability is (1-P)*P^2, as for N=3.

Then, a pattern:

N = 5 if we survive the first 2 words and then get C I I;
N = 6 if we survive the first 3 words and then get C I I;
...
N = m if we survive the first m-3 words and then get C I I,

so that prob(N = m) = prob(N > m-3) * (1-P) * P^2.

Overall, the probability is 0 for a bit, P^2 once, (1-P)*P^2 for a bit,
and then the above relationship holds. This was for Y=2, but the same
idea generalizes to other Y:

Prob(N = m) = 0 if m < Y,
            = P^Y if m = Y,
            = (1-P) * P^Y * prob(N > m-Y-1) if m > Y.

Because each probability depends on previous ones, you'll need to
calculate these probabilities one at a time, starting from 0 (or Y) and
continuing up to S. It's probably easiest to keep track of prob(N > m),
since the quantity of interest comes straight from prob(N > S), noting
that prob(N > m) = prob(N > m-1) - prob(N = m), and prob(N > 0) = 1.
*****************************************************************