The Math Forum

Ask Dr. Math - Questions and Answers from our Archives
Associated Topics || Dr. Math Home || Search Dr. Math

Sequence of Letters

Date: 04/23/99 at 05:06:27
From: Robles Mellin
Subject: Theory of runs problem

Hi. I have tried without success to solve the following problem:

Suppose you have an alphabet of 10 letters and a sequence of 600
letters, what is the probability of finding the subsequence DDDD ?

For instance:


There are: 10^600 possible sequences. How can I find the probability
that subsequence DDDD appears in those 10^600 sequences?

Thank you.

U. Robles

Date: 04/23/99 at 14:36:03
From: Doctor Anthony
Subject: Re: Theory of runs problem

Without going through some very involved combinatorics we could 
approach the problem in a purely probabilistic way.

Imagine drawing letters one at a time, each letter having an equal 
probability of 1/10 at each draw. We require first the expected number 
of draws to the first occurrence of 4 consecutive D's.

Let a = expected number of draws to first D.

We must make 1 draw at least and we have probability 9/10 of returning 
to a, so

         a = 1 + 9/10 a

   (1/10)a = 1

         a = 10

Let E = expected number of draws to 2 consecutive D's.

Consider that we have just drawn a D and what happens on the next 
draw. We are dealing with the (a+1)th draw. With probability 9/10 this 
is not a D and we return to E.

So      E = (a+1).1 + (9/10)E     note we must have at least a+1 draws 
                                  so p = 1
  (1/10)E = a+1

        E = 10(a+1)

and now putting in the value a=10 we get  E = 10(11) = 110
The expected number of letters to 2 consecutive D's is 110.

The equation for 3 consecutive D's is:

  E = (110+1).1 + (9/10)E

  (1/10)E = 111   and E = 1110

The equation for 4 consecutive D's is:

  E = (1111).1 + (9/10)E

  (1/10)E = 1111 and  E = 11110

So the expected number of letters to 4 consecutive D's is 11110.

In 600 letters the expected number of occurrences of 4 D's is

   600/11110 =  0.0540054

Using the Poisson distribution with parameter  m = 0.0540054  we get

   P(0) =  e ^(-0.0540054)  =  0.947427

and probability of at least one occurrence of 4 D's is 1 - 0.947427

                            =  0.052573

- Doctor Anthony, The Math Forum   
Associated Topics:
High School Probability

Search the Dr. Math Library:

Find items containing (put spaces between keywords):
Click only once for faster results:

[ Choose "whole words" when searching for a word like age.]

all keywords, in any order at least one, that exact phrase
parts of words whole words

Submit your own question to Dr. Math

[Privacy Policy] [Terms of Use]

Math Forum Home || Math Library || Quick Reference || Math Forum Search

Ask Dr. MathTM
© 1994- The Math Forum at NCTM. All rights reserved.