Date: Feb 16, 2013 1:18 PM Author: RGVickson@shaw.ca Subject: Re: Variance of the recursive union of events On Friday, February 15, 2013 9:06:38 PM UTC-8, Paul wrote:

> On Feb 15, 10:50 pm, quasi <qu...@null.set> wrote:

>

> >Paul wrote:

>

> >>Paul wrote:

>

> >>>

>

> >>> Using a simplification of the notation in the paper,

>

> >>> consider variance of the recursive relationship:

>

> >>>

>

> >>> 0) c(n) = u(n) + c(n-1) - c(n-1) u(n)

>

> >>>

>

> >>> for n=1,2,... and c(0)=0. All c(n) and u(n) values represent

>

> >>> probabilities i.e. lie with [0,1]. Furthermore, in the above

>

> >>> expression (0), u(n) and c(n-1) are independent.

>

> >>

>

> >> Actually, consider:

>

> >>

>

> >> U(n) = event to which probability u(n) is assigned

>

> >> C(n-1) = event to which probability c(n-1) is assigned

>

> >> C(n) = union[ U(n) , C(n-1) ]

>

> >>

>

> >> It is the *events* U(n) and C(n-1) that are independent, not

>

> >> the probabilities u(n) and c(n-1).

>

> >>

>

> >>> In evaluating the variance of (0), the indices are rather

>

> >>> meaningless, as we are completely focused on the right hand

>

> >>> side of the equation. I only include them in case a reader

>

> >>> has access to the paper. The variance of (0) is presented as:

>

> >>>

>

> >>> 1) var c(n) = var u(n) + var C(n-1) + var[ c(n-1) u(n) ]

>

> >>> - 2 cov[ u(n) , c(n-1) ]

>

> >>> - 2 cov[ c(n-1) , c(n-1) u(n) ]

>

> >>>

>

> >>> According to the above wikipedia page, however, it should be:

>

> >>>

>

> >>> 2) var c(n) = var u(n) + var C(n-1) + var[ c(n-1) u(n) ]

>

> >>> - 2 cov[ u(n) , c(n-1) ]

>

> >>> - 2 cov[ u(n) , c(n-1) u(n) ]

>

> >>> - 2 cov[ c(n-1) , c(n-1) u(n) ]

>

> >>>

>

> >>> Since u(n) and c(n-1) are independent, their covariance

>

> >>> disappears, so (2) becomes:

>

> >>>

>

> >>> 3) var c(n) = var u(n) + var C(n-1) + var[ c(n-1) u(n) ]

>

> >>> - 2 cov[ u(n) , c(n-1) u(n) ]

>

> >>> - 2 cov[ c(n-1) , c(n-1) u(n) ]

>

> >>>

>

> >>> This still differs from (1). It is plausible that (1) is a

>

> >>> typo,though not all that likely.

>

> >

>

> > Actually, it's very likely.

>

> >

>

> > I think you should _assume_ it's a typo, correct it, and see if

>

> > the corrected version is consistent with the rest of the paper.

>

>

>

> The problem is that this is only a drive-by perusal of the paper,

>

> since it is plumbing far deeper than I can rationalize for the results

>

> that I'll be using (which occur in a paper that is 2 citations removed

>

> from this one). I'd love to be able to take the time and become

>

> familiar with the weird and wonderful PDFs and to program the Monte

>

> Carlo simulation that follows the above steps, but it's just not in

>

> the cards right now. I was hoping that key derivation steps would

>

> make sense, and I could mentally give it the green label of "Yup,

>

> seems quite reasonable, onward ho". Even that's going to take some

>

> time considering the new territory (for me) that it wanders into, so I

>

> was hoping to get a confirmation on the above logic thus far.

The sure-fire way to compute variance of a random variable Y is as VarY = E(Y^2) - (EY)^2. In your case, look at Y = X + U - U*X (using Y instead of c(n), U instead of u(n) and X instead of u(n-1)).

You can compute E(Y^2) - (EY)^2 using the following (where we use VU, MU, VX, MX as Var(U), EU, Var(X) and EX, respectively) and apply the following:

E(X^2) = VX + MX^2, E(U^2) = VU + MU^2, E(U^2*X^2) = E(U^2)* E(X^2),

E(X*U) = MX*MU, E(X*U^2) = MX*E(U^2), E(X*U^2) = MX*E(U^2). Finally, you will end up with

Var(Y) = VX + VU + VX*VU + VX*MU^2 + MX^2 * VU - 2*MX*VU - 2*VX*MU.

One can also apply the results that Cov(A,B) = E(A*B) - EA * EB to the paper's expression and to yours, and compare the results with Var(Y) above. Yours works and the paper's fails.