Two Random Variables, Each Correlated to a Third
Date: 04/29/2003 at 22:00:49
From: Huaguang
Subject: How two random variables, each correlated to a third, are correlated to each other

If X, Y, and Z are 3 random variables such that X and Y are 90% correlated, and Y and Z are 80% correlated, what is the minimum correlation that X and Z can have?
Date: 04/30/2003 at 07:22:53
From: Doctor Mitteldorf
Subject: Re: How two random variables, each correlated to a third, are correlated to each other

Dear Hua-guang,

If X is correlated with Y and Y is correlated with Z, then X and Z must be correlated with each other, provided the two correlations are strong enough (specifically, when (Rxy)^2 + (Ryz)^2 > 1). If the two separate correlation coefficients are 0.9 and 0.8 respectively, then the correlation of X and Z can be as much as

   (0.9)(0.8) + sqrt((1-0.9^2)(1-0.8^2))

or as little as

   (0.9)(0.8) - sqrt((1-0.9^2)(1-0.8^2)),

but it cannot be outside this range of 0.458 to 0.982. So the minimum you asked about is approximately 0.458.

Here's the outline of a proof. To begin, let's assume that X has a mean of zero and a variance of 1. You can always subtract its mean and then divide by its standard deviation to achieve this, and the correlation coefficients are unchanged. Do the same with Y and Z, so that each of them has mean 0 and variance 1.

Next, express Xi as a linear combination of two parts, one part perfectly correlated with Yi, the other part uncorrelated with Yi. So we write

   Xi = A Yi + X*i,

where X*i is, by assumption, uncorrelated with Yi. Multiplying by Yi and averaging, we have

   <XY> = A <Y^2> + <YX*>

By our assumption <YX*> = 0, and <Y^2> = 1. By definition <XY> is the correlation coefficient of X and Y (since their means are 0 and variances are 1). So we have identified A as equal to the correlation:

   A = Rxy

Now let's compute the variance of X, which we have already normalized to unity:

   1 = <X^2> = A^2 <Y^2> + 2A <YX*> + <X*^2>
   1 = (Rxy)^2 + 0 + <X*^2>

So we have discovered that the variance of X* is 1 - (Rxy)^2. Working with Y and Z instead of Y and X, we can derive a similar relation for the variance of Z*:

   <X*^2> = 1 - (Rxy)^2
   <Z*^2> = 1 - (Ryz)^2

Now we are ready to compute the correlation of X and Z, writing Xi as Xi = (Rxy) Yi + X*i and Zi as Zi = (Ryz) Yi + Z*i:

   Rxz = <XZ> = (Rxy)(Ryz) <Y^2> + (Rxy)<YZ*> + (Ryz)<YX*> + <X*Z*>

Remember that <Y^2> = 1 and the middle two terms are zero.

   Rxz = (Rxy)(Ryz) + <X*Z*>

(Remember that X* and Z* do have means of 0, but they do not have unit variance.) If X* and Z* happen to be perfectly correlated (R = 1), then <X*Z*> takes its greatest possible value, which is the product of the two individual standard deviations. If they are perfectly anti-correlated (R = -1), then <X*Z*> takes its smallest possible value, which is minus the same product.

   Rxz = (Rxy)(Ryz) +/- sqrt(<X*^2><Z*^2>)
   Rxz = (Rxy)(Ryz) +/- sqrt((1-(Rxy)^2)(1-(Ryz)^2))

(Here the +/- is intended to indicate the range of values Rxz can take, not that it has to assume one extreme or the other.)

- Doctor Mitteldorf, The Math Forum
  http://mathforum.org/dr.math/
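The range derived above can be checked numerically. The following is only a minimal sketch, not part of the original answer: the function names `correlation_bounds` and `pearson`, and the four-point data vectors `y` and `e`, are illustrative assumptions chosen so that `y` and `e` each have mean 0 and variance 1 and are exactly uncorrelated. It evaluates the formula for Rxy = 0.9 and Ryz = 0.8, then builds X and Z from Y plus an uncorrelated part (as in the proof) to show that both extremes can actually be reached.

```python
import math


def correlation_bounds(r_xy, r_yz):
    """Return (low, high), the range allowed for Rxz given Rxy and Ryz."""
    for r in (r_xy, r_yz):
        if not -1.0 <= r <= 1.0:
            raise ValueError("correlations must lie in [-1, 1]")
    center = r_xy * r_yz
    half_width = math.sqrt((1.0 - r_xy ** 2) * (1.0 - r_yz ** 2))
    return center - half_width, center + half_width


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    ss_u = sum((a - mean_u) ** 2 for a in u)
    ss_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(ss_u * ss_v)


r_xy, r_yz = 0.9, 0.8
low, high = correlation_bounds(r_xy, r_yz)
print(f"{low:.3f} to {high:.3f}")  # 0.458 to 0.982

# Illustrative data (an assumption): y and e have mean 0, variance 1,
# and <y e> = 0, so e can play the role of the uncorrelated part.
y = [1.0, 1.0, -1.0, -1.0]
e = [1.0, -1.0, 1.0, -1.0]

sx = math.sqrt(1.0 - r_xy ** 2)  # standard deviation of X*
sz = math.sqrt(1.0 - r_yz ** 2)  # standard deviation of Z*

# X = Rxy*Y + X*, with X* = sx*e.
x = [r_xy * yi + sx * ei for yi, ei in zip(y, e)]
# Z* perfectly correlated with X* gives the upper extreme;
# Z* perfectly anti-correlated with X* gives the lower extreme.
z_high = [r_yz * yi + sz * ei for yi, ei in zip(y, e)]
z_low = [r_yz * yi - sz * ei for yi, ei in zip(y, e)]

print(round(pearson(x, z_low), 3), round(pearson(x, z_high), 3))
```

Because the uncorrelated parts of X and Z are built from the same vector `e`, with the same or opposite sign, the two constructed cases sit at the ends of the range; any other choice of Z* would give a value of Rxz in between.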
© 1994- The Math Forum at NCTM. All rights reserved.