Date: 02/05/97 at 02:30:49
From: Rizzo
Subject: Correlation coefficient

Dear Dr. Math,

When working with the formula for the correlation coefficient (r), we divide the covariance by the product of the two standard deviations s(x) and s(y).

(1) Can you explain in simple terms what a covariance represents?

(2) Why, or how, does dividing the covariance by the product of the two standard deviations "bind" the correlation coefficient between -1 and 1?

Many thanks,
Mike
Date: 02/06/97 at 10:41:53
From: Doctor Mitteldorf
Subject: Re: Correlation coefficient

Dear Mike,

Covariance is the tendency of two random variables to move in tandem. It's important in survey-taking and sociology as well as in many branches of science. This is because, if two things tend to vary together, there's a good chance they may be causally linked.

For example, you could survey a thousand college grads and ask how much each of them earns at their job as well as what they got on their college boards. If there's some relationship between doing well on college boards and doing well in the job market, then when you calculate the statistic, you'll find a positive correlation coefficient between the two variables.

Another example: You could determine from hospital records the age at death for 1000 people and find out how many cigarettes each smoked. If you calculated the correlation coefficient, you'd find that there was a negative correlation between age at death and number of cigarettes smoked.

Both of these statistical relationships suggest a causal relationship: there's probably something about doing well on college boards that leads a person to get a higher-paying job. There's probably something about smoking cigarettes that tends to shorten life.

----------------------------------

Only if the relationship between the two variables is strict and exactly linear will the correlation coefficient be 1 or -1. In fact, you can use your formula for the correlation coefficient between x and y and let each y[i] be a nonzero constant times the corresponding x[i]:

   y[i] = ax[i]

Calculate the correlation coefficient, do a little algebra, and you'll find it's 1 or -1, depending on whether a is positive or negative.
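The "little algebra" can be sketched as follows. This is one way to write it out, not necessarily the route the formula in your textbook takes. It assumes the covariance and both standard deviations are normalized by the same factor (1/n here; using 1/(n-1) throughout gives the same result), that a is not zero, and that the x values are not all equal, so s(x) > 0.

```latex
\begin{align*}
y_i &= a x_i \quad\Longrightarrow\quad \bar{y} = a\bar{x} \\
\operatorname{cov}(x,y) &= \frac{1}{n}\sum_{i}(x_i-\bar{x})(a x_i - a\bar{x})
  = a \cdot \frac{1}{n}\sum_{i}(x_i-\bar{x})^2 = a\, s(x)^2 \\
s(y) &= \sqrt{\frac{1}{n}\sum_{i}(a x_i - a\bar{x})^2} = |a|\, s(x) \\
r &= \frac{\operatorname{cov}(x,y)}{s(x)\, s(y)}
  = \frac{a\, s(x)^2}{|a|\, s(x)^2} = \frac{a}{|a|} = \pm 1
\end{align*}
```

The size of a cancels completely; only its sign survives. That cancellation is what dividing by the two standard deviations buys you: the scale of each variable drops out of the result.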
So if the two variables are something like "the weight of each member of your class in kg and the corresponding weight in pounds," then you'll find a correlation coefficient of 1; if the relationship also has some "randomness" in it, the correlation coefficient will be strictly between -1 and 1.

(Curiously, if the relationship has no randomness but it's not linear, then the methods you're using will just signal "randomness". For example, if x is the radius of a circle and y is its area, then y is proportional to the square of x, not to x itself. This will show up as a correlation coefficient < 1 between x and y. Try it!)

-Doctor Mitteldorf, The Math Forum
 Check out our web site! http://mathforum.org/dr.math/
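The "Try it!" experiment can be run in a few lines of Python. This is a minimal sketch: the function name `pearson_r`, the sample weights, and the choice of radii 1 through 10 are all made up for illustration, and the 1/n normalization is an assumption (any consistent normalization gives the same r).

```python
import math


def pearson_r(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance: average product of the deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (s_x * s_y)


# Hypothetical class weights in kg, and the same weights in pounds (y = a*x, a > 0).
weights_kg = [52.0, 61.5, 70.0, 48.3, 85.2]
weights_lb = [2.20462 * w for w in weights_kg]
r_linear = pearson_r(weights_kg, weights_lb)

# The same data with a negative constant (y = a*x, a < 0).
r_negative = pearson_r(weights_kg, [-3.0 * w for w in weights_kg])

# Radius versus area of a circle: exact, but not linear.
radii = [float(k) for k in range(1, 11)]
areas = [math.pi * r ** 2 for r in radii]
r_circle = pearson_r(radii, areas)

print("kg vs. pounds:   ", r_linear)    # 1, up to floating-point rounding
print("negative factor: ", r_negative)  # -1, up to floating-point rounding
print("radius vs. area: ", r_circle)    # a little below 1
```

The circle case comes out close to 1 but not equal to it, even though the area is completely determined by the radius. The correlation coefficient only measures how well a straight line describes the relationship.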
© 1994- The Math Forum at NCTM. All rights reserved.