Multiple Linear Regression

Date: 01/23/97 at 08:21:22
From: dulud
Subject: Multiple linear regression

Bonjour from Paris,

I am looking for the complete set of formulas for multiple linear regression. Can you help me?

Thanks

Date: 01/27/97 at 09:10:37
From: Doctor Mitteldorf
Subject: Re: multiple linear regression

Greetings from Philadelphia!

If you understand single-variable linear regression, then multiple regression is just the same thing with matrices and vectors where you had numbers before.

Here are the formulas, first for single variable:

Say you have a collection of points (x,y), and you want the best line through them. The line will be:

   y = ax + b

where

   a = (<xy>-<x><y>) / (<x^2>-<x>^2)

and

   b = <y> - a<x>

The correlation coefficient r is given by:

   r = (<xy>-<x><y>) / sqrt{ (<x^2>-<x>^2) * (<y^2>-<y>^2) }

In the above, the notation <xy> means "average value of xy": in other words, for each point, multiply x for that point times y for that point, add up all the products, and divide by the number of points. Similarly, <x^2> is the mean value of x^2.

You'll recognize the denominator of the expression for a as the variance of x. So you could rewrite the formulas as:

   a = (<xy>-<x><y>) / var(x)

   r = a * sqrt{ var(x) / var(y) }

Now for the multivariate version of the formulas, you must think of x as a vector, but y is still a scalar. y is a function of multiple variables which together are called x. I'll use capital letters for vectors and "." for the dot product of two vectors:

   A.X means A[1]*X[1] + A[2]*X[2] + ...

We're still looking for a linear relationship between x and y, and now it's of the form y = A.X + b. Since X is a vector of n numbers, we look for n coefficients of proportionality, and make scalar a into vector A.

In the formula for A, the numerator becomes:

   (<Xy>-<X><y>)

This is easy to interpret. X is a vector, y is a scalar. Every component of X is multiplied by the scalar y. But the denominator takes a little more thought.
What do we mean by:

   (<XX> - <X><X>)

This is a second-rank tensor, which looks like a square matrix. If X has n components, then <XX> has n^2 components. The (i,j) component of this object is made by averaging X[i]X[j] over all the points in your sample; that is, it is <X[i]X[j]>.

<X><X> is the matrix that you make by multiplying together all possible pairs of components of the mean vector <X>. The (i,j) component of <X><X> is given by <X[i]><X[j]>; in other words, separately average the X[i] components for all points and the X[j] components for all points, then just multiply those two together.

<XX> and <X><X> are both n-by-n matrices. Subtract one from the other to get the "denominator" matrix corresponding to var(x). Then you must "divide" this matrix into the numerator vector. The way to do this is to invert the matrix, then multiply.

Symbolically, you could write the steps this way:

   Let vector V = (<Xy>-<X><y>)
   Let matrix M = (<XX>-<X><X>)

   Then let vector A = Inv(M) * V

   Also, r^2 = (V . A) / var(y) = (V . (Inv(M) * V)) / var(y)

The inverse of the matrix M is another matrix. The product of that matrix with a vector is another vector. Note that r^2 is a scalar: you take the dot product of V with A and divide by var(y) = <y^2>-<y>^2. With a single variable this reduces to r^2 = a^2 * var(x) / var(y), which agrees with the formula for r given earlier.

Finally, b is just a scalar, and the formula for b is just as before, with A and X becoming vectors:

   b = <y> - A.<X>

I hope this helps. Don't hesitate to write again if any part is still not clear.

-Doctor Mitteldorf, The Math Forum
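The steps above (build V, build M, "divide" M into V, then get b) can be tried out numerically. What follows is only a minimal sketch in Python with NumPy, not part of the original answer. The function name fit_linear, its return layout, and the sample data are all made up for illustration. It also assumes M is invertible, meaning that no component of X is an exact linear combination of the others. The r^2 line uses the scalar form (V . A) / var(y), which reduces to the single-variable r^2 when n = 1.

```python
import numpy as np

def fit_linear(X, y):
    """Fit y = A.X + b from averaged products (hypothetical helper).

    X: array of shape (points, n), or (points,) for a single variable.
    y: array of shape (points,).
    Returns (A, b, r2).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:                 # single-variable case: treat x as n = 1
        X = X[:, None]
    npts = X.shape[0]

    mean_X = X.mean(axis=0)         # <X>, a vector of n averages
    mean_y = y.mean()               # <y>, a scalar

    # V = <Xy> - <X><y>: every component of X times y, averaged over points
    V = (X * y[:, None]).mean(axis=0) - mean_X * mean_y

    # M = <XX> - <X><X>: (i,j) entry is <X[i]X[j]> - <X[i]><X[j]>
    M = X.T @ X / npts - np.outer(mean_X, mean_X)

    # A = Inv(M) * V; solve() gives the same vector without forming Inv(M)
    A = np.linalg.solve(M, V)

    b = mean_y - A @ mean_X         # b = <y> - A.<X>

    var_y = (y ** 2).mean() - mean_y ** 2
    r2 = (V @ A) / var_y            # scalar r^2 = (V . A) / var(y)
    return A, b, r2

# Made-up sample: five points lying exactly on y = 2*X[1] - 3*X[2] + 5
X = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]], dtype=float)
y = 2 * X[:, 0] - 3 * X[:, 1] + 5
A, b, r2 = fit_linear(X, y)
print(A, b, r2)
```

Because the sample points were constructed to lie exactly on a plane, A should come out close to (2, -3), b close to 5, and r^2 close to 1, up to floating-point rounding. Calling np.linalg.solve rather than np.linalg.inv is a choice made for this sketch: it yields the same A as "invert, then multiply" but is usually the more numerically stable route.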
Ask Dr. Math™
© 1994- The Math Forum at NCTM. All rights reserved.
http://mathforum.org/dr.math/