Orthogonal Distance Regression Line
Date: 07/05/2005 at 14:19:16
From: Ameer
Subject: least square fitting

How can I calculate the standard deviation of the error in the independent and dependent variables when I want to fit a straight line with both variables subject to error? I have (x,y) pairs of data with both subject to error, and I don't know how to calculate the standard deviation of the error in the x's and y's. I am really confused about this topic. I think it's different from the standard deviation of the x's and y's, and different from the standard error of x and y.
Date: 07/06/2005 at 11:35:50
From: Doctor George
Subject: Re: least square fitting

Hi Ameer,

Thanks for writing to Doctor Math.

Since you have error in both variables it may not be helpful to think in terms of independent and dependent variables. The error of interest in this problem is typically measured in terms of the distances from the points to the line. How are you computing your best fit line? It sounds like you need what is called the orthogonal distance regression line.

Here is a method for finding the orthogonal distance regression (ODR) line in 2D. Let the line pass through the point (h,k) with direction vector (cos(theta),sin(theta)). The sum of squared distances from the points (xi,yi) to the line will be

  f(h,k,theta) = SUM[-(xi-h)sin(theta) + (yi-k)cos(theta)]^2     (1)

If we take the partial derivative with respect to h and set it equal to zero we get this.

  SUM 2[-(xi-h)sin(theta) + (yi-k)cos(theta)]sin(theta) = 0

For the nontrivial cases we can simplify this to

  (-x0+h)sin(theta) + (y0-k)cos(theta) = 0

where (x0,y0) is the centroid of the data. The point (h,k) = (x0,y0) satisfies this equation, which means that the ODR line passes through the centroid of the data.

Now we need to find theta. There are two main methods. The first is to rewrite equation (1) as a Rayleigh quotient and minimize it using the Singular Value Decomposition. This approach is closely related to finding the ODR plane. If you are interested in this method see if you can work out the details from

  Orthogonal Distance Regression Planes
  http://mathforum.org/library/drmath/view/63765.html

The second method is to set the derivative of f with respect to theta equal to zero, and then solve for theta. If we do this we get

                  2 * SUM[(xi-x0)(yi-y0)]
  tan(2*theta) = --------------------------
                 SUM[(xi-x0)^2 - (yi-y0)^2]

See if you can verify my result. There will be two solutions for theta, one for the minimum sum, and one for the maximum. Choose theta for the minimum.
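For readers who want to check the tan(2*theta) result, here is one way the verification might go, written in LaTeX notation. The shorthand S_xx, S_yy, S_xy for the centered sums is introduced for this sketch only and does not appear in the reply itself.

```latex
% With (h,k) = (x_0,y_0), write u_i = x_i - x_0 and v_i = y_i - y_0.
\begin{align*}
S_{xx} &= \sum_i u_i^2, \qquad S_{yy} = \sum_i v_i^2, \qquad S_{xy} = \sum_i u_i v_i \\
f(\theta) &= \sum_i \left(-u_i \sin\theta + v_i \cos\theta\right)^2
  = S_{xx}\sin^2\theta - 2 S_{xy}\sin\theta\cos\theta + S_{yy}\cos^2\theta \\
\frac{df}{d\theta} &= (S_{xx} - S_{yy})\sin 2\theta - 2 S_{xy}\cos 2\theta = 0
  \;\Longrightarrow\; \tan 2\theta = \frac{2 S_{xy}}{S_{xx} - S_{yy}} \\
\frac{d^2 f}{d\theta^2} &= 2\,(S_{xx} - S_{yy})\cos 2\theta + 4 S_{xy}\sin 2\theta
\end{align*}
```

The two solutions for theta differ by 90 degrees. The one that makes the second derivative positive gives the minimum sum.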
Write again if you need more help.

- Doctor George, The Math Forum
  http://mathforum.org/dr.math/
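The second method in the reply (centroid plus the tan(2*theta) formula) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names odr_line and sum_sq_dist are hypothetical, and using atan2 to pick the minimizing angle directly is a choice made for this sketch, not something stated in the reply.

```python
import math

def odr_line(points):
    """Return (x0, y0, theta) for the orthogonal distance regression line.

    The line passes through the centroid (x0, y0) with direction
    (cos(theta), sin(theta)). Hypothetical helper for illustration.
    """
    n = len(points)
    x0 = sum(x for x, _ in points) / n
    y0 = sum(y for _, y in points) / n
    # Centered sums that appear in the tan(2*theta) formula.
    sxx = sum((x - x0) ** 2 for x, _ in points)
    syy = sum((y - y0) ** 2 for _, y in points)
    sxy = sum((x - x0) * (y - y0) for x, y in points)
    # Assumption of this sketch: atan2 places 2*theta in the quadrant
    # where the sum of squared distances is a minimum, so the maximizing
    # root (90 degrees away) is never returned.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return x0, y0, theta

def sum_sq_dist(points, h, k, theta):
    """Equation (1): sum of squared distances from the points to the line."""
    s, c = math.sin(theta), math.cos(theta)
    return sum((-(x - h) * s + (y - k) * c) ** 2 for x, y in points)

if __name__ == "__main__":
    pts = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.2), (3.0, 2.8)]
    x0, y0, theta = odr_line(pts)
    print("centroid:", (x0, y0))
    print("slope:", math.tan(theta))
    print("min sum:", sum_sq_dist(pts, x0, y0, theta))
    print("max sum:", sum_sq_dist(pts, x0, y0, theta + math.pi / 2))
```

Comparing the sum at theta and at theta + 90 degrees, as the last two print lines do, is a quick way to confirm that the minimizing solution was chosen. Because the line is described by an angle and not a slope, the sketch also handles vertical lines.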
© 1994- The Math Forum at NCTM. All rights reserved.