
Ask Dr. Math - Questions and Answers from our Archives

Orthogonal Distance Regression Line

Date: 07/05/2005 at 14:19:16
From: Ameer
Subject: least square fitting

How can I calculate the standard deviation of the error in the 
independent and dependent variables when I want to fit a straight 
line with both variables subject to error?

I have (x,y) pairs of data, both subject to error, and I don't know 
how to calculate the standard deviation of the error in the x's and y's.
 
I am really confused about this topic.  I think it's different from
standard deviation of x's and y's and different from standard error of
x and y.



Date: 07/06/2005 at 11:35:50
From: Doctor George
Subject: Re: least square fitting

Hi Ameer,

Thanks for writing to Doctor Math.

Since you have error in both variables it may not be helpful to think
in terms of independent and dependent variables.  The error of 
interest in this problem is typically measured in terms of the 
distances from the points to the line.

How are you computing your best fit line?  It sounds like you need 
what is called the orthogonal distance regression line.

Here is a method for finding the orthogonal distance regression (ODR)
line in 2D.  Let the line pass through the point (h,k) with direction
vector (cos(theta),sin(theta)).

The sum of squared distances from the points (xi,yi) to the line will be

  f(h,k,theta) = SUM[-(xi-h)sin(theta)+(yi-k)cos(theta)]^2     (1)
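
As a quick check on equation (1), here is a minimal Python sketch that
evaluates the sum directly.  The function name and the sample points
are made up for illustration; they are not part of the original
question.

```python
import math

def sum_squared_distances(points, h, k, theta):
    """Equation (1): sum of squared perpendicular distances from the
    points to the line through (h, k) with direction
    (cos(theta), sin(theta))."""
    s, c = math.sin(theta), math.cos(theta)
    return sum((-(x - h) * s + (y - k) * c) ** 2 for x, y in points)

# Hypothetical data: three points on the line y = x and one point off it.
points = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (1.0, 2.0)]

# Line through the origin at 45 degrees, i.e. y = x.
f_45 = sum_squared_distances(points, 0.0, 0.0, math.pi / 4)
```

Only the point (1,2) lies off the line y = x, at a distance of 
1/sqrt(2), so f_45 comes out as 0.5 up to rounding.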

If we take the partial derivative with respect to h and set it equal
to zero we get this.

  SUM 2[-(xi-h)sin(theta) + (yi-k)cos(theta)]sin(theta) = 0 

For the nontrivial cases, where sin(theta) is not zero, we can divide
by 2n*sin(theta), with n the number of points, and simplify this to

  (-x0+h)sin(theta) + (y0-k)cos(theta) = 0

where (x0,y0) is the centroid of the data.  The point (h,k) = (x0,y0)
satisfies this equation, which means that the ODR line passes through 
the centroid of the data.
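
The centroid result can be checked numerically.  The sketch below, 
with made-up data and hypothetical function names, evaluates the
partial derivative of f with respect to h at (h,k) = (x0,y0) for a few
arbitrary values of theta.

```python
import math

def centroid(points):
    """Mean of the x's and mean of the y's."""
    n = len(points)
    return (sum(x for x, _ in points) / n,
            sum(y for _, y in points) / n)

def df_dh(points, h, k, theta):
    """Partial derivative of equation (1) with respect to h."""
    s, c = math.sin(theta), math.cos(theta)
    return sum(2.0 * (-(x - h) * s + (y - k) * c) * s for x, y in points)

# Hypothetical noisy data scattered around a straight line.
points = [(1.0, 2.0), (2.0, 2.9), (3.0, 4.2), (4.0, 4.8)]
x0, y0 = centroid(points)

# The derivative should vanish at the centroid whatever theta is.
grads = [df_dh(points, x0, y0, t) for t in (0.3, 1.0, 2.5)]
```

Each entry of grads should be zero to within rounding error, which
agrees with the claim that the line passes through the centroid
regardless of its direction.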

Now we need to find theta.  There are two main methods.  The first is 
to rewrite equation (1) as a Rayleigh quotient and minimize it using 
the Singular Value Decomposition.  This approach is closely related to
finding the ODR plane.  If you are interested in this method see if 
you can work out the details from

  Orthogonal Distance Regression Planes
    http://mathforum.org/library/drmath/view/63765.html .
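
For the 2D case, here is one way the first method might be sketched.
With (h,k) at the centroid, equation (1) equals n'Sn, where 
n = (-sin(theta),cos(theta)) is the unit normal and S is the 2x2
scatter matrix of the centered data.  Minimizing that Rayleigh quotient
means taking n as the eigenvector of S with the smallest eigenvalue.
Note that this sketch swaps the SVD for the closed-form eigenvalues of
a symmetric 2x2 matrix, which picks out the same direction; the
function name and data are made up for illustration.

```python
import math

def odr_direction_2x2(points):
    """Return ((dx, dy), f_min): a unit direction vector of the ODR line
    and the minimum of equation (1), from the 2x2 scatter matrix."""
    n = len(points)
    x0 = sum(x for x, _ in points) / n
    y0 = sum(y for _, y in points) / n
    # Scatter matrix S = [[a, b], [b, d]] of the centered data.
    a = sum((x - x0) ** 2 for x, _ in points)
    d = sum((y - y0) ** 2 for _, y in points)
    b = sum((x - x0) * (y - y0) for x, y in points)
    # Smallest eigenvalue of a symmetric 2x2 matrix in closed form.
    lam_min = 0.5 * (a + d) - math.hypot(0.5 * (a - d), b)
    # Either row of (S - lam_min*I) yields the eigenvector; keep the
    # candidate with the larger norm to avoid a zero vector.
    cand1 = (b, lam_min - a)
    cand2 = (lam_min - d, b)
    n1, n2 = max(cand1, cand2, key=lambda v: math.hypot(v[0], v[1]))
    norm = math.hypot(n1, n2)
    if norm == 0.0:
        # Isotropic scatter: every direction fits equally well.
        n1, n2, norm = 0.0, 1.0, 1.0
    n1, n2 = n1 / norm, n2 / norm
    # The line direction is perpendicular to the normal (n1, n2).
    return (n2, -n1), lam_min

# Hypothetical data lying exactly on y = 2x + 1.
points = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
(dx, dy), f_min = odr_direction_2x2(points)
```

For these points the direction has slope dy/dx = 2 and f_min is zero,
as expected for data that lie exactly on a line.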

The second method is to set the derivative of f with respect to theta
equal to zero, and then solve for theta.  If we do this we get

                        2 * SUM[(xi-x0)(yi-y0)]
        tan(2*theta) = --------------------------
                       SUM[(xi-x0)^2 - (yi-y0)^2]

See if you can verify my result.  Within any half turn there will be
two solutions for theta, 90 degrees apart: one for the minimum sum, 
and one for the maximum.  Choose the theta that gives the minimum.
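
Here is a minimal Python sketch of the second method, assuming the
line is placed through the centroid as derived earlier.  It solves the
tan(2*theta) equation with atan2, forms the second solution 90 degrees
away, and keeps whichever gives the smaller value of equation (1).
The function name and the sample data are made up for illustration.

```python
import math

def odr_line(points):
    """Return (x0, y0, theta, f_min) for the ODR line through the
    centroid (x0, y0) with direction (cos(theta), sin(theta))."""
    n = len(points)
    x0 = sum(x for x, _ in points) / n
    y0 = sum(y for _, y in points) / n
    sxx = sum((x - x0) ** 2 for x, _ in points)
    syy = sum((y - y0) ** 2 for _, y in points)
    sxy = sum((x - x0) * (y - y0) for x, y in points)

    def f(theta):
        # Equation (1) with (h, k) fixed at the centroid.
        s, c = math.sin(theta), math.cos(theta)
        return sum((-(x - x0) * s + (y - y0) * c) ** 2 for x, y in points)

    # One solution of tan(2*theta) = 2*sxy / (sxx - syy); atan2 copes
    # with sxx == syy, where the plain quotient would divide by zero.
    t1 = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    # The other solution is perpendicular to the first.
    t2 = t1 + math.pi / 2.0
    theta = min((t1, t2), key=f)
    return x0, y0, theta, f(theta)

# Hypothetical data lying exactly on y = 2x + 1.
points = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
x0, y0, theta, f_min = odr_line(points)
```

For these points the centroid is (1.5, 4), tan(theta) is 2, and f_min
is zero to within rounding.  With noisy data f_min is positive, and 
comparing f at both candidates makes the choice of the minimum
explicit.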

Write again if you need more help.

- Doctor George, The Math Forum
  http://mathforum.org/dr.math/ 
Associated Topics:
College Linear Algebra
College Statistics


Ask Dr. Math(TM)
© 1994-2013 The Math Forum
http://mathforum.org/dr.math/