Topic: Test constantness of normality of residuals from linear regression
Replies: 9   Last Post: Jan 12, 2013 7:01 PM

Ray Koopman

Re: Test constantness of normality of residuals from linear regression
Posted: Jan 10, 2013 12:48 AM

On Jan 9, 7:35 pm, Paul <paul.domas...@gmail.com> wrote:
> After much browsing of Wikipedia and the web, I used both normal
> probability plot and Anderson-Darling to test the normality of
> residuals from a simple linear regression (SLR) of 6 data points.
> Results were very good. However, SLR doesn't just assume that the
> residuals are normal. It assumes that the standard deviation of the
> PDF that gives rise to the residuals is constant along the horizontal
> axis. Is there a way to test for this if none of the data points have
> the same value for the independent variable? I want to be able to
> show that there are no gross curves or spreading/focusing of the
> scatter.
>
> In electrical engineering signal theory, the horizontal axis is time.
> Using Fourier Transform (FT), time-frequency domains can show trends.
> Intuitively, I would set up the data as a scatter graph of residuals
> plotted against the independent variable (which would be treated as
> time). Gross curves show up as low-frequency content. There should
> be none if residuals are truly iid. The spectrum should look like
> white noise. The usual way to get the power spectrum is the FT of the
> autocorrelation function, which itself should resemble an impulse at
> zero. This just shows independence of samples, not constant iid normal
> along the horizontal axis.
>
> As for spreading or narrowing of the scatter, I guess that can be
> modelled in time as a multiplication of a truly random signal by a
> linear (or exponential) attenuation function. The latter acts like a
> modulation envelope. Their power spectrums will then convolve in some
> weird way. I'm not sure if this is a fruitful direction for
> identifying trends in the residuals. It starts to get convoluted
> pretty quickly.
>
> Surely there must be a less klugy way from the world of statistics? I
> realize that my sample size will probably be too small for many
> conceptual approaches. For example, if I had a wealth of data points,
> I could segment the horizontal axis, then do a normality test on each
> segment. This would generate mu's and sigma's as well, which could
> then be compared across segments. So for the sake of conceptual
> gratification, I'm hoping for a more elegant test for the ideal case
> of many data points. If there is also a test for small sample sizes,
> so much the better (though I don't hold my breath).


If y|x = a + b*x + e, where the errors e are iid random variables with
zero mean and common variance s^2, and you do an ordinary least squares
fit of that model to (x1,y1), ..., (xn,yn), then the theoretical
variance of the residual for xi is
s^2 * {1 - 1/n - [(xi-m)^2 / sum{(xj-m)^2}]}, where m is the mean
of x1, ..., xn. (With s^2 = 1 this is just the factor in braces.)
In words, residuals whose x is far from the mean tend to be smaller
than those whose x is near the mean. (This is known as "leverage":
points far from the mean have more "leverage" on the regression line,
pulling it closer to them.) So even when the errors have constant
variance, the raw residuals do not. Note that normality is not
required.
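
As a rough illustration of the variance formula above, here is a minimal
Python sketch using NumPy. The function names and the sample x and y
values are made up for the example; they are not from the original
post. The first function returns the factor
1 - 1/n - (xi-m)^2 / sum{(xj-m)^2} for each xi, i.e. the residual
variance in units of the common error variance. The second divides each
residual by its estimated standard deviation (often called internal
studentization), which is one common way to put the residuals on a
comparable scale before looking for spreading or narrowing.

```python
import numpy as np

def residual_variance_factors(x):
    """Return 1 - 1/n - (xi - m)^2 / sum((xj - m)^2) for each xi.

    Multiply by the error variance to get the theoretical variance
    of each OLS residual. (Hypothetical helper name.)
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    m = x.mean()
    sxx = np.sum((x - m) ** 2)
    leverage = 1.0 / n + (x - m) ** 2 / sxx
    return 1.0 - leverage

def studentized_residuals(x, y):
    """Residuals of a straight-line OLS fit, each divided by its
    estimated standard deviation. (Hypothetical helper name.)
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)   # degree-1 least squares fit
    resid = y - (intercept + slope * x)
    n = x.size
    s2 = np.sum(resid ** 2) / (n - 2)        # estimate of the error variance
    return resid / np.sqrt(s2 * residual_variance_factors(x))

# Made-up data: 6 points, no repeated x values.
x = [1, 2, 3, 4, 5, 6]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]
factors = residual_variance_factors(x)
scaled = studentized_residuals(x, y)
print(factors)
print(scaled)
```

The factors are smallest at the ends of the x range and largest near
the mean, and they always sum to n - 2. With only 6 points the scaled
residuals are still a very small sample, so any visual check for
changing spread remains weak.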


