
Topic: Test constantness of normality of residuals from linear regression
Replies: 9   Last Post: Jan 12, 2013 7:01 PM

 Herman Rubin Posts: 399 Registered: 2/4/10
Re: Test constantness of normality of residuals from linear regression
Posted: Jan 12, 2013 7:01 PM

On 2013-01-10, Michael Press <rubrum@pacbell.net> wrote:
> In article
> Ray Koopman <koopman@sfu.ca> wrote:

> [...]

>> It all depends on what you want. Look up the Gauss-Markov theorem.
>> To justify the usual OLS estimates of the regression coefficients,
>> the errors need only to be unbiased, uncorrelated, and homoscedastic,
>> but to justify all the usual p-values and confidence regions, the
>> errors must be iid normal.

>> However, that's considering only the theoretical justification.
>> In practice, what matters is not whether the assumptions are right
>> or wrong, but how wrong they are -- they're never exactly right.

>> Normality is probably the least important assumption. The most
>> important things to worry about are the general form of the model
>> and whether it includes all the relevant predictor variables. Then
>> you ask how correlated and/or heteroscedastic the errors might be.
>> Finally, you might wonder about shapes of the error distributions.
>> Minor departures from normality are inconsequential. Nothing in the
>> real world is exactly normal, and any test of normality will reject
>> if the sample size is big enough.
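
The last quoted point is easy to see numerically. A minimal sketch (not from the thread; assumes NumPy and SciPy are available) using a t-distribution with 5 degrees of freedom, which is bell-shaped but slightly heavier-tailed than normal:

```python
# Sketch: any test of normality will reject a "nearly normal" sample
# once the sample size is big enough. Here the data are t-distributed
# with 5 df -- bell-shaped, but not exactly normal.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

small = rng.standard_t(df=5, size=50)
large = rng.standard_t(df=5, size=100_000)

_, p_small = stats.normaltest(small)   # D'Agostino-Pearson K^2 test
_, p_large = stats.normaltest(large)

print(f"n = 50:     p = {p_small:.3f}")
print(f"n = 100000: p = {p_large:.3g}")  # essentially zero: rejects
```

The departure from normality is the same in both samples; only the power of the test to detect it changes with n.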

> Assuming that the errors are normally distributed is
> equivalent to assuming that the errors have mean zero
> and fixed variance (using the new word I heard today:
> homoscedastic) in that those assumptions least affect
> how close our analysis gets to discerning the
> parameters of interest.

This is totally wrong. Mean zero is important if a constant
term is being estimated; in general, what is important is
homoscedasticity and lack of covariance between the variables
upon which regression is being done and the "errors".
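
The point about covariance between the regressors and the errors can be sketched in a short simulation (my illustration, assuming NumPy): the errors below are exactly normal and homoscedastic, yet OLS is biased because they are correlated with the regressor.

```python
# Sketch: exactly normal errors do not rescue OLS when the errors are
# correlated with the regressor. Shared component u induces cov(x, e) = 1.
import numpy as np

rng = np.random.default_rng(1)
n = 5000
u = rng.normal(size=n)            # shared component
x = rng.normal(size=n) + u        # regressor, var(x) = 2
e = u + rng.normal(size=n)        # normal error, but cov(x, e) = 1
y = 3.0 * x + e                   # true slope is 3

slope = np.polyfit(x, y, 1)[0]
print(f"OLS slope: {slope:.3f}  (true value 3; plim is 3 + 1/2 = 3.5)")
```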

This can happen quite easily without normality. As has been
posted, the estimates behave just about as well without
normality as with, but the traditional tests, with nothing
to support them, may well come out differently.
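
Conversely, a quick Monte Carlo (my sketch, assuming NumPy) of the first half of that claim: with heavy-tailed Laplace errors, which are decidedly non-normal, the OLS slope estimate is still unbiased, since Gauss-Markov asks only for mean-zero, uncorrelated, homoscedastic errors.

```python
# Sketch: OLS slope estimates remain unbiased under non-normal
# (Laplace) errors that are mean-zero, independent, and homoscedastic.
import numpy as np

rng = np.random.default_rng(2)
n, reps, true_slope = 100, 2000, 3.0
slopes = np.empty(reps)
for i in range(reps):
    x = rng.uniform(0, 1, n)
    e = rng.laplace(0.0, 1.0, n)          # heavy-tailed, mean zero
    y = 2.0 + true_slope * x + e
    slopes[i] = np.polyfit(x, y, 1)[0]

print(f"mean slope over {reps} fits: {slopes.mean():.3f}  (true {true_slope})")
```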

> only if we are suppressing some knowledge of how the
> errors are distributed beyond the initial assumptions.
> If it somehow turns out that a different set of
> assumptions about the errors is better, for some value
> of better, then that is called scientific discovery,
> not bad assumptions. We should get to the point where
> we cannot wring any more meaning out of the data and
> are left with errors normally distributed around zero.

This is nonsense. If the observations are "good", meaning
the errors in the estimates will be small, the error distribution
will be of little consequence. It is only in poorly fitting
models that the distribution may be of consequence.

> It is not that I said anything more than you about the
> mathematics and statistics---only voiced my perspective
> on the process. If you see that I am in error, normal
> for me, I welcome hearing about it.

--
This address is for information only. I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Department of Statistics, Purdue University
hrubin@stat.purdue.edu Phone: (765)494-6054 FAX: (765)494-0558
