The Math Forum

Topic: Explanation for why linear regression is a poor fit
Replies: 8   Last Post: Feb 15, 2013 10:36 AM

Posts: 26
Registered: 1/3/11
Re: Explanation for why linear regression is a poor fit
Posted: Feb 4, 2013 6:28 PM

On Monday, February 4, 2013 3:55:36 PM UTC-5, wrote:

> I haven't taken stats in a few years, and recently a lot of statistics have been thrown around my workplace, including the attached graph (and raw data). I realize that a low R^2 means that the linear regression is not a good fit,

First, as Dave notes, you have time series data here. Moreover, the spacing of the dates is irregular. If the regression is col/day vs. date, I hope whoever ran the regression used the actual dates and not an index (1, 2, 3, ...) as the predictor variable, since an index treats every gap between samples as if it were the same length. (Also, as Dave mentions, there may be better tools than simple regression given that it's a time series.)
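
To make the dates-versus-index point concrete, here is a minimal sketch in plain Python. The sampling dates and the response values are made up for illustration (they are not the poster's data), and `ols` is a hypothetical helper written for this example. The response declines by exactly 0.5 units per day, so a regression on elapsed days should recover that slope, while a regression on the index should not.

```python
from datetime import date


def ols(x, y):
    """Return (intercept, slope, r_squared) for simple least squares."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r2 = sxy ** 2 / (sxx * syy)
    return intercept, slope, r2


# Hypothetical, irregularly spaced sampling dates (assumption, not real data).
dates = [date(2012, 1, 3), date(2012, 1, 10), date(2012, 3, 1),
         date(2012, 3, 5), date(2012, 8, 20), date(2012, 12, 30)]
days = [(d - dates[0]).days for d in dates]   # elapsed days since first sample
index = list(range(1, len(dates) + 1))        # 1, 2, 3, ...

# Made-up response that falls by exactly 0.5 units per day, with no noise.
y = [200.0 - 0.5 * t for t in days]

_, slope_days, r2_days = ols(days, y)
_, slope_index, r2_index = ols(index, y)

print(f"vs. elapsed days: slope = {slope_days:.3f} per day, R^2 = {r2_days:.3f}")
print(f"vs. index:        slope = {slope_index:.3f} per sample, R^2 = {r2_index:.3f}")
```

With the index as predictor, the slope is in units of "per sample" rather than "per day", and the fit is worse even though the underlying trend is perfectly linear in time.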

Second, your data has high variance. A low R^2 does not necessarily signal a poor fit (in the sense of an incorrect model), although it may signal that the regression model does not have enough predictive power to do you much good. When the data is quite noisy, sometimes a low R^2 is the best you can do (and sometimes the model actually has some value).
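
A small simulation can illustrate how a correct model can still give a low R^2. The sketch below assumes a true linear trend with slope -0.1 buried in Gaussian noise with standard deviation 30; all of these numbers are invented for illustration and have nothing to do with the poster's data. The fitted slope should land near the true one, yet R^2 stays small because the noise dominates the variance.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Assumed "true" model for the simulation: y = 50 - 0.1 * x + noise.
n = 5000
true_intercept, true_slope, noise_sd = 50.0, -0.1, 30.0

x = [random.uniform(0.0, 100.0) for _ in range(n)]
y = [true_intercept + true_slope * xi + random.gauss(0.0, noise_sd) for xi in x]

# Simple least-squares fit computed from sums of squares.
mx = sum(x) / n
my = sum(y) / n
sxx = sum((xi - mx) ** 2 for xi in x)
syy = sum((yi - my) ** 2 for yi in y)
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))

slope = sxy / sxx
intercept = my - slope * mx
r2 = sxy ** 2 / (sxx * syy)

print(f"fitted slope = {slope:.4f} (true slope = {true_slope})")
print(f"R^2 = {r2:.4f}")
```

Here the linear model is the right one by construction, so the low R^2 reflects noisy data rather than a wrong functional form.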

> but it produces a p-value 0.025.

If this is the p-value of the usual F-test, all it says is that your trend model fits better than assuming a constant mean. It does not say the trend model is correct (or that a better model cannot be found).

> I can't formulate a solid argument because I don't understand the material well enough. Am I incorrect in saying this is a poor fit? Even visually to me it looks like a poor fit. Additionally, he says things like: "FC Count at Samish River/Thomas Road: N = 498, r2 = 0.01, p = 0.025, meaning it is significant at 97.5% confidence" I know you can't use P-values to describe stats like this.

Mixing "significant at" and "confidence" is IMHO sloppy use of terminology, but the underlying intent is not necessarily wrong.
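
As a rough check, the quoted figures (N = 498, r2 = 0.01, p = 0.025) are at least consistent with one another. Assuming a simple regression with a single predictor, the F statistic is F = (N - 2) r2 / (1 - r2) on 1 and N - 2 degrees of freedom, and its square root is the t statistic for the slope. The sketch below uses only the standard library and replaces the t distribution with a normal approximation, which is an assumption that is reasonable here because N - 2 = 496 is large.

```python
import math

n = 498      # sample size quoted in the thread
r2 = 0.01    # r2 quoted in the thread

df_resid = n - 2                      # residual degrees of freedom
f_stat = r2 / (1.0 - r2) * df_resid   # F statistic on (1, n - 2) df
t_stat = math.sqrt(f_stat)            # |t| for the slope in simple regression

# Two-sided p-value under a normal approximation to the t distribution.
p_approx = math.erfc(t_stat / math.sqrt(2.0))

print(f"F(1, {df_resid}) = {f_stat:.2f}, |t| = {t_stat:.2f}, approx. p = {p_approx:.3f}")
```

The approximate p-value comes out close to the quoted 0.025, which shows how a large sample can make a trend explaining about 1% of the variance statistically significant.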

> I need help explaining why this data isn't showing a significant declining trend with a linear regression (unless, of course, I am incorrect).

It looks like a declining trend to me. Whether the rate is _practically_ significant is an open question. I would not be surprised if it proved to be statistically significant even with a more careful analysis.

© The Math Forum at NCTM 1994-2018. All Rights Reserved.