Paul
Posts:
26
Registered:
1/3/11


Re: Explanation for why linear regression is a poor fit
Posted:
Feb 4, 2013 6:28 PM


On Monday, February 4, 2013 3:55:36 PM UTC-5, em.de...@gmail.com wrote:
> I haven't taken stats in a few years and recently a lot of statistics have been thrown around my workplace, including the attached graph (and raw data). I realize that a low R^2 means that the linear regression is not a good fit,
First, as Dave notes, you have time series data here. Moreover, the spacing of the dates is irregular. If the regression is col/day vs. date, I hope whoever ran the regression used the actual dates and not an index (1, 2, 3, ...) for the predictor variable. (Also, as Dave mentions, there may be better tools than simple regression given that it's a time series.)
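To illustrate why the choice of predictor matters, here is a minimal sketch in plain Python. The dates and counts are made up (they are not your data), and `ols_fit` is just a helper name I chose. The counts decline by exactly 0.5 per day, so regressing on elapsed days recovers the trend, while regressing on the observation index does not.

```python
from datetime import date

def ols_fit(x, y):
    """Simple least-squares line; returns (slope, intercept, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared

# Hypothetical, irregularly spaced sampling dates (assumption, for illustration only).
dates = [date(2012, 1, 1), date(2012, 1, 2), date(2012, 1, 3),
         date(2012, 3, 1), date(2012, 6, 1), date(2012, 6, 2)]

# Predictor 1: actual elapsed time in days since the first sample.
days = [(d - dates[0]).days for d in dates]
# Predictor 2: a plain observation index 0, 1, 2, ...
index = list(range(len(dates)))

# Made-up response that declines by exactly 0.5 per day.
counts = [100 - 0.5 * d for d in days]

slope_days, _, r2_days = ols_fit(days, counts)
slope_index, _, r2_index = ols_fit(index, counts)

print(f"days as predictor:  slope = {slope_days:.3f} per day, R^2 = {r2_days:.3f}")
print(f"index as predictor: slope = {slope_index:.3f} per step, R^2 = {r2_index:.3f}")
```

With the index as predictor, the slope is "per observation" rather than per unit of time, and the long gaps between samples get squeezed into single steps, so both the estimated rate and the fit are distorted.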
Second, your data has high variance. A low R^2 does not necessarily signal a poor fit (in the sense of an incorrect model), although it may signal that the regression model does not have enough predictive power to do you much good. When the data is quite noisy, sometimes a low R^2 is the best you can do (and sometimes the model actually has some value).
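A small simulation makes the same point. This is only a sketch under assumptions I picked for illustration (a true slope of -1, noise standard deviation of 2, and 498 points to match the N quoted in the question); none of it comes from the actual data. The straight-line model is correct by construction, yet R^2 comes out small because the noise swamps the trend.

```python
import random

def ols_fit(x, y):
    """Simple least-squares line; returns (slope, intercept, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared

random.seed(1)        # fixed seed so the run is repeatable
n = 498               # same sample size as quoted in the question
true_slope = -1.0     # assumed true trend (illustrative)
noise_sd = 2.0        # assumed noise level (illustrative)

# Predictor scaled to the interval [0, 1]; response is line plus Gaussian noise.
x = [i / (n - 1) for i in range(n)]
y = [5.0 + true_slope * xi + random.gauss(0.0, noise_sd) for xi in x]

slope, intercept, r2 = ols_fit(x, y)
print(f"estimated slope = {slope:.2f}, R^2 = {r2:.3f}")
```

For these settings the variance explained by the trend is roughly 1/12 while the noise variance is 4, so the best achievable R^2 is only about 0.02, even though the linear model is exactly the right one.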
> but it produces a p-value 0.025.
If this is the p-value of the usual F-test, all it says is that your trend model fits better than assuming a constant mean. It does not say the trend model is correct (or that a better model cannot be found).
> I can't formulate a solid argument because I don't understand the material well enough. Am I incorrect in saying this is a poor fit? Even visually to me it looks like a poor fit. Additionally, he says things like: "FC Count at Samish River/Thomas Road: N = 498, r^2 = 0.01, p = 0.025, meaning it is significant at 97.5% confidence" I know you can't use p-values to describe stats like this.
Mixing "significant at" and "confidence" is IMHO sloppy use of terminology, but the underlying intent is not necessarily wrong.
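For what it's worth, the quoted numbers are at least internally consistent. In simple regression with one predictor, the F statistic comparing the trend model to a constant mean can be written as F = r^2 (N - 2) / (1 - r^2). The sketch below plugs in N = 498 and r^2 = 0.01 from the quote; the function names are mine, r^2 is rounded in the quote, and I approximate the p-value with a normal distribution (reasonable for 496 residual degrees of freedom) rather than the exact F distribution.

```python
from statistics import NormalDist

def f_stat_from_r2(r_squared, n):
    """F statistic for slope = 0 in simple regression (1 and n - 2 d.f.)."""
    return r_squared * (n - 2) / (1.0 - r_squared)

def approx_p_value(f_stat):
    """Two-sided p-value via a normal approximation.

    With one numerator d.f., F equals t squared, and for large residual
    d.f. the t distribution is close to the standard normal.
    """
    z = f_stat ** 0.5
    return 2.0 * (1.0 - NormalDist().cdf(z))

n = 498            # sample size quoted in the question
r_squared = 0.01   # rounded r^2 quoted in the question

f_stat = f_stat_from_r2(r_squared, n)
p = approx_p_value(f_stat)
print(f"F = {f_stat:.2f}, approximate p-value = {p:.3f}")
```

So a tiny r^2 and a small p-value can coexist when N is large: the test only asks whether the slope is distinguishable from zero, not how much of the variation the trend explains.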
> I need help explaining why this data isn't showing a significant declining trend with a linear regression (unless, of course, I am incorrect.)
It looks like a declining trend to me. Whether the rate is _practically_ significant is an open question. I would not be surprised if it proved to be statistically significant even with a more careful analysis.

