Views expressed in these public forums are not endorsed by
Drexel University or The Math Forum.
Lurker
Posts: 44
Registered: 12/18/04
Re: Explanation for why linear regression is a poor fit
Posted: Feb 15, 2013 10:36 AM
I take this as asking why statements like "p = 0.025, meaning it is significant at 97.5% confidence" are not true, and how you can demonstrate this.
The p-value depends on the residuals being (at least approximately) Normally distributed. Looking at the regression in FC Counts Chart I get:

-------------
The regression equation is
MPN = 10177 - 0.240 Date

Predictor      Coef  SE Coef      T      P
Constant      10177     4324   2.35  0.019
Date        -0.2396   0.1065  -2.25  0.025

S = 1220.74   R-Sq = 1.0%   R-Sq(adj) = 0.8%
-----------

This agrees with Excel, but the residuals are very far from Normally distributed, so you can't trust the P values.
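To make the link between the columns of that output concrete, here is a rough sketch that recomputes T, an approximate P, and the R-Sq implied by T for the Date row. It assumes the N = 498 mentioned in the original question is the number of points in the regression, and it uses a Normal approximation to the t distribution (reasonable for this many points, but not exactly what the stats package does). The function name `slope_summary` is just made up for the illustration.

```python
from statistics import NormalDist

def slope_summary(coef, se, n):
    """Return (t, approximate two-sided p, implied R^2) for a simple regression slope."""
    t = coef / se
    # Normal approximation to the t distribution; fine when n - 2 is large.
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    # In simple (one-predictor) linear regression, R^2 = t^2 / (t^2 + n - 2).
    r2 = t * t / (t * t + n - 2)
    return t, p, r2

# Coef and SE Coef for Date are taken from the output above;
# n = 498 is assumed from the "N = 498" in the original question.
t, p, r2 = slope_summary(coef=-0.2396, se=0.1065, n=498)
print(f"T = {t:.2f}, approx P = {p:.3f}, implied R-Sq = {100 * r2:.1f}%")
```

The point of the last line of the function is that with about 500 points, a slope explaining only 1% of the variance still gives a T above 2. A small P and a tiny R-Sq are not in conflict, and that P is itself only meaningful if the Normality assumption holds.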
I'm guessing that this is microbiology data (faecal coliforms in water determined by MPN techniques?). In that domain it's common to take logs, usually base 10. Doing the regression on log(MPN) I get

------------------
log(MPN) = - 0.73 + 0.000070 Date

Predictor        Coef     SE Coef      T      P
Constant       -0.726       2.355  -0.31  0.758
Date       0.00007001  0.00005801   1.21  0.228

S = 0.664892   R-Sq = 0.3%   R-Sq(adj) = 0.1%
----------------------

Very nicely behaved residuals, so you can trust the P values, which show no evidence for a date dependence.
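If you want to see the effect of the log transform on residuals without the spreadsheet to hand, the sketch below may help. It uses made-up data, not the Samish River counts: 498 evenly spaced "dates" and counts whose log10 is Normal with no trend at all (the mean of 2.0 is arbitrary, and the spread of 0.66 is only loosely borrowed from the S value above). It fits a straight line by ordinary least squares both ways and reports the skewness of the residuals, which is about 0 for Normal data. Skewness is a crude stand-in here for looking at a histogram or Normal probability plot.

```python
import random

def fit_residuals(x, y):
    """Fit y = a + b*x by ordinary least squares and return the residuals."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

def skewness(values):
    """Sample skewness: close to 0 for symmetric (e.g. Normal) data."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

random.seed(0)  # fixed seed so the sketch is repeatable
n = 498
dates = [float(i) for i in range(n)]  # made-up, evenly spaced "dates"
# Made-up counts: log10(count) is Normal with no dependence on date.
log_counts = [random.gauss(2.0, 0.66) for _ in range(n)]
counts = [10 ** v for v in log_counts]

skew_raw = skewness(fit_residuals(dates, counts))
skew_log = skewness(fit_residuals(dates, log_counts))
print(f"skewness of residuals, raw counts:   {skew_raw:.2f}")
print(f"skewness of residuals, log10 counts: {skew_log:.2f}")
```

On data like this the raw-count residuals come out strongly right-skewed (a few very large counts dominate), while the log-scale residuals are close to symmetric, which is the pattern described above for the real data.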
HTH
KJ

On 04/02/2013 20:55, em.derenne@gmail.com wrote:
> Hi-
> I haven't taken stats in a few years and recently there have been a lot thrown around my work place, including the attached graph (and raw data). I realize that a low R2 means that the linear regression is not a good fit, but it produces a p-value of 0.025. I can't formulate a solid argument because I don't understand the material well enough. Am I incorrect in saying this is a poor fit? Even visually to me it looks like a poor fit. Additionally, he says things like: "FC Count at Samish River/Thomas Road: N = 498, r2 = 0.01, p = 0.025, meaning it is significant at 97.5% confidence" I know you can't use P-values to describe stats like this. I need help explaining why this data isn't showing a significant declining trend with a linear regression (unless of course I am incorrect.)
>
> Thanks for clarification and help.
>
> Data and Graph: http://dl.dropbox.com/u/18470470/Copy%20of%20Regression%20Correlation%20info.xlsx