
Re: Discrepancy in estimating goodness of Poisson GLM
Posted: Jan 16, 2012 11:02 PM


On Tuesday, 17 January 2012 15:21:29 UTC+13, Rich Ulrich wrote:
> On Mon, 16 Jan 2012 12:44:20 -0800 (PST), Erogo <eogo...@gmail.com>
> wrote:
>
> >Thanks Rich, that makes sense.
>
> But you still seem to be missing some points....
No doubt I am and will! Please forgive me there.
> >Yes, pseudo R-squared statistics were amongst other options, but I haven't looked at those as of yet. I was steered away from them by this paper
>
> And you might as well stay away from them. I hope you
> did not think I was suggesting otherwise.
Right-oh. Two votes against so far!
> ...
>
> I tried to point out that the "total twaddle" conclusion was based
> on an inappropriate analogy. What those answers showed is
> that you have not overfit; there is variation left.
But doesn't the goodness of fit test, based on the difference in likelihood, take into account the number of variables used in the model (the degrees of freedom reduction)? Surely, if the model using those (best) explanatory variables is not warranted, it won't be worth adding more, even if variability remains. Maybe there are other explanatory variables, not available to me (or the model fitting process), that would be better?
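To make my question concrete, here is a minimal sketch of the likelihood-ratio test I have in mind, written in plain Python. The data and the helper names (`poisson_loglik`, `fit_poisson`) are made up for illustration and are not my real counts or any library's API. It compares an intercept-only Poisson model with a one-predictor model. The degrees of freedom (here 1) count only the terms that end up in the model, so the test does not know how many candidate variables were screened beforehand, which I take to be Rich's point.

```python
import math

def poisson_loglik(y, mu):
    """Poisson log-likelihood of counts y under fitted means mu."""
    return sum(yi * math.log(mi) - mi - math.lgamma(yi + 1)
               for yi, mi in zip(y, mu))

def fit_poisson(x, y, iters=100):
    """Fit log(mu) = b0 + b1*x by Newton-Raphson; returns (b0, b1)."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(iters):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Information matrix (negative Hessian), 2x2 and symmetric.
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system explicitly.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < 1e-12:
            break
    return b0, b1

# Made-up illustrative data (NOT the data discussed in this thread).
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [1, 2, 2, 4, 5, 9, 12, 20]

# Null model: intercept only, so every fitted mean is the sample mean.
mean_y = sum(y) / len(y)
ll_null = poisson_loglik(y, [mean_y] * len(y))

# Full model: intercept plus one predictor.
b0, b1 = fit_poisson(x, y)
mu_full = [math.exp(b0 + b1 * xi) for xi in x]
ll_full = poisson_loglik(y, mu_full)

# Likelihood-ratio statistic, referred to chi-square with df = 1.
lr_stat = 2.0 * (ll_full - ll_null)
df = 1
# For df = 1 the chi-square upper tail has a closed form via erfc.
p_value = math.erfc(math.sqrt(lr_stat / 2.0))
print("LR statistic:", lr_stat, "df:", df, "p-value:", p_value)
```

The closed-form p-value only holds for a single added parameter; with more added terms one would need the general chi-square tail.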
> And the
> "exceptionally good" conclusion is not really appropriate when
> you have prescreened variables, and used stepwise selection.
OK, can I downgrade "exceptionally good" to "tentatively accepted"?
> But you *do* deserve to place some confidence in extremely
> small p-values for a few univariate relations.
And can I then go forward with those few predictors from the univariate relations (given that they're not collinear, at least as far as I can tell with my limited sample size) and put them together in a multiple linear regression?
> > Can you point me to any further reading on more appropriate methods
> > of model selection than stepwise variable selection? I understood
> > that it was common practice to use the AIC as an indication of the
> > relative appropriateness of two models differing in the (number of)
> > explanatory variables used. This almost implies nested models, where
> > one is a stepwise modification of the other.
>
> Well, you have a good test when the models are nested, if the
> models were a priori. It is less good when you are talking about
> somewhere in the list of stepwise comparisons. AIC is especially
> for non-nested comparisons, and it adjusts for the number of
> predictors.
True. Comparing alternative predictors is not nested, is it? And am I hearing you right that AIC is actually _more_ suitable in the case where you are comparing non-nested alternatives?
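As I understand it, the non-nested case looks something like the sketch below: two Poisson models for the same counts, each with a different single predictor, compared by AIC = 2k - 2 log L. The data, the predictor names `x_a` and `x_b`, and the helper functions are made up for illustration. Neither model is a special case of the other, so a likelihood-ratio test between them does not apply, but the two AIC values can still be put side by side.

```python
import math

def max_poisson_loglik(x, y, iters=100):
    """Maximized log-likelihood of the model log(mu) = b0 + b1*x."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(iters):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector and 2x2 information matrix.
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton-Raphson update.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < 1e-12:
            break
    mu = [math.exp(b0 + b1 * xi) for xi in x]
    return sum(yi * math.log(mi) - mi - math.lgamma(yi + 1)
               for yi, mi in zip(y, mu))

def aic(loglik, k):
    """Akaike information criterion for k estimated parameters."""
    return 2.0 * k - 2.0 * loglik

# Made-up illustrative data (NOT the data discussed in this thread).
y = [1, 2, 2, 4, 5, 9, 12, 20]
x_a = [0, 1, 2, 3, 4, 5, 6, 7]   # hypothetical predictor A
x_b = [3, 1, 4, 1, 5, 9, 2, 6]   # hypothetical predictor B

# Each model estimates an intercept and one slope, so k = 2 for both.
aic_a = aic(max_poisson_loglik(x_a, y), k=2)
aic_b = aic(max_poisson_loglik(x_b, y), k=2)
print("AIC with predictor A:", aic_a)
print("AIC with predictor B:", aic_b)
print("Preferred by AIC:", "A" if aic_a < aic_b else "B")
```

With equal k, the penalty cancels and the comparison reduces to the difference in log-likelihood; the penalty only matters when the models differ in the number of predictors.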
> > I appreciate the caveats
> > around the expected R^2 achievable from even random explanators when
> > their number approaches that of the sample size, but is there any
> > better alternative to the numerically enthusiastic ecologist?
>
> Frank Harrell, whose notes I referred to about stepwise, has a
> nice book on Logistic Regression which includes other comments
> on model building.
>
> "Ecologist" reminds me -- I have previously found some nice
> information on various subjects at a website maintained by
> ecologists, etc., and Oklahoma State University. You might
> check and see if they have anything that relates particularly
> to your data.
>
> --
> Rich Ulrich
Thanks again Rich. I'll see if I can make myself a little more dangerous, with a little more knowledge! Off to read up on Harrell and the Oklahoma State University.
Cheers,
Eric

