


Pseudo-R2 for logistic regression
Posted: Aug 19, 2013 2:48 PM


Is there anything wrong with using
R^2 = 1 - D(M) / D(M0)
as a rough goodness-of-fit measure for logistic regression models, where D is the deviance, defined as
D(M) = -2 * (log(L(M)) - log(L(M_saturated)))
?
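As a minimal sketch of the measure above, assuming ungrouped 0/1 responses (the function names and the fitted probabilities are made up for illustration and do not come from any library): in that setting the saturated model fits every observation exactly, so its log-likelihood is 0 and the deviance ratio gives the same number as McFadden's R^2. With grouped binomial data the saturated term is generally nonzero and the two can differ.

```python
import math

def bernoulli_loglik(y, p):
    """Log-likelihood of 0/1 responses y under fitted probabilities p."""
    total = 0.0
    for yi, pi in zip(y, p):
        # An observation contributes log(p) if y == 1 and log(1 - p) if y == 0.
        total += math.log(pi) if yi == 1 else math.log(1.0 - pi)
    return total

def deviance(y, p):
    """D(M) = -2 * (log L(M) - log L(M_saturated)) for ungrouped 0/1 data."""
    # Assumption: ungrouped data, so the saturated model sets p_i = y_i,
    # has likelihood 1, and therefore log-likelihood 0.
    loglik_saturated = 0.0
    return -2.0 * (bernoulli_loglik(y, p) - loglik_saturated)

def deviance_r2(y, p_model):
    """R^2 = 1 - D(M) / D(M0), with M0 the intercept-only model."""
    p_null = sum(y) / len(y)  # intercept-only fit: the overall event rate
    d_model = deviance(y, p_model)
    d_null = deviance(y, [p_null] * len(y))
    return 1.0 - d_model / d_null

# Hypothetical data and fitted probabilities, for illustration only.
y = [0, 0, 1, 1, 1, 0, 1, 0]
p_hat = [0.2, 0.3, 0.7, 0.8, 0.6, 0.4, 0.9, 0.1]

r2 = deviance_r2(y, p_hat)
p_null = sum(y) / len(y)
mcfadden = 1.0 - bernoulli_loglik(y, p_hat) / bernoulli_loglik(y, [p_null] * len(y))
print(r2, mcfadden)
```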
I'm sure this is very well-trodden territory, but I keep going in circles in online searches, and I've not found an answer in my usual reference, Harrell's Regression Modeling Strategies.
I've not found this precise form quoted anywhere. McFadden's R^2 uses the straight log(L(M)), without subtracting off the saturated model version, while Cox & Snell (and Nagelkerke's version) use L(M)^(2/N) rather than the log.
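For side-by-side comparison, the three named measures can be written in terms of the fitted and null log-likelihoods. This is only a sketch: the function names are made up, and the log-likelihood values in the usage lines are hypothetical numbers, not results from a real fit.

```python
import math

def mcfadden_r2(loglik_model, loglik_null):
    # McFadden: ratio of log-likelihoods, no saturated-model term.
    return 1.0 - loglik_model / loglik_null

def cox_snell_r2(loglik_model, loglik_null, n):
    # Cox & Snell: 1 - (L0 / LM)^(2/N), computed on the log scale.
    return 1.0 - math.exp((2.0 / n) * (loglik_null - loglik_model))

def nagelkerke_r2(loglik_model, loglik_null, n):
    # Nagelkerke: Cox & Snell divided by its maximum, 1 - L0^(2/N).
    max_cox_snell = 1.0 - math.exp((2.0 / n) * loglik_null)
    return cox_snell_r2(loglik_model, loglik_null, n) / max_cox_snell

# Hypothetical values: 8 observations, null model with event rate 0.5.
n = 8
ll_null = n * math.log(0.5)
ll_model = -2.392  # made-up fitted log-likelihood

mcf = mcfadden_r2(ll_model, ll_null)
cs = cox_snell_r2(ll_model, ll_null, n)
nk = nagelkerke_r2(ll_model, ll_null, n)
print(mcf, cs, nk)
```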
I'm just starting to read up on the various distinctions here. I've seen it claimed[1] that the Cox & Snell version reduces to the OLS R^2, but I can't see how that's the case, since
L_normal = Prod_i exp(-(y_i - yhat_i)^2)
and L^(2/N) gives something like the harmonic mean of exp(-(y_i - yhat_i)^2), which is very different from the log, which just gives the straight arithmetic sum, same as the OLS R^2.
Clearly, the normalizing coefficients cause some problems, but one nice thing about my measure is that it divides them out by subtracting the saturated model log-likelihood.
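One way to probe the claim in [1] numerically is to write the normal likelihood with its variance term included and to estimate that variance by maximum likelihood separately in each model (sigma^2 = RSS/N for the fitted model, TSS/N for the intercept-only model). Under that assumption the maximized log-likelihood is -(N/2) * (log(2*pi*sigma^2) + 1), so (L0/LM)^(2/N) works out to RSS/TSS. The sketch below checks this on made-up data; the variable and function names are illustrative only.

```python
import math

def gaussian_max_loglik(residuals):
    """Normal log-likelihood maximized over the error variance."""
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    sigma2 = rss / n  # maximum-likelihood estimate of the error variance
    return -0.5 * n * (math.log(2.0 * math.pi * sigma2) + 1.0)

# Made-up data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.8, 5.3, 5.9]
n = len(y)

# Closed-form simple linear regression (OLS).
x_bar = sum(x) / n
y_bar = sum(y) / n
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar

res_model = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
res_null = [yi - y_bar for yi in y]  # intercept-only model

rss = sum(r * r for r in res_model)
tss = sum(r * r for r in res_null)
r2_ols = 1.0 - rss / tss

# Cox & Snell: 1 - (L0 / LM)^(2/N), evaluated on the log scale.
ll_model = gaussian_max_loglik(res_model)
ll_null = gaussian_max_loglik(res_null)
r2_cox_snell = 1.0 - math.exp((2.0 / n) * (ll_null - ll_model))

print(r2_ols, r2_cox_snell)
```

If the variance were instead held fixed at the same value in both models, the exponent would involve RSS - TSS rather than their ratio, and the two numbers would no longer agree.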
Basically, I've been naively using the above measure, and now I'm wondering if I'm setting myself up for problems.
Thanks,
Johann - ignorant practitioner
[1] http://www.statisticalhorizons.com/r2logistic



