Views expressed in these public forums are not endorsed by NCTM or The Math Forum.

Topic: Pseudo-R2 for logistic regression
Replies: 1   Last Post: Aug 20, 2013 7:00 PM

 Johann Hibschman Posts: 5 Registered: 4/16/07
Pseudo-R2 for logistic regression
Posted: Aug 19, 2013 2:48 PM

Is there anything wrong with using

R^2 = 1 - D(M) / D(M0)

as a rough goodness-of-fit measure for logistic regression models, where
D is the deviance, defined as

D(M) = -2 * (log(L(M)) - log(L(M_saturated)))

?

I'm sure this is very well-trodden territory, but I keep going in
circles in online searches, and I've not found an answer in my
usual reference, Harrell's Regression Modeling Strategies.

I've not found this precise form quoted anywhere. McFadden's R^2 uses
the straight log(L(M)), without subtracting off the saturated model
version, while Cox & Snell (and Nagelkerke's version) use L(M)^(2/N)
rather than the log.
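
To make the comparison concrete, here is a minimal sketch of the three measures, computed from 0/1 outcomes and fitted probabilities. The function names and the toy data are hypothetical, and the sketch assumes ungrouped binary data, where the saturated model reproduces each observation exactly, so its log-likelihood is 0. Under that assumption the deviance-based measure above comes out numerically identical to McFadden's R^2; with grouped (binomial) data the saturated log-likelihood is not zero and the two would differ.

```python
import numpy as np

def bernoulli_loglik(y, p):
    """Log-likelihood of 0/1 outcomes y under predicted probabilities p."""
    y = np.asarray(y, dtype=float)
    # Clip to avoid log(0) when a probability is exactly 0 or 1.
    p = np.clip(np.asarray(p, dtype=float), 1e-12, 1 - 1e-12)
    return float(np.sum(y * np.log(p) + (1 - y) * np.log(1 - p)))

def pseudo_r2(y, p):
    """Deviance-based, McFadden, and Cox & Snell pseudo-R^2 (hypothetical helper)."""
    y = np.asarray(y, dtype=float)
    n = y.size
    ll_model = bernoulli_loglik(y, p)
    # Null model: intercept only, so every prediction is the sample mean.
    ll_null = bernoulli_loglik(y, np.full(n, y.mean()))
    # Assumption: ungrouped 0/1 data, so the saturated model has L = 1.
    ll_sat = 0.0
    dev_model = -2.0 * (ll_model - ll_sat)
    dev_null = -2.0 * (ll_null - ll_sat)
    return {
        "deviance": 1.0 - dev_model / dev_null,
        "mcfadden": 1.0 - ll_model / ll_null,
        "cox_snell": 1.0 - float(np.exp(2.0 * (ll_null - ll_model) / n)),
    }

# Toy data: p stands in for fitted probabilities from some logistic model.
y = np.array([0, 0, 1, 1, 0, 1, 1, 0])
p = np.array([0.2, 0.3, 0.7, 0.8, 0.4, 0.6, 0.9, 0.1])
r2 = pseudo_r2(y, p)
print(r2)
```

The "deviance" and "mcfadden" entries agree here because ll_sat is 0; Cox & Snell gives a different number because it works with the likelihood ratio raised to the power 2/N rather than with a ratio of logs.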

I'm just starting to read up on the various distinctions here, but I've
seen it claimed[1] that the Cox & Snell version reduces to the OLS
R^2, but I can't see how that's the case, since

L_normal = Prod_i exp(-(y_i - yhat_i)^2 / (2 sigma^2))

(up to normalizing constants), and L^(2/N) gives something like the
geometric mean of exp(-(y_i - yhat_i)^2 / sigma^2), which is very
different from the log, which just gives the straight arithmetic sum,
same as the OLS R^2.
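
One way to check the claim in [1] numerically is sketched below. It rests on an assumption not stated above: the Gaussian likelihood is evaluated with sigma^2 replaced by its maximum-likelihood estimate, RSS/N, separately for the fitted and the null model. In that case the maximized log-likelihood depends on the data only through log(RSS), so (L0/LM)^(2/N) collapses to SSE/SST. The simulated data and the helper name are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)  # simulated data, for illustration only

# Ordinary least squares fit with an intercept.
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
sse = float(np.sum((y - X @ beta) ** 2))    # residual sum of squares, fitted model
sst = float(np.sum((y - y.mean()) ** 2))    # residual sum of squares, null model

def gaussian_max_loglik(rss, n):
    """Gaussian log-likelihood with sigma^2 set to its MLE, rss / n."""
    return -0.5 * n * (np.log(2.0 * np.pi * rss / n) + 1.0)

ll_model = gaussian_max_loglik(sse, n)
ll_null = gaussian_max_loglik(sst, n)

r2_ols = 1.0 - sse / sst
r2_cox_snell = 1.0 - float(np.exp(2.0 * (ll_null - ll_model) / n))
print(r2_ols, r2_cox_snell)
```

Under this assumption the two numbers agree to floating-point precision. If sigma^2 were instead held fixed at a common value in both models, the exponentials would not cancel this way and the agreement would not be expected.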

Clearly, the normalizing coefficients cause some problems, but one nice
thing about my measure is that it divides them out by subtracting the
saturated model log-likelihood.

Basically, I've been naively using the above measure, and now I'm
wondering if I'm setting myself up for problems.

Thanks,

Johann
ignorant practitioner

[1] http://www.statisticalhorizons.com/r2logistic

Date Subject Author
8/19/13 Johann Hibschman
8/20/13 Richard Ulrich
