as a rough goodness-of-fit measure for logistic regression models, where D is the deviance, defined as
D(M) = -2 * (log(L(M)) - log(L(M_saturated)))
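For concreteness, here is how I've been computing D in practice (a minimal Python/NumPy sketch; the clipping constant is just my own guard against log(0)). For ungrouped binary data the saturated model reproduces each observation exactly, so log(L(M_saturated)) = 0 and D reduces to -2*log(L(M)):

```python
import numpy as np

def logistic_deviance(y, p_hat):
    """D(M) = -2 * (log L(M) - log L(M_saturated)).

    For ungrouped binary y in {0, 1}, the saturated model fits each
    observation exactly, so log L(M_saturated) = 0 and the deviance
    reduces to -2 * log L(M).
    """
    eps = 1e-12  # guard against log(0); my own choice, nothing canonical
    p_hat = np.clip(np.asarray(p_hat, dtype=float), eps, 1 - eps)
    log_lik = np.sum(y * np.log(p_hat) + (1 - y) * np.log(1 - p_hat))
    return -2.0 * log_lik
```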
I'm sure this is very well-trodden territory, but I keep going in circles in online searches, and I've not found an answer in my usual reference, Harrell's Regression Modeling Strategies.
I've not found this precise form quoted anywhere. McFadden's R^2 uses the raw log(L(M)), without subtracting off the saturated-model term, while Cox & Snell's R^2 (and Nagelkerke's adjusted version) uses L(M)^(2/N) rather than the log.
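In case it helps make the comparison concrete, here is a sketch of both alternatives as I understand them (Python/NumPy; the null model just predicts the overall event rate, and the function name is my own):

```python
import numpy as np

def pseudo_r2(y, p_hat):
    """McFadden and Cox & Snell pseudo-R^2 for a fitted binary model.

    p_hat: fitted probabilities from the model of interest.
    The null model predicts the overall event rate for every case.
    """
    eps = 1e-12
    p_hat = np.clip(np.asarray(p_hat, dtype=float), eps, 1 - eps)
    n = len(y)
    ll_model = np.sum(y * np.log(p_hat) + (1 - y) * np.log(1 - p_hat))
    p_null = np.clip(np.mean(y), eps, 1 - eps)
    ll_null = np.sum(y * np.log(p_null) + (1 - y) * np.log(1 - p_null))
    # McFadden: ratio of log-likelihoods
    mcfadden = 1.0 - ll_model / ll_null
    # Cox & Snell: 1 - (L_null / L_model)^(2/N), on the log scale
    cox_snell = 1.0 - np.exp((2.0 / n) * (ll_null - ll_model))
    return mcfadden, cox_snell
```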
I'm just starting to read up on the various distinctions here. I've seen it claimed that the Cox & Snell version reduces to the OLS R^2 in the linear-model case, but I can't see how that works, since
L_normal ∝ Prod_i exp(-(y_i - yhat_i)^2 / (2*sigma^2))
and L^(2/N) gives something like the geometric mean of the exp(-(y_i - yhat_i)^2) terms, which seems very different from taking the log, which gives the straight arithmetic sum of squared residuals, the same one that appears in the OLS R^2 (a quick numeric check is sketched below).
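Here is the numeric check I have in mind (Python/NumPy, simulated data; I've profiled sigma^2 out at its ML value RSS/N, which I gather is how the claimed reduction is usually set up, though that's my assumption):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
y = 1.5 * x + rng.normal(size=n)

# OLS fit and the usual R^2
X = np.column_stack([np.ones(n), x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
rss = np.sum((y - X @ beta) ** 2)
tss = np.sum((y - y.mean()) ** 2)
r2_ols = 1.0 - rss / tss

def gauss_loglik(rss, n):
    """Maximized Gaussian log-likelihood with sigma^2 at its ML value
    (RSS/n); the choice of ML sigma^2 is my assumption about the setup."""
    sigma2 = rss / n
    return -0.5 * n * (np.log(2 * np.pi * sigma2) + 1.0)

ll_model = gauss_loglik(rss, n)
ll_null = gauss_loglik(tss, n)  # intercept-only null model

r2_cs = 1.0 - np.exp((2.0 / n) * (ll_null - ll_model))
print(r2_ols, r2_cs)  # if the claimed reduction holds, these should match
```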
Clearly, the normalizing coefficients cause some problems, but one nice thing about my measure is that it cancels them out by subtracting the saturated-model log-likelihood.
Basically, I've been naively using the above measure, and now I'm wondering if I'm setting myself up for problems.