```Date: Jan 31, 2013 10:29 AM
Author: David Jones
Subject: Re: R^2 for linearized regression

"Darek"  wrote in message news:9362f26d-9e21-46ca-9dd4-accb562606f5@l13g2000yqe.googlegroups.com...Hi all!I would like to ask about R^2 in linearized regression where Y valueis transformed e.g.:http://en.wikipedia.org/wiki/Nonlinear_regression#LinearizationIf we apply power function (Y=a*b^X) for regression in Excel or SPSSthe R^2 (sum of squares etc.) is calculated using linearized functioni.e.: ln(Y)=a+ln(X)I think that comparison of R^2 for the same dataset for variousregression functions (e.g.between linear and power function) where Yis transformed is not proper method of selection of best regressionmodel.I think that in the case described above if we would like to comparevarious functions of regression, R^2 should be calculated usingfunction Y=a*b^X not function after linearization ln(Y)=a+ln(X).Could you give your opinion on this matter?Thanks in advance.Darek======================================It is important to be clear about how the value of R^2 that you use is calculated when you use it. Just using values from individual fitting modules may well not be enough.See http://en.wikipedia.org/wiki/Coefficient_of_determinationYou should try calculating R^2 directly from the sets of observed and corresponding values predicted values, where(i) "observed" is the original observations and "predicted" is either the predictions from linear regression or the exponential  of the predictions from the regression model for the log-ed data (it is also possible to include a "bias adjusted" version of the latter)and(ii) "observed" is the log-ed original observations and "predicted" is either the predictions from linear regression on the log-ed data or the logarithm  of the predictions from the regression model for the original data.This gives at least 4 values to compare. 
You can also try introducing an additional linear regression step, for example where in (i) you could fit a linear model for the observed data based on the exponentiated predictions from the linear model for the log-ed observations.

If you have time you could construct a pair of scatter plots of observed versus predicted values in both original and transformed spaces.

But there is no definite generally applicable answer to your question, except that you should definitely have a comparison of R^2 values calculated for the same transformation of the observed data. From a theoretical point of view, if the usual model-checks for regression models suggest that the transformed model is better then you should be using the R^2 calculated for the log-ed data. But, if practical/real-world considerations suggest that the "importance" of errors of prediction is equal on the non-transformed scale, then R^2 calculated for the untransformed observations may be more closely aligned to what you are trying to use the predictions for.

David Jones
```
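The four-way comparison described in (i) and (ii) above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used by Excel or SPSS: the helper names (`r_squared`, `fit_line`, `four_r_squared`) are hypothetical, the log-scale model is assumed to be ln(Y) = c0 + c1*X (i.e. Y = a*b^X), no bias adjustment is applied when exponentiating, and the demo data are made up. R^2 is taken as 1 - SS_res/SS_tot, computed directly from observed and predicted values.

```python
import math


def r_squared(observed, predicted):
    """R^2 = 1 - SS_res / SS_tot, computed directly from two value lists."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def four_r_squared(x, y):
    """Return R^2 for both models on both scales; y must be strictly positive."""
    log_y = [math.log(v) for v in y]

    # Model A: linear regression on the original observations.
    a0, a1 = fit_line(x, y)
    pred_a = [a0 + a1 * xi for xi in x]

    # Model B (assumed form): linear regression on the log-ed observations.
    c0, c1 = fit_line(x, log_y)
    pred_b_log = [c0 + c1 * xi for xi in x]
    pred_b = [math.exp(p) for p in pred_b_log]  # plain back-transform, no bias adjustment

    result = {
        "original scale, linear model": r_squared(y, pred_a),
        "original scale, log model": r_squared(y, pred_b),
        "log scale, log model": r_squared(log_y, pred_b_log),
    }
    # The logarithm of Model A's predictions exists only if all of them are positive.
    if all(p > 0 for p in pred_a):
        result["log scale, linear model"] = r_squared(
            log_y, [math.log(p) for p in pred_a]
        )
    return result


if __name__ == "__main__":
    # Made-up illustrative data, not taken from the thread.
    x = [1, 2, 3, 4, 5, 6]
    y = [2.1, 3.2, 4.4, 7.0, 9.8, 15.5]
    for label, value in four_r_squared(x, y).items():
        print(f"{label}: {value:.4f}")
```

Only values computed against the same version of the observed data (both "original scale" entries, or both "log scale" entries) are comparable with each other. The R^2 reported by a fitting module for the log-ed regression corresponds to the "log scale, log model" entry here.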