The Math Forum

Topic: R^2 for linearized regression
Replies: 3   Last Post: Jan 31, 2013 5:18 PM

David Jones
Re: R^2 for linearized regression
Posted: Jan 31, 2013 10:29 AM

"Darek" wrote in message

Hi all!

I would like to ask about R^2 in linearized regression where the Y value
is transformed, e.g.:
If we fit an exponential function (Y=a*b^X) by regression in Excel or SPSS,
the R^2 (sum of squares etc.) is calculated using the linearized function,
i.e.: ln(Y)=ln(a)+X*ln(b)

I think that comparing R^2 for the same dataset across various
regression functions (e.g. between a linear and an exponential function)
where Y is transformed is not a proper method of selecting the best regression.
I think that in the case described above, if we would like to compare
various regression functions, R^2 should be calculated using the
function Y=a*b^X, not the linearized function ln(Y)=ln(a)+X*ln(b).

Could you give your opinion on this matter?

Thanks in advance.



It is important to be clear about how the value of R^2 that you use is
calculated when you use it. Just using values from individual fitting
modules may well not be enough.
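
For reference, one common way to compute R^2 directly from observed values
y_i and predicted values \hat{y}_i is written out here. The notation is
assumed for illustration and does not come from the thread; a given fitting
module may use a different formula.

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2},
\qquad
\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i
```

On the log scale the same formula applies with y_i replaced by ln(y_i) and
\hat{y}_i replaced by the predicted logs. This form equals the squared
correlation between observed and predicted values only when the predictions
come from a least-squares fit with an intercept on that same scale; for
back-transformed predictions the two can differ, and this form can even be
negative.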


You should try calculating R^2 directly from the sets of observed values and
corresponding predicted values, where
(i) "observed" is the original observations and "predicted" is either the
predictions from linear regression or the exponential of the predictions
from the regression model for the log-ed data (it is also possible to
include a "bias adjusted" version of the latter)
(ii) "observed" is the log-ed original observations and "predicted" is
either the predictions from linear regression on the log-ed data or the
logarithm of the predictions from the regression model for the original
data.

This gives at least 4 values to compare. You can also try introducing an
additional linear regression step, for example where in (i) you could fit a
linear model for the observed data based on the exponentiated predictions
from the linear model for the log-ed observations.
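
The comparisons in (i) and (ii), plus the extra regression step, could be
sketched in Python roughly as follows. The data are made up for illustration,
the helper names fit_line and r_squared are hypothetical, and R^2 is taken to
be 1 - SS_res/SS_tot, which is an assumption about the definition. The
"bias adjusted" variant is left out.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(observed, predicted):
    """R^2 = 1 - SS_res / SS_tot, computed directly from the two sequences."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical data with mild exponential growth (not from the thread).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [11.8, 12.9, 15.6, 17.1, 20.5, 22.8, 27.0, 30.2]
log_y = [math.log(v) for v in y]

# Model A: straight line fitted to the original observations.
a0, a1 = fit_line(x, y)
pred_lin = [a0 + a1 * xi for xi in x]

# Model B: straight line fitted to ln(y), i.e. y = exp(b0) * exp(b1) ** x.
b0, b1 = fit_line(x, log_y)
pred_log = [b0 + b1 * xi for xi in x]
pred_exp = [math.exp(p) for p in pred_log]

# Taking logs of Model A's predictions requires them all to be positive.
# That holds for this data set but is not guaranteed in general.
if min(pred_lin) <= 0:
    raise ValueError("linear predictions must be positive to take logs")
log_pred_lin = [math.log(p) for p in pred_lin]

# Four R^2 values: two models, each scored on two scales.
r2 = {
    "original scale, linear model": r_squared(y, pred_lin),
    "original scale, exp(log-model)": r_squared(y, pred_exp),
    "log scale, log-model": r_squared(log_y, pred_log),
    "log scale, ln(linear model)": r_squared(log_y, log_pred_lin),
}

# Extra step: regress the observations on the exponentiated predictions.
c0, c1 = fit_line(pred_exp, y)
pred_recal = [c0 + c1 * p for p in pred_exp]
r2_recal = r_squared(y, pred_recal)

for label, value in r2.items():
    print(f"{label}: {value:.4f}")
print(f"original scale, recalibrated exp(log-model): {r2_recal:.4f}")
```

Only values computed on the same scale are directly comparable: the two
"original scale" figures with each other, and the two "log scale" figures
with each other.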

If you have time you could construct a pair of scatter plots of observed
versus predicted values in both original and transformed spaces.

But there is no definite, generally applicable answer to your question,
except that you should definitely have a comparison of R^2 values calculated
for the same transformation of the observed data. From a theoretical point
of view, if the usual model-checks for regression models suggest that the
transformed model is better then you should be using the R^2 calculated for
the log-ed data. But, if practical/real-world considerations suggest that
the "importance" of errors of prediction is equal on the non-transformed
scale, then R^2 calculated for the untransformed observations may be more
closely aligned to what you are trying to use the predictions for.

David Jones


© The Math Forum at NCTM 1994-2018. All Rights Reserved.