Topic: Why doesn't multiple regression maximise R2
Replies: 4   Last Post: Apr 25, 2012 2:22 PM

Ray Koopman

Re: Why doesn't multiple regression maximise R2
Posted: Apr 25, 2012 2:22 PM

On Apr 25, 10:27 am, dc353 <dc...@hotmail.com> wrote:
> On Wednesday, April 25, 2012 10:25:28 AM UTC-4, dc353 wrote:
>> Attached is sample data that produces the following:
>>
>> We regress VOL on STRK_PCT and AVG_VOL and get the following equation:
>>
>> VOL = -.38446 + .004353 * STRK_PCT + 2.963829*AVG_VOL
>>
>> This produces an R2 of .940671
>>
>> We then modify the equation to:
>>
>> VOL = -.38 + .0044*STRK_PCT + 2.96*AVG_VOL and calculate the predicted values. The R2 is now .960361
>>
>> My recollection is that a regression minimizes the sum of the squared vertical distances from the observations to the line, while R2 is the explained variance divided by the total variance. Is there any reason why minimizing the former maximizes the latter, and if so, how do we explain the example in the spreadsheet?

>
>> DATA [deleted]
>
> How do you scale the estimate to a least square fit?


Suppose you start with a dependent variable y
and several predictors, x1,x2,... .

You solve for the least squares weights b0,b1,b2,...
and form an estimate y' = b0 + b1*x1 + b2*x2 + ... .

var[y']/var[y] = R^2[y,y'],

*because* the least squares weights were used.

Then suppose you pick arbitrary weights a0,a1,a2,...
and form a new estimate y" = a0 + a1*x1 + a2*x2 + ... .

var[y"]/var[y] != R^2[y,y"] (in general),

because the least squares weights were *not* used.
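
The two cases above can be checked numerically. What follows is only a
minimal sketch: the data are synthetic (the poster's spreadsheet data were
deleted from the thread), and the helper names cov, r2, least_squares and
predict are made up for this illustration. It fits two predictors by least
squares, then compares the variance ratio with the squared correlation for
the least squares estimate y' and for an estimate y" built from arbitrary
weights.

```python
import random

def mean(v):
    return sum(v) / len(v)

def cov(u, v):
    # Population covariance; cov(v, v) is the variance of v.
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)

def r2(u, v):
    # Squared Pearson correlation, written R^2[u,v] in the post.
    return cov(u, v) ** 2 / (cov(u, u) * cov(v, v))

def least_squares(y, x1, x2):
    # Solve the 2x2 normal equations for the slopes, then get the intercept.
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    return mean(y) - b1 * mean(x1) - b2 * mean(x2), b1, b2

def predict(w, x1, x2):
    return [w[0] + w[1] * p + w[2] * q for p, q in zip(x1, x2)]

random.seed(0)  # synthetic data (an assumption), not the poster's spreadsheet
n = 200
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
y = [1 + 2 * p - q + random.gauss(0, 1) for p, q in zip(x1, x2)]

y_ls = predict(least_squares(y, x1, x2), x1, x2)  # y'
y_arb = predict((0.0, 2.5, -0.5), x1, x2)         # y" from arbitrary weights

ratio_ls, r2_ls = cov(y_ls, y_ls) / cov(y, y), r2(y, y_ls)
ratio_arb, r2_arb = cov(y_arb, y_arb) / cov(y, y), r2(y, y_arb)
print(f"least squares: var ratio {ratio_ls:.4f}, R^2 {r2_ls:.4f}")
print(f"arbitrary:     var ratio {ratio_arb:.4f}, R^2 {r2_arb:.4f}")
```

The first printed line should show two matching numbers; the second should
not. That is one way a "modified" equation can appear to beat the
regression when R2 is computed as explained variance over total variance.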

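To scale y", regress y on it: get least squares weights c0,c1
and form a third estimate y"' = c0 + c1*y".

var[y"']/var[y] = R^2[y,y"'] = R^2[y,y"] <= R^2[y,y'],

with equality when y" is an exact linear function of y'
(e.g., when the slopes a1,a2,... are proportional to b1,b2,...).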

(Actually, the intercepts b0,a0,c0 are irrelevant, because they
affect neither the variance nor the correlation. They become relevant
only if you expect the estimates to have the same mean as y.)
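
The rescaling step and the remark about intercepts can also be sketched in
a few lines. Again this assumes synthetic data and illustrative names
(cov, r2, y2 for y", y3 for y"'), none of which come from the original
thread. It regresses y on y" to get c0,c1, forms y"', and then shifts y"
by a constant to see what changes.

```python
import random

def mean(v):
    return sum(v) / len(v)

def cov(u, v):
    # Population covariance; cov(v, v) is the variance of v.
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)

def r2(u, v):
    # Squared Pearson correlation, written R^2[u,v] in the post.
    return cov(u, v) ** 2 / (cov(u, u) * cov(v, v))

random.seed(1)  # synthetic data (an assumption)
n = 200
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
y = [1 + 2 * p - q + random.gauss(0, 1) for p, q in zip(x1, x2)]

# y" from arbitrary weights, with a0 = 0
y2 = [2.5 * p - 0.5 * q for p, q in zip(x1, x2)]

# Simple regression of y on y" gives the scaling weights c0, c1
c1 = cov(y, y2) / cov(y2, y2)
c0 = mean(y) - c1 * mean(y2)
y3 = [c0 + c1 * t for t in y2]  # y"'

ratio_raw = cov(y2, y2) / cov(y, y)     # var[y"]/var[y]
ratio_scaled = cov(y3, y3) / cov(y, y)  # var[y"']/var[y]
r2_raw, r2_scaled = r2(y, y2), r2(y, y3)

# Changing only the intercept of y" moves its mean and nothing else
y2_shift = [t + 10.0 for t in y2]
var_change = cov(y2_shift, y2_shift) - cov(y2, y2)
r2_change = r2(y, y2_shift) - r2_raw

print(f"raw:    var ratio {ratio_raw:.4f}, R^2 {r2_raw:.4f}")
print(f"scaled: var ratio {ratio_scaled:.4f}, R^2 {r2_scaled:.4f}")
print(f"intercept shift: var change {var_change:.2e}, R^2 change {r2_change:.2e}")
```

After scaling, the variance ratio and the squared correlation should agree,
and the squared correlation itself should be unchanged from the raw y".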


