The game Dead Dice is a great way to demonstrate exponential decay. You begin with a fairly large number of dice and roll them all, removing the sixes, then roll the remaining dice and remove the sixes again, and so on until you are left with just one die. If you start with 72 dice, the theoretical decay equation is Y=72*(5/6)^X, where Y is the expected number of dice after X throws. I wrote a short program to simulate this. (That said, 72 dice is not all that many, so I recommend it as a class activity, and I give credit to Martha Lowther of the Tatnall School for showing it to me.)
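For anyone who wants to try the simulation without a calculator, here is a minimal Python sketch of the game as described above. It is not the original program; the function names `dead_dice` and `theoretical` and the `seed` parameter are assumptions made for illustration.

```python
import random

def dead_dice(n_dice=72, seed=None):
    """Roll all remaining dice, remove the sixes, repeat until at most one die is left.

    Returns a list where counts[x] is the number of dice remaining after
    x throws (counts[0] is the starting number).
    """
    rng = random.Random(seed)  # a seed makes a run repeatable (illustrative choice)
    counts = [n_dice]
    while counts[-1] > 1:
        remaining = counts[-1]
        sixes = sum(1 for _ in range(remaining) if rng.randint(1, 6) == 6)
        counts.append(remaining - sixes)
    return counts

def theoretical(n_dice, x):
    """Expected number of dice after x throws: n_dice * (5/6)**x."""
    return n_dice * (5 / 6) ** x

if __name__ == "__main__":
    counts = dead_dice(72)
    for x, y in enumerate(counts):
        print(x, y, round(theoretical(72, x), 2))
```

Note that the last two dice can both come up six on the same throw, so a run may end at zero rather than one; any such zero would have to be dropped before taking logs.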
I generated the data and fit an exponential regression equation (with "correlation" r=-.9717). For my data I got an equation of Y=86.9*(.8086)^X. Hmm, that 86.9 looked big and .8086 was not terribly close to 5/6. So I looked at the residuals using both the theoretical equation Y=72*(5/6)^X and the exponential regression equation. For the regression equation the sum of the squares of the residuals was 490.88 and for the theoretical equation the sum of squares was 50.97 - a change by a factor of almost ten!
Of course the exponential regression is not fit to the actual data, but instead is a linear regression fit to (X,lnY) and then transformed back to exponential form. This technique does not minimize the sum of squares of the residuals, sum(Yi-A*B^Xi)^2, but instead minimizes the sum of the squares of the residuals of the log of the data, i.e. it finds the a and b that minimize sum(lnYi-a-b*Xi)^2. Then it calculates A=e^a and B=e^b for the equation Y=A*B^X.
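The log-transform procedure just described can be sketched in a few lines of Python. This is an illustration of the technique, not the calculator's actual ExpReg code; the names `exp_reg_via_logs` and `sum_sq` are assumptions.

```python
import math

def exp_reg_via_logs(xs, ys):
    """Least-squares line through (x, ln y), transformed back to y = A * B**x."""
    n = len(xs)
    logs = [math.log(y) for y in ys]  # requires every y > 0
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxl = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, logs))
    b = sxl / sxx             # slope of the fitted line
    a = mean_l - b * mean_x   # intercept of the fitted line
    return math.exp(a), math.exp(b)  # A = e^a, B = e^b

def sum_sq(xs, ys, A, B):
    """Sum of squared residuals of the untransformed data for y = A * B**x."""
    return sum((y - A * B ** x) ** 2 for x, y in zip(xs, ys))
```

Calling `sum_sq` with the A and B returned by `exp_reg_via_logs`, and again with A=72 and B=5/6, gives the kind of comparison made above.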
It isn't surprising then that this A and B do not minimize the sum of the squares of the residuals for the exponential equation--they weren't calculated that way. Explanations for doing it this way often include (1) it's easy and doable, even by "hand" and (2) minimizing the sum of squares directly leads to equations which are not easily solvable--insoluble in closed form. So I looked at the problem of minimizing the sum of squares (SS from now on) for the exponential equation.
To minimize SS=sum(Yi-A*B^Xi)^2 we need to take the partial derivatives of SS wrt A and B and set them = 0. If we do so, and simplify, remembering that in this context A and B are the variables and the Xi's and Yi's are constants, we get these two equations:
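Carrying out the differentiation on SS as defined above gives the following pair of conditions. This is a sketch in LaTeX worked from the definitions in this post; the index i and the name f are notational assumptions, and the simplification assumes A and B are nonzero.

```latex
\[
\frac{\partial SS}{\partial A}
  = -2\sum_i \left(Y_i - A B^{X_i}\right) B^{X_i} = 0
\quad\Longrightarrow\quad
\sum_i Y_i B^{X_i} = A \sum_i B^{2X_i}
\]
\[
\frac{\partial SS}{\partial B}
  = -2\sum_i \left(Y_i - A B^{X_i}\right) A X_i B^{X_i - 1} = 0
\quad\Longrightarrow\quad
\sum_i X_i Y_i B^{X_i} = A \sum_i X_i B^{2X_i}
\]
% Eliminating A between the two conditions leaves one equation in B alone:
\[
f(B) = \Bigl(\sum_i X_i Y_i B^{X_i}\Bigr)\Bigl(\sum_i B^{2X_i}\Bigr)
     - \Bigl(\sum_i Y_i B^{X_i}\Bigr)\Bigl(\sum_i X_i B^{2X_i}\Bigr) = 0
\]
```

The first condition is the same relation that gives A once B is known, A=sum(Yi*B^Xi)/sum(B^(2Xi)).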
This looks difficult but we can solve it numerically using Newton's method (actually an approximation to Newton's method using nDeriv instead of the actual derivative). If the Xi's are in L1 and the Yi's are in L2, the equation for which we must find a zero is:
(I used X in place of B but if you differentiate wrt B as nDeriv allows you will get the correct answer.)
I stored this in Y1 and used 5/6 as my initial seed value and iterated. Warning--if you try this make sure your batteries are charged. Each iteration can take over a minute of calculation time. This might not sound like a lot, but you'll be surprised.
After determining the value for B I calculated A=sum(Yi*B^Xi)/sum(B^(2Xi)). For the data I was looking at, A=71.32699599 and B=.8387668059. Using the equation Y=A*B^X with these values, I then calculated the residuals and found the SS. It was 45.218, smaller than either of the other two.
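The same procedure can be sketched in Python for readers without a TI-82/83. This is not the calculator program; it assumes the zero being sought is that of f(B) = sum(Xi*Yi*B^Xi)*sum(B^(2Xi)) - sum(Yi*B^Xi)*sum(Xi*B^(2Xi)), which is what results from eliminating A between the two partial-derivative conditions, and it uses a central difference in place of nDeriv. The names `f`, `fit_exponential`, `sum_sq` and the parameters `h`, `tol`, `max_iter` are hypothetical.

```python
def f(B, xs, ys):
    """Condition on B alone, after eliminating A from the two conditions."""
    s_yb = sum(y * B ** x for x, y in zip(xs, ys))
    s_xyb = sum(x * y * B ** x for x, y in zip(xs, ys))
    s_b2 = sum(B ** (2 * x) for x in xs)
    s_xb2 = sum(x * B ** (2 * x) for x in xs)
    return s_xyb * s_b2 - s_yb * s_xb2

def fit_exponential(xs, ys, B0=5 / 6, h=1e-6, tol=1e-12, max_iter=100):
    """Newton's method on f, with a central-difference slope (assumed step h)."""
    B = B0
    for _ in range(max_iter):
        slope = (f(B + h, xs, ys) - f(B - h, xs, ys)) / (2 * h)
        step = f(B, xs, ys) / slope
        B -= step
        if abs(step) < tol:
            break
    # With B fixed, the best A follows from the first condition.
    A = sum(y * B ** x for x, y in zip(xs, ys)) / sum(B ** (2 * x) for x in xs)
    return A, B

def sum_sq(xs, ys, A, B):
    """Sum of squared residuals for y = A * B**x."""
    return sum((y - A * B ** x) ** 2 for x, y in zip(xs, ys))
```

With the throw numbers in `xs` and the dice counts in `ys`, `fit_exponential(xs, ys)` starts from the seed 5/6 as above. Like any Newton iteration, it is only likely to converge when the seed is reasonably close to the answer.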
My final conclusion is that with the power of even a small computer like the TI-82/83 we can solve a problem considered impossible in the past. If our criterion for the "best fitting" equation is one that minimizes the sum of squares of the residuals, why accept a poor approximation? (Time isn't everything.)
The above technique can be applied for PwrReg--fitting an equation of the form Y=A*X^B. It can even be applied to equations of the form Y=A*B^X+C, as in Newton's Law of Cooling experiments. This post is too long already so I leave it to you to do the calculations.
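As a starting point for the power case Y=A*X^B, here is a sketch in LaTeX of the analogous conditions, worked from SS=sum(Yi-A*Xi^B)^2 in the same way as for the exponential. It assumes every Xi is positive so that ln Xi is defined; the index i and the name g are notational assumptions.

```latex
\[
\frac{\partial SS}{\partial A} = 0
\quad\Longrightarrow\quad
A = \frac{\sum_i Y_i X_i^{B}}{\sum_i X_i^{2B}}
\]
\[
\frac{\partial SS}{\partial B} = 0
\quad\Longrightarrow\quad
\sum_i Y_i X_i^{B} \ln X_i = A \sum_i X_i^{2B} \ln X_i
\]
% Eliminating A leaves one equation in B to solve numerically:
\[
g(B) = \Bigl(\sum_i Y_i X_i^{B} \ln X_i\Bigr)\Bigl(\sum_i X_i^{2B}\Bigr)
     - \Bigl(\sum_i Y_i X_i^{B}\Bigr)\Bigl(\sum_i X_i^{2B} \ln X_i\Bigr) = 0
\]
```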
Doug

P.S. Thanks to Al Coons for proofreading this.
-- Doug Kuhlmann (508)-749-4242 Phillips Academy firstname.lastname@example.org Andover, MA 01810