Why Does Matrix Multiplication Work the Way It Does?
Date: 09/13/2006 at 22:07:14
From: Casey
Subject: Process of Matrix Multiplication

Can you please explain WHY the process of matrix multiplication involves multiplying and adding to get each entry? Adding, subtracting and scalar multiplying make sense, but multiplication is complicated and the rationale doesn't make sense to me. I understand the process. I just want to know why it works.

Thank you.
Date: 09/16/2006 at 09:53:49
From: Doctor Fenton
Subject: Re: Process of Matrix Multiplication

Hi Casey,

Thanks for writing to Dr. Math.

Suppose you have a system of linear equations (for simplicity, I will use two equations in two unknowns, but the principle applies to systems of all sizes).

   a11*x1 + a12*x2 = d1
   a21*x1 + a22*x2 = d2 .

(The indices are chosen to indicate the row and column of the coefficient, so a12 is the coefficient in the first equation of the second variable, x2.)

In solving this system by elimination, you can save a lot of writing by using position instead of writing all the variable names, plus, minus, and equals signs, and carrying out row operations on the augmented matrix

   [ a11 a12 : d1 ]
   [ a21 a22 : d2 ]

(I have inserted colons to emphasize that the numbers in the last column are different from the other columns, being the data values on the right side of the equations, while the other columns are coefficients of variables.) In this process, the coefficients become entries in the coefficient matrix.

Now, suppose we want to make a linear change of variables, so that we introduce new variables y1 and y2 for which

   x1 = b11*y1 + b12*y2
   x2 = b21*y1 + b22*y2 .

(Such changes occur if we rotate the coordinate axes, for example, where the y's denote the coordinates in the rotated system.)

If you substitute these formulas into the original system of equations, what will the coefficient matrix of the new linear system of equations in the y-variables be?

For example, the first equation

   a11*x1 + a12*x2 = d1

becomes

   a11*(b11*y1 + b12*y2) + a12*(b21*y1 + b22*y2) = d1 ,

or, rearranging,

   (a11*b11 + a12*b21)*y1 + (a11*b12 + a12*b22)*y2 = d1 ,

so the first row of the new coefficient matrix for the y-variables is

   [ (a11*b11 + a12*b21)   (a11*b12 + a12*b22) ]
   [         ...                   ...         ]

If we designate the coefficient matrix for the new system of equations in the y-variables to be the matrix C, where

   C = [ c11 c12 ]
       [ c21 c22 ] ,

then comparing the entries in the two matrices of coefficients, we see that

   c11 = a11*b11 + a12*b21   and   c12 = a11*b12 + a12*b22 .

After you have written out the full coefficient matrix for the y-system of equations, compare it with the coefficient matrix A of the original system and the coefficient matrix B of the variable transformation:

   A = [ a11 a12 ]       B = [ b11 b12 ]
       [ a21 a22 ] ,         [ b21 b22 ] .

You will see that the entry of C in row i and column j is obtained by multiplying the entries of row i of A by the corresponding entries of column j of B and adding the products. That is exactly where the "multiply and add" rule comes from.

This type of combination of two matrices is an important operation on matrices, and we call it "matrix multiplication", because we find that it has many of the properties we would expect from a multiplication (although it is not commutative).

If you have any questions, please write back and I will try to explain further.

- Doctor Fenton, The Math Forum
  http://mathforum.org/dr.math/
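The substitution argument in the answer can also be checked numerically. What follows is only a minimal Python sketch, not part of the original reply: the function names `substitute` and `apply_matrix` and the sample numbers are assumptions chosen for illustration. It builds C by collecting the coefficient of each y-variable, then confirms that feeding x = B*y into the original system gives the same left-hand sides as using C on y directly. Note that Python indices start at 0, so a11 is `A[0][0]`.

```python
def substitute(A, B):
    """Coefficient matrix of the y-system after substituting x = B*y into A*x = d.

    Hypothetical helper: the coefficient of y_j in equation i is
    a_i1*b_1j + a_i2*b_2j + ... (sum over the shared index k).
    """
    rows, inner, cols = len(A), len(B), len(B[0])
    C = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            C[i][j] = sum(A[i][k] * B[k][j] for k in range(inner))
    return C


def apply_matrix(M, v):
    """Left-hand sides of the system with coefficient matrix M at the values v."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]


# Sample values, chosen arbitrarily for illustration.
A = [[1, 2], [3, 4]]      # coefficients of the original x-system
B = [[5, 6], [7, 8]]      # change of variables x = B*y
C = substitute(A, B)      # coefficients of the new y-system

y = [2, -1]
x = apply_matrix(B, y)    # the x-values that correspond to this y

# Both routes give the same left-hand sides, so C really is the new matrix.
assert apply_matrix(A, x) == apply_matrix(C, y)

# Doing the substitutions in the other order gives a different matrix,
# which is the non-commutativity mentioned in the answer.
assert substitute(B, A) != C
```

With these sample values, the first row of C is 1*5 + 2*7 = 19 and 1*6 + 2*8 = 22, matching the formulas for c11 and c12.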
© 1994- The Math Forum at NCTM. All rights reserved.