Big-O Notation in Matrix Multiplication
Date: 12/10/2000 at 12:53:41 From: Peter D'Amore Subject: Matrix multiplication done in big-O notation Use the algorithm here to show that matrix multiplication can be done in O(n^k) where k < 3. [ Z11 Z12 ] [ X11 X12 ] [ Y11 Y12 ] [ Z21 Z22 ] = [ X21 X22 ] * [ Y21 Y22 ] Simple algebraic manipulation shows that if we form the intermediate matrices: W1 = (X11 + X22) * (Y11 + Y22) W2 = (X21 + X22) * (Y11) W3 = (X11) * (Y12 + Y22) W4 = (X22) * (Y11 + Y21) W5 = (X11 + X12) * (Y22) W6 = (X21 - X11) * (Y11 + Y12) W7 = (X12 - X22) * (Y21 + Y22) Then Z11 = W1+W4-W5+W7 Z12 = W3 + W5 Z21 = W2 + W4 Z22 = W1 + W3 - W2 + W6 Please help me solve this question. I'm very familiar with matrix multiplication, but I've never worked with big-O notation, so any help would be greatly appreciated. Pete
Date: 12/12/2000 at 11:26:13 From: Doctor Ian Subject: Re: Matrix multiplication done in big-O notation Hi Peter, Big-O notation is just a way of being able to compare expected run times without getting bogged down in detail. Suppose you need to run an algorithm on a set of n inputs. Perhaps the actual number of 'operations' (however you choose to define that term) that would need to be carried out is something like: 2n^2 + 3n - 5 This is really only interesting when n gets large, in which case the only term that really matters is the n^2 term, which dwarfs the others. And in fact, the constant coefficient isn't really that important, either. So we just say that the algorithm runs in 'order n squared' time, or O(n^2). This notation, then, is used to compare algorithms in terms of speed: An O(n) algorithm is faster than an O(n^2) algorithm, but slower than an O(log n) algorithm. If you multiply two nxn matrices, A and B, [ a_1,1 a_1,2 ... a_1,n ] [ b_1,1 b_1,2 ... b_1,n ] [ a_2,1 a_2,2 ... a_2,n ] * [ b_2,1 b_2,2 ... b_2,n ] = ? . . . . . . . . . . . . [ a_n,1 a_n,2 ... a_n,n ] [ b_n,1 b_n,2 ... b_n,n ] you end up with another nxn matrix, C, where: c_i,j = (row_i of A) . (col_j of B) where '.' indicates a 'dot product', which is a sum of pairwise products. For example: (x, y, z) . (u, v, w) = x*u + y*v + z*w So, to compute the first element of C, c_1,1 = (a_1,1 a_1,2 ... a_1,n) . (b_1,1 b_2,1 ... b_n,1) = a_1,1*b_1,1 + a_1,2*b_2,1 + ... + a_1,n*b_n,1 So, to construct each element of the output matrix in the most straightforward possible way requires n multiplications. (Well, more precisely, it would be 2n operations -- n multiplications, and n additions. But as I said, when constructing Big-O expressions, we generally ignore constant coefficients.) Since there are n^2 elements in the output matrix, that would require n*n^2 = n^3 operations to compute the entire matrix - that is, the speed of the most straightforward algorithm is O(n^k), where k = 3. Your job is to show that it can be done in a more efficient way. You could do that by trying to invent a new algorithm by yourself, or you might just point to a paper like this one, Implementation of Strassen's Algorithm for Matrix Multiplication http://www.supercomp.org/sc96/proceedings/SC96PROC/JACOBSON/INDEX.HTM which discusses an algorithm that, at least for some matrices, reduces the exponent from 3 to 2.807, thus proving that it _can_ be done. I hope this helps. Write back if you'd like to talk about this some more, or if you have any other questions. - Doctor Ian, The Math Forum http://mathforum.org/dr.math/
Search the Dr. Math Library:
Ask Dr. MathTM
© 1994- The Math Forum at NCTM. All rights reserved.