|


Big-O Notation in Matrix Multiplication
Date: 12/10/2000 at 12:53:41
From: Peter D'Amore
Subject: Matrix multiplication done in big-O notation
Use the algorithm here to show that matrix multiplication can be done
in O(n^k) where k < 3.
[ Z11 Z12 ] [ X11 X12 ] [ Y11 Y12 ]
[ Z21 Z22 ] = [ X21 X22 ] * [ Y21 Y22 ]
Simple algebraic manipulation shows that if we form the intermediate
matrices:
W1 = (X11 + X22) * (Y11 + Y22)
W2 = (X21 + X22) * (Y11)
W3 = (X11) * (Y12 + Y22)
W4 = (X22) * (Y11 + Y21)
W5 = (X11 + X12) * (Y22)
W6 = (X21 - X11) * (Y11 + Y12)
W7 = (X12 - X22) * (Y21 + Y22)
Then
Z11 = W1+W4-W5+W7
Z12 = W3 + W5
Z21 = W2 + W4
Z22 = W1 + W3 - W2 + W6
Please help me solve this question. I'm very familiar with matrix
multiplication, but I've never worked with big-O notation, so any help
would be greatly appreciated.
Pete
Date: 12/12/2000 at 11:26:13
From: Doctor Ian
Subject: Re: Matrix multiplication done in big-O notation
Hi Peter,
Big-O notation is just a way of being able to compare expected run
times without getting bogged down in detail.
Suppose you need to run an algorithm on a set of n inputs. Perhaps the
actual number of 'operations' (however you choose to define that term)
that would need to be carried out is something like:
2n^2 + 3n - 5
This is really only interesting when n gets large, in which case the
only term that really matters is the n^2 term, which dwarfs the
others. And in fact, the constant coefficient isn't really that
important, either. So we just say that the algorithm runs in 'order n
squared' time, or O(n^2).
This notation, then, is used to compare algorithms in terms of speed:
An O(n) algorithm is faster than an O(n^2) algorithm, but slower than
an O(log n) algorithm.
If you multiply two nxn matrices, A and B,
[ a_1,1 a_1,2 ... a_1,n ] [ b_1,1 b_1,2 ... b_1,n ]
[ a_2,1 a_2,2 ... a_2,n ] * [ b_2,1 b_2,2 ... b_2,n ] = ?
. . . . . .
. . . . . .
[ a_n,1 a_n,2 ... a_n,n ] [ b_n,1 b_n,2 ... b_n,n ]
you end up with another nxn matrix, C, where:
c_i,j = (row_i of A) . (col_j of B)
where '.' indicates a 'dot product', which is a sum of pairwise
products. For example:
(x, y, z) . (u, v, w) = x*u + y*v + z*w
So, to compute the first element of C,
c_1,1 = (a_1,1 a_1,2 ... a_1,n) . (b_1,1 b_2,1 ... b_n,1)
= a_1,1*b_1,1 + a_1,2*b_2,1 + ... + a_1,n*b_n,1
So, to construct each element of the output matrix in the most
straightforward possible way requires n multiplications. (Well, more
precisely, it would be 2n operations -- n multiplications, and n
additions. But as I said, when constructing Big-O expressions, we
generally ignore constant coefficients.)
Since there are n^2 elements in the output matrix, that would require
n*n^2 = n^3 operations to compute the entire matrix - that is, the
speed of the most straightforward algorithm is O(n^k), where k = 3.
Your job is to show that it can be done in a more efficient way. You
could do that by trying to invent a new algorithm by yourself, or you
might just point to a paper like this one,
Implementation of Strassen's Algorithm for Matrix Multiplication
http://www.supercomp.org/sc96/proceedings/SC96PROC/JACOBSON/INDEX.HTM
which discusses an algorithm that, at least for some matrices, reduces
the exponent from 3 to 2.807, thus proving that it _can_ be done.
I hope this helps. Write back if you'd like to talk about this some
more, or if you have any other questions.
- Doctor Ian, The Math Forum
http://mathforum.org/dr.math/
|
Search the Dr. Math Library: |
[Privacy Policy] [Terms of Use]


Ask Dr. MathTM
© 1994-2008 The Math Forum
http://mathforum.org/dr.math/