Since this is turning into an all purpose post, I'm going to crosspost to sci.math.symbolic. I want to start by saying that the heap method should be called "Johnson's algorithm". See http://portal.acm.org/citation.cfm?id=1086847
We've made contributions to improve it, but our actual work has been to improve the complexity of division and to do parallel algorithms. See http://www.cecm.sfu.ca/~rpearcea/
We're trying to popularize the approach because we think it's better, and the benefit of doing these basic operations well spill over into many other parts of computer algebra systems.
To that end, I put up some example code showing how we implemented heap multiplication. It's important to have the chaining optimization since this reduces the complexity to O(n^2) in the dense case. http://www.cecm.sfu.ca/~rpearcea/heapmul.zip
Also, the inevitable "is this open source" question is sure to pop up. No, it's not. It's just an short example showing how we did it to help anyone doing an implementation. It's also useful to compare against.
As Richard Fateman points out, there are better ways of multiplying polynomials so don't take this as the final word. Dense algorithms are useful, and other sparse algorithms can win in some situations. It may depend on the system and the implementation.
What I like about the algorithm is that it's fairly simple and it gets you to a good division algorithm quickly. This is something which can be a bigger problem since the naive algorithm that does repeated subtractions can be O(n^3) with multivariate polynomials. Other algorithms use O(n^2) memory. The heap is hard to beat.