"Harry Commin" wrote in message <email@example.com>...
> I cannot repeat your test, as I get an "Out of Memory" error. Furthermore, I require that A1 and A2 change on each iteration. Therefore, the KronProd() operation must also go inside the tic/toc.
>
> Perhaps you could copy and paste my code (for which I get comparable performance using a simple loop):
==============
OK. Well, if A1 and A2 have to change, then that will obviously increase overhead. However, I still find that the gap between KronProd and the other approaches widens as N1 and N2 grow. Here are the results of my tests
Single Loop   = 38.3979 secs
No Loop       = 31.6648 secs
Fast KronProd = 7.1126 secs
but with the modified code below. Note that I removed all of the ctranspose operations from the loops, because in practice they are avoidable and only add overhead. In particular, kron(A1,A2)' is computed much faster as kron(A1',A2'), and you could construct A1 and A2 as Qi-by-Ni from the start so that no transpose is needed at all. You should not need to transpose the final result either: if you want the result to be Q2-by-Q1 instead of Q1-by-Q2, apply kron(A2,A1) instead of kron(A1,A2). I've incorporated all of these ideas into the modified code.
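As a quick sanity check of the identities above, separate from the timing code, here is a minimal sketch. The sizes, the random test data, and the variable names X, Y, Z and err1..err3 are my own assumptions for illustration, with A1 taken as Q1-by-N1 and A2 as Q2-by-N2. The reshape convention in your own code may differ, so treat the dimensions in the comments as belonging to this sketch only.

```matlab
% Hypothetical small sizes, chosen only so the full kron() fits in memory.
N1 = 3; N2 = 4; Q1 = 2; Q2 = 5;
A1 = randn(Q1,N1) + 1i*randn(Q1,N1);   % assumed Q1-by-N1
A2 = randn(Q2,N2) + 1i*randn(Q2,N2);   % assumed Q2-by-N2
X  = randn(N2,N1) + 1i*randn(N2,N1);   % data, reshaped to N2-by-N1

% (1) ctranspose distributes over the Kronecker factors, so the large
%     product never needs to be transposed after it is formed.
err1 = norm(kron(A1,A2)' - kron(A1',A2'), 'fro');

% (2) Matrix-free product: kron(A1,A2)*X(:) equals the columns of
%     A2*X*A1.' stacked into a vector (Y is Q2-by-Q1 in this sketch).
Y    = A2*X*A1.';
err2 = norm(kron(A1,A2)*X(:) - Y(:));

% (3) Swapping the factors (and feeding the transposed data) returns the
%     transposed result directly, so no final transpose is required.
Xt   = X.';
Yt   = Y.';
Z    = kron(A2,A1)*Xt(:);
err3 = norm(Z - Yt(:));

fprintf('err1 = %g, err2 = %g, err3 = %g\n', err1, err2, err3);
```

All three errors should be at round-off level. Identity (2) is the usual reason a Kronecker product can be applied without ever forming kron(A1,A2) explicitly, which is where I would expect the memory and speed savings to come from as N1 and N2 grow.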