> There's no reason you should be using FFTs with such a small kernel. Also, you're not exploiting the separability of the Gaussian. The separability allows you to convolve with a 1D Gaussian first along x, then along y, then along z, which is more efficient than consolidating into a 3D convolution kernel.
Ah, I did forget that indeed. Too wrapped up in the frequency domain...
Would like to point out that the FFT version made by 'John' is seriously faster than the original convn() approach!
Brilliant. So what would you recommend in terms of speed, either in general or for this problem specifically: do as you've done and combine as many dimensions as possible (Hxy and Hz), or separate out every dimension (separate Hx, Hy, and Hz)?
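
To make the question concrete, here is a minimal sketch of the fully separated option (Hx, Hy, Hz) next to the consolidated 3D kernel. It is written in Python/NumPy rather than MATLAB so it can be run standalone, and it is not the code from this thread: the function names, `sigma`, `radius`, and the test volume size are all assumptions made for illustration. It uses zero padding at the borders and assumes every dimension of the volume is at least as long as the kernel.

```python
import numpy as np

def gaussian_kernel_1d(sigma, radius):
    """Normalized 1D Gaussian sampled at integer offsets -radius..radius."""
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def convolve_along_axis(volume, kernel, axis):
    """Convolve each 1D line of `volume` along `axis` (zero-padded, same size)."""
    return np.apply_along_axis(
        lambda line: np.convolve(line, kernel, mode="same"), axis, volume
    )

def smooth_separable(volume, sigma, radius):
    """Fully separated option: one 1D pass per axis (Hx, then Hy, then Hz)."""
    k = gaussian_kernel_1d(sigma, radius)
    out = volume
    for axis in range(3):
        out = convolve_along_axis(out, k, axis)
    return out

def smooth_full_3d(volume, sigma, radius):
    """Reference: direct convolution with the consolidated 3D kernel."""
    k = gaussian_kernel_1d(sigma, radius)
    # Outer product of three 1D Gaussians gives the 3D Gaussian kernel.
    kernel3d = k[:, None, None] * k[None, :, None] * k[None, None, :]
    padded = np.pad(volume, radius)  # zero padding on every axis
    out = np.zeros_like(volume, dtype=float)
    n = 2 * radius + 1
    nx, ny, nz = volume.shape
    # Accumulate one shifted, weighted copy of the volume per kernel tap.
    # The kernel is symmetric, so correlation and convolution coincide.
    for i in range(n):
        for j in range(n):
            for l in range(n):
                out += kernel3d[i, j, l] * padded[i:i + nx, j:j + ny, l:l + nz]
    return out

# Usage: both routes should agree up to floating-point rounding.
rng = np.random.default_rng(0)
vol = rng.random((8, 9, 10))  # hypothetical test volume
same = np.allclose(smooth_separable(vol, 1.0, 2), smooth_full_3d(vol, 1.0, 2))
print("separable matches full 3D:", same)
```

In terms of multiplications per voxel for a kernel of width n = 2*radius + 1, the full 3D kernel needs n^3, a combined Hxy followed by Hz needs n^2 + n, and three separate 1D passes need 3n. For n = 5 that is 125, 30, and 15 respectively. These are operation counts only, not timings; which variant is actually fastest in practice will depend on the implementation and on memory access patterns.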