> I am processing an image by using CUDA kernels. I am calling
> GPUoutput = feval(........ ), and everything is fine, no error
> messages appear. But when I call "output = gather(GPUoutput);" then
> I get an error message:
>
> "Error using parallel.gpu.GPUArray/gather
> An unexpected error occurred during CUDA execution. The CUDA error was:
> CUDA_ERROR_UNKNOWN."
CUDA kernel launches are asynchronous: the call returns before the actual computation is complete, so a problem in your kernel can show up later, when you try to access the resulting data. I suspect that is what is happening here.
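To illustrate the asynchronous behaviour outside of MATLAB, here is a minimal sketch of a standalone CUDA program. The kernel `scale`, its parameters and the sizes are hypothetical and chosen only for the example; they are not taken from your code. The point is that the launch itself returns immediately, and an error raised while the kernel runs is only reported once the host waits for the device.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel, used only for illustration.
__global__ void scale(float *data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= factor;
    }
}

int main()
{
    const int n = 1 << 20;
    float *d = nullptr;
    if (cudaMalloc(&d, n * sizeof(float)) != cudaSuccess) {
        printf("allocation failed\n");
        return 1;
    }
    cudaMemset(d, 0, n * sizeof(float));

    // The launch returns right away; the kernel may still be running.
    scale<<<(n + 255) / 256, 256>>>(d, n, 2.0f);

    // Reports problems with the launch itself (e.g. a bad configuration).
    cudaError_t launchErr = cudaGetLastError();

    // Blocks until the kernel has finished; errors that occur during
    // execution (e.g. an out-of-bounds access) are reported here.
    cudaError_t execErr = cudaDeviceSynchronize();

    printf("launch: %s\n", cudaGetErrorString(launchErr));
    printf("execution: %s\n", cudaGetErrorString(execErr));

    cudaFree(d);
    return (launchErr == cudaSuccess && execErr == cudaSuccess) ? 0 : 1;
}
```

In your case gather plays the role of the synchronization point, which is why the error is reported there rather than at the feval call.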
> The stranger fact is that when I call fewer blocks of
> threads to process only part of the image, but still gather
> back an image of the same size (which is simply partially black), the
> problem is gone.
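It sounds like when you try to work on the whole image, you may be reading or writing beyond the bounds of the image. That would also explain why launching fewer blocks makes the problem go away: none of the remaining threads reaches an out-of-range index.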
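The usual way to rule this out is a bounds check at the top of the kernel. The sketch below is only an assumption about what your kernel might look like: the name `processImage`, the `width` and `height` parameters, the single-precision pixel type and the per-pixel operation are all hypothetical. It also assumes one thread per pixel and MATLAB's column-major layout.

```cuda
// Hypothetical kernel: name, parameters and per-pixel operation are
// assumptions for illustration, not taken from the original post.
__global__ void processImage(const float *in, float *out,
                             int width, int height)
{
    // One thread per pixel, using a 2-D grid of 2-D blocks.
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    int row = blockIdx.y * blockDim.y + threadIdx.y;

    // The grid is normally rounded up to a whole number of blocks, so
    // some threads fall outside the image. They must do nothing.
    if (col >= width || row >= height) {
        return;
    }

    // MATLAB stores arrays in column-major order.
    int idx = col * height + row;

    // Placeholder per-pixel operation.
    out[idx] = 1.0f - in[idx];
}
```

If the image dimensions are not exact multiples of the thread block size, the last row and column of blocks contain threads beyond the image edge, and without a guard like this they read or write past the end of the arrays.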