> Edric M Ellis <firstname.lastname@example.org> wrote in message <email@example.com>...
>> "Eric Zhang" <firstname.lastname@example.org> writes:
>>
>> > I am working on parallelization MATLAB computing. Strangely enough, my
>> > code sometimes works and sometimes fails, although untouched, with the
>> > following error.
>> >
>> > Error using parallel.internal.pool.deserialize (line 9)
>> > Bad version or endian-key
>> > [...]
>>
>> Hm, this is definitely not expected. Usually errors like this occur when
>> the data transfer between the workers and the MATLAB client is truncated
>> or corrupted in some way.
>>
> Hey Edric, thanks a lot for the reply, but I really tried my best to
> create a self-contained code that reproduces this error, but failed,
> because it actually involves calling the external software COMSOL.
>
> Although COMSOL is involved, I still believe this error comes from
> MATLAB parallel, because once I change parfor to normal for, it runs
> without any errors for days.
>
> By the way, I am on school's HPC, which means that the several workers
> may span over several nodes. Does that matter? After all, it works for
> hours before this error pops up,

Are you using an interactive parallel pool to do this, or is everything running on the cluster inside e.g. a 'batch' job? If you are using an interactive pool, it might be worth trying a 'batch' job instead as then there will be no communication from your host to the remote cluster.
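
In case it helps, here is a minimal sketch of what submitting the work as a 'batch' job might look like. The profile name 'myHpcProfile' and the function runComsolSweep are placeholders I made up for illustration; substitute your own cluster profile and a function that wraps your existing parfor loop. The pool size of 8 is likewise just an example.

```matlab
% Hypothetical names: 'myHpcProfile' and runComsolSweep are placeholders,
% not anything taken from your setup.
c = parcluster('myHpcProfile');   % cluster object for the HPC scheduler profile

% Submit runComsolSweep (which contains the parfor loop) to the cluster.
% 1 = number of outputs, {} = no input arguments, 'Pool', 8 = give the job
% a pool of 8 workers for its parfor (the job uses 9 workers in total,
% because one extra worker runs the function itself).
job = batch(c, @runComsolSweep, 1, {}, 'Pool', 8);

wait(job);                 % block until the job has finished
diary(job);                % show the command-window output captured by the job
out = fetchOutputs(job);   % cell array holding the function's outputs
result = out{1};

delete(job);               % remove the job's data from the cluster when done
```

With this arrangement the parfor traffic stays between the workers on the cluster, and your desktop client is only involved when submitting the job and fetching the results.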
If you haven't used it before, the batch reference page is here: