Topic: "Bad version or endian-key"?
Replies: 1   Last Post: Jul 28, 2014 3:44 AM

Edric Ellis

Re: "Bad version or endian-key"?
Posted: Jul 28, 2014 3:44 AM

"Eric Zhang" <ericzhangxiuming@gmail.com> writes:

> Edric M Ellis <eellis@mathworks.com> wrote in message <ytwoawhanxq.fsf@uk-eellis-deb7-64.dhcp.mathworks.com>...
>> "Eric Zhang" <ericzhangxiuming@gmail.com> writes:
>>

>> > I am working on parallelizing a MATLAB computation. Strangely enough, my
>> > code sometimes works and sometimes fails, although untouched, with the
>> > following error.
>> >
>> > Error using parallel.internal.pool.deserialize (line 9)
>> > Bad version or endian-key
>> > [...]

>>
>> Hm, this is definitely not expected. Usually errors like this occur when
>> the data transfer between the workers and the MATLAB client is truncated
>> or corrupted in some way.
>>

> Hey Edric, thanks a lot for the reply. I tried my best to create
> self-contained code that reproduces this error, but failed, because it
> involves calling the external software COMSOL.
>
> Although COMSOL is involved, I still believe this error comes from
> MATLAB's parallel computing, because once I change parfor to a normal
> for loop, it runs without any errors for days.
>
> By the way, I am on my school's HPC cluster, which means that the
> workers may span several nodes. Does that matter? After all, it works
> for hours before this error pops up.


Are you using an interactive parallel pool to do this, or is everything
running on the cluster inside e.g. a 'batch' job? If you are using an
interactive pool, it might be worth trying a 'batch' job instead as then
there will be no communication from your host to the remote cluster.

If you haven't used it before, the batch reference page is here:

<http://www.mathworks.com/help/distcomp/batch.html>

and you'll want to do something like

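c = parcluster(...); % get your HPC cluster (supply your own cluster profile)
% Run myFunction on the cluster with 2 outputs and a pool of 15 workers;
% one further worker runs myFunction itself, so the job uses 16 in total.
j = batch(c, @myFunction, 2, {args}, 'Pool', 15);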

where 'myFunction' contains your PARFOR loops etc.
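
As a rough illustration only, a function of that shape might look like the sketch below. The name myFunction comes from the call above, but its signature, its single input n, and the placeholder loop body are assumptions made up for this example; your real loop body would do the actual work.

```matlab
function [total, values] = myFunction(n)
% myFunction  Hypothetical stand-in for the function passed to batch.
% It returns two outputs, matching the "2" in the batch call above.
% The input n is an assumed argument, supplied via the {args} cell array.
values = zeros(1, n);
parfor k = 1:n
    % Placeholder computation; each iteration must be independent
    % of the others for PARFOR to accept the loop.
    values(k) = k^2;
end
total = sum(values);
end
```

Once the job has been submitted, wait(j) blocks until it finishes, fetchOutputs(j) returns the outputs as a cell array, diary(j) shows the text the job printed, and delete(j) removes the job's data from the cluster's storage when you are done with it.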

Cheers,

Edric.




© Drexel University 1994-2014. All Rights Reserved.
The Math Forum is a research and educational enterprise of the Drexel University School of Education.