Math Forum » Discussions » Software » comp.soft-sys.matlab

Topic: Sharing big data across parfor or spmd
Replies: 8   Last Post: Aug 6, 2013 4:48 AM

Chuck37

Posts: 93
Registered: 2/23/10
Re: Sharing big data across parfor or spmd
Posted: Dec 16, 2012 12:08 PM

If you can stand storing it N times, then my persistent variable trick seems to work. See my previous post in this thread.
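For readers skimming the thread, the persistent-variable trick presumably works along these lines (a hypothetical sketch; the function name bigDataStore is made up, and the actual post may differ). Each worker caches the data in a persistent variable inside a helper function, so M crosses the wire at most once per worker instead of on every parfor invocation:

    function out = bigDataStore(Min)
    % Cache large, read-only data in worker-local persistent storage.
    % Call once with the data to store it; call with no arguments to
    % retrieve the cached copy on subsequent parfor iterations.
    persistent M
    if nargin > 0
        M = Min;
    end
    out = M;
    end

Priming it once via spmd (spmd, bigDataStore(bigData); end) lets later parfor bodies call bigDataStore() and read the cached copy, at the cost of storing the data once per worker.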

"Haoran Xu" <haoran.x@gmail.com> wrote in message <kakd6h$elq$1@newscl01ah.mathworks.com>...
> The spmd error is:
> Error using distcompserialize
> Error during serialization
> Is this because the data is too large? If so, how can I get around it?
> Edric M Ellis <eellis@mathworks.com> wrote in message <ytwy5h3d5q1.fsf@uk-eellis0l.dhcp.mathworks.com>...

> > "Chuck37 " <chuck3737@yahooremovethis.com> writes:
> >

> > > I have a big piece of never changing data that I'd like to be used by
> > > all the workers in a parallel setting. Since I'm only working with
> > > local workers, I don't understand why I have to eat the huge latency
> > > associated with transferring the data to workers each time the parfor
> > > loop is called. Can someone explain?
> > >
> > > I tried to use spmd to send data once at the beginning and have it
> > > stay there, but the data is kind of big (~2 GB going to 10 workers),
> > > and I got "error during serialization" anyway. Is there a solution to
> > > this problem with local workers where they can all access the data
> > > from the same memory? Accesses are infrequent, so simultaneous access
> > > shouldn't cause a big slowdown.
> > >
> > > Any ideas would be great.
> > >
> > > My setup is something like this by the way:
> > >
> > > M = big data;
> > > for x = 1:m
> > >     stuff
> > >     parfor y = 1:n
> > >         a(y) = someFun(M, a(y));
> > >     end
> > >     stuff
> > > end
> > >
> > > Parfor is presently worse than 'for' because of the overhead from sending M every time.

> >
> > You could use my worker object wrapper which is designed for exactly
> > this sort of situation. See
> >
> > <http://www.mathworks.com/matlabcentral/fileexchange/31972-worker-object-wrapper>
> >
> > In your case, you could use it like this:
> >
> > spmd
> >     M = <big data>;
> > end
> > M = WorkerObjWrapper(M);
> >
> > for ...
> >     parfor ...
> >         a(y) = someFunction(M.Value, ...);
> >     end
> > end
> >
> > By building M on the workers directly and building the WorkerObjWrapper
> > from the resulting Composite, the data is never actually transmitted
> > over the wire at any stage, so you should experience no problems with
> > the current 2GB transfer limit.
> >
> > Cheers,
> >
> > Edric.
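
To make Edric's suggestion concrete, here is a minimal end-to-end sketch (sizes and the function name someFun are illustrative, not from the thread; WorkerObjWrapper is the File Exchange submission linked above). The key point is that the large array is constructed on each worker inside spmd, so the wrapper is built from the resulting Composite and nothing large is ever serialized:

    % Build the big array directly on each worker -- no transfer from
    % the client, so the 2 GB serialization limit is never hit.
    spmd
        M = rand(1e4);        % illustrative size; created worker-side
    end
    w = WorkerObjWrapper(M);  % wrap the Composite

    a = zeros(1, n);
    for x = 1:m
        parfor y = 1:n
            % w.Value resolves to the worker-local copy of M;
            % someFun is a hypothetical stand-in for the real work.
            a(y) = someFun(w.Value, a(y));
        end
    end

Because w holds only a lightweight handle on the client, the parfor loop no longer pays the per-call cost of broadcasting M.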





© Drexel University 1994-2014. All Rights Reserved.
The Math Forum is a research and educational enterprise of the Drexel University School of Education.