The Math Forum

Math Forum » Discussions » Software » comp.soft-sys.matlab


Topic: Sharing big data across parfor or spmd
Replies: 8   Last Post: Aug 6, 2013 4:48 AM


Posts: 95
Registered: 2/23/10
Re: Sharing big data across parfor or spmd
Posted: Dec 12, 2012 11:23 AM

Yeah, I found an alternative also, using persistent variables. e.g.

M = <big mxn array>

function Mout = getM(m,n)
persistent M
if isempty(M)
    <load M from file>
end
Mout = M(m,n);
end

Each worker loads the persistent variable the first time getM is called; after that, all reads go through getM instead of accessing M directly.

This works, but it doesn't address the problem of having to store a big variable (e.g.) 12 times, using up the associated memory. I was mistaken earlier: in my case the big array is 7.5 GB, so my machine can't even store it that many times over. For now I got around it because the matrix is sparse, so I can keep it in sparse form instead. Access is a little slower that way, I think. It still seems like workers on the same machine should be able to access the same memory.
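For what it's worth, the sparse workaround mentioned above might look something like this (a minimal sketch with a stand-in matrix, not the actual 7.5 GB data):

```
% Sketch only: Mfull stands in for the big, mostly-zero dense array.
Mfull = zeros(1e4);                            % 10000-by-10000 dense array
Mfull(randi(numel(Mfull), 1000, 1)) = rand(1000, 1);  % a few nonzeros

Msp = sparse(Mfull);   % store only the nonzeros plus their indices
clear Mfull            % free the dense copy before starting the pool

v = Msp(17, 42);       % element access works the same, just a bit slower
whos Msp               % inspect the reduced memory footprint
```

Since each worker gets its own copy, shrinking the per-copy footprint this way is what makes storing it 12 times tolerable.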


Edric M Ellis <> wrote in message <>...
> "Chuck37 " <> writes:

> > I have a big piece of never changing data that I'd like to be used by
> > all the workers in a parallel setting. Since I'm only working with
> > local workers, I don't understand why I have to eat the huge latency
> > associated with transferring the data to workers each time the parfor
> > loop is called. Can someone explain?
> >
> > I tried to use spmd to send data once at the beginning and have it
> > stay there, but the data is kind of big (~2 GB going to 10 workers),
> > and I got "error during serialization" anyway. Is there a solution to
> > this problem with local workers where they can all access the data
> > from the same memory? Accesses are infrequent, so simultaneous access
> > shouldn't cause a big slowdown.
> >
> > Any ideas would be great.
> >
> > My setup is something like this by the way:
> >
> > M = big data;
> > for x = 1:m
> >     stuff
> >     parfor y = 1:n
> >         a(y) = someFunction(M, a(y));
> >     end
> >     stuff
> > end
> >
> > Parfor is presently worse than 'for' because of the overhead from sending M every time.

> You could use my worker object wrapper which is designed for exactly
> this sort of situation. See
> <>
> In your case, you could use it like this:
> spmd
>     M = <big data>;
> end
> M = WorkerObjWrapper(M);
> for ...
>     parfor ...
>         a(y) = someFunction(M.Value, ...);
>     end
> end
> By building M on the workers directly and building the WorkerObjWrapper
> from the resulting Composite, the data is never actually transmitted
> over the wire at any stage, so you should experience no problems with
> the current 2GB transfer limit.
> Cheers,
> Edric.



© The Math Forum at NCTM 1994-2018. All Rights Reserved.