Chuck37
Posts: 84
Registered: 2/23/10

Re: Sharing big data across parfor or spmd
Posted: Dec 12, 2012 11:23 AM

Yeah, I found an alternative also, using persistent variables. e.g.
M = <big mxn array>
function Mout = getM(m,n)
    persistent M
    if isempty(M)
        <load M from file>
    end
    Mout = M(m,n)
end
Each worker loads M into its persistent variable the first time it calls getM, and every read from M then goes through getM instead of direct access.
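For anyone who wants to try this, here is a minimal runnable sketch of the same idea. The file name bigM_demo.mat, the small magic(4) stand-in for the big array, and the use of tempdir are all assumptions for illustration only, and local functions in a script need R2016b or later.

```matlab
% Sketch only: cache a read-only array in a persistent variable.
% bigM_demo.mat and magic(4) are hypothetical stand-ins for the real data.
matFile = fullfile(tempdir, 'bigM_demo.mat');
M = magic(4);              % small stand-in for the big m-by-n array
save(matFile, 'M');        % write it once so any worker can load it
clear M

a = getM(2, 3);            % first call in this process loads the file
b = getM(4, 1);            % later calls reuse the cached copy

function Mout = getM(m, n)
    % The persistent variable lives for the lifetime of this MATLAB
    % process, so each worker pays the load cost only once.
    persistent M
    if isempty(M)
        S = load(fullfile(tempdir, 'bigM_demo.mat'), 'M');
        M = S.M;
    end
    Mout = M(m, n);
end
```

Inside a parfor loop the body would call getM(m,n) rather than indexing M, so M is never sent to the workers. Each worker still holds its own full copy.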
This works, but it doesn't address the problem of storing a big variable once per worker (e.g. 12 times) and using up the associated memory. I was mistaken about the size earlier: my big array is actually 7.5 GB, so my machine can't even hold that many copies. For now I got around it because the matrix is sparse, so I can store it in sparse form instead. I think access is a little slower that way, though. It still seems like workers on the same machine should be able to access the same memory.
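To give a rough feel for the sparse trade-off, here is a small sketch comparing the storage of a full and a sparse copy of the same matrix. The size n = 2000 and the diagonal fill pattern are made up for illustration, and the real savings depend entirely on how many nonzeros the actual matrix has.

```matlab
% Sketch only: compare memory for full vs. sparse storage of one matrix.
% n and the diagonal pattern are hypothetical, not the real data.
n = 2000;
F = zeros(n);
F(1:n+1:end) = 1;          % only the diagonal is nonzero
S = sparse(F);             % same values, sparse storage

infoF = whos('F');
infoS = whos('S');
fprintf('full:   %d bytes\n', infoF.bytes);
fprintf('sparse: %d bytes\n', infoS.bytes);

x = full(S(5, 5));         % element access works the same way
```

Indexing into a sparse matrix has to search the stored nonzeros, which is likely where the slightly slower access comes from.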
Thanks.
Edric M Ellis <eellis@mathworks.com> wrote in message <ytwy5h3d5q1.fsf@uk-eellis0l.dhcp.mathworks.com>...
> "Chuck37 " <chuck3737@yahooremovethis.com> writes:
>
> > I have a big piece of never changing data that I'd like to be used by
> > all the workers in a parallel setting. Since I'm only working with
> > local workers, I don't understand why I have to eat the huge latency
> > associated with transferring the data to workers each time the parfor
> > loop is called. Can someone explain?
> >
> > I tried to use spmd to send data once at the beginning and have it
> > stay there, but the data is kind of big (~2 GB going to 10 workers),
> > and I got "error during serialization" anyway. Is there a solution to
> > this problem with local workers where they can all access the data
> > from the same memory? Accesses are infrequent, so simultaneous access
> > shouldn't cause a big slowdown.
> >
> > Any ideas would be great.
> >
> > My setup is something like this by the way:
> >
> > M = big data;
> > for x = 1:m
> >     stuff
> >     parfor y = 1:n
> >         a(y) = function(M,a(y));
> >     end
> >     stuff
> > end
> >
> > Parfor is presently worse than 'for' because of the overhead from sending M every time.
>
> You could use my worker object wrapper which is designed for exactly
> this sort of situation. See
>
> <http://www.mathworks.com/matlabcentral/fileexchange/31972-worker-object-wrapper>
>
> In your case, you could use it like this:
>
> spmd
>     M = <big data>;
> end
> M = WorkerObjWrapper(M);
>
> for ...
>     parfor ...
>         a(y) = someFunction(M.Value, ...);
>     end
> end
>
> By building M on the workers directly and building the WorkerObjWrapper
> from the resulting Composite, the data is never actually transmitted
> over the wire at any stage, so you should experience no problems with
> the current 2GB transfer limit.
>
> Cheers,
>
> Edric.