[Numpy-discussion] Using multiprocessing (shared memory) with numpy array multiplication
Sturla Molden
sturla at molden.no
Thu Jun 16 13:49:27 EDT 2011
On 16.06.2011 at 19:23, Robin wrote:
>
> If you are on Linux or Mac then fork works nicely so you have read
> only shared memory you just have to put it in a module before the fork
> (so before pool = Pool() ) and then all the subprocesses can access it
> without any pickling required. ie
> myutil.data = listofdata
> p = multiprocessing.Pool(8)
> def mymapfunc(i):
> return mydatafunc(myutil.data[i])
>
> p.map(mymapfunc, range(len(myutil.data)))
>
There is still the issue that p.map does not do any load balancing. The
processes in the pool might spend all their time battling for a mutex. We
must thus arrange things so that each call to mymapfunc processes a chunk
of data instead of a single item.
This is one strategy that works well:
With n remaining work items and m processes, and c the minimum chunk
size, let the process holding a mutex grab
max( c, n/m )
work items. Then n is reduced accordingly, and this continues until all
items are exhausted.
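The chunk-size arithmetic of this strategy can be sketched as a generator; the mutex-protected grabbing by worker processes is left out, and the function name `guided_chunks` is my own, not from the original post:

```python
def guided_chunks(n, m, c):
    """Yield (start, stop) bounds for n work items, m processes, and
    minimum chunk size c: each grab takes max(c, remaining // m) items,
    so chunks shrink as the work runs out."""
    start = 0
    while start < n:
        size = max(c, (n - start) // m)   # large chunks first, never below c
        size = min(size, n - start)       # do not run past the end
        yield (start, start + size)
        start += size

chunks = list(guided_chunks(100, 4, 5))
print(chunks)  # first grab is max(5, 100 // 4) = 25 items
```

Because early chunks are large and later ones shrink toward c, the mutex is taken far fewer times than once per item, while the small final chunks still balance the load.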
Also, as processes do not work on interleaved items, we avoid false
sharing as much as possible. (False sharing means that one processor
will write to a cache line used by another processor, so both must stop
what they're doing and synchronize cache with RAM.)
Sturla