[CentralOH] embarrassingly parallel loops question
Samuel
sriveravi at gmail.com
Fri Jul 29 13:44:39 EDT 2016
Hello Group,
So I have this embarrassingly parallel number crunching I'm trying to do in
a for loop. In each iteration there is some crunching that is independent of
all other iterations, so I was able to set this up pretty easily using a
multiprocessing pool. (Side detail: each iteration depends on some common
data structures that I make global, which gives me the fastest crunch time
versus passing them to each worker process explicitly.) It takes about 30ms to run:
import multiprocessing

pool = multiprocessing.Pool(numCores)
results = pool.map(crunchFunctionIter, xrange(len(setN)))
Running on 1 core there is a tiny slowdown (~5ms of overhead, ~35ms to run).
Running on 2 cores I get about a 2x speedup (~18ms to run), which is great
and expected.
But the speedup saturates there and I can't get more juice even when upping
to 4 or 6 cores.
The thing is, all iterations are pretty much independent, so I don't see why,
in theory, I don't get close to a linear speedup, or at least an (N-1)x
speedup. My guess is there is something weird with the memory sharing that
is causing unnecessary overhead. Another colleague doing a similar
embarrassingly parallel problem saw the same saturation at about 2 cores.
Any thoughts on what is going on, or what I need to do to make this
embarrassingly parallel thing speedup linearly? Should I just use a
different library and set up my data structures a different way?
Thanks,
Sam