Multi-threading on Multi-CPU machines

Alex Martelli aleax at aleax.it
Tue Jul 9 05:29:51 EDT 2002


Garry Taylor wrote:
        ...
> that this would make no difference? Do you have any tips/ideas how I
> could make use of multiple processors in a Python program?

Use multiple *processes* rather than multiple *threads* within just
one process.  Multiple processes running on the same machine can
share data very effectively via module mmap (you do need separate
process-synchronization mechanisms if the shared data structures
need to be written, of course), and you can use other fast
same-machine mechanisms such as pipes.  More general distributed
programming approaches (pyro, corba, etc etc) offer further
scalability, since they run across multiple machines on the same
network as well as within a single machine.  Optimal-performance
architectures
will be different for multiple processes than for a single multi-thread
process (and different for really-distributed versus single-machine),
but the key issue tends always to be, who shares / sends what data
with/to whom.  If your problem is highly parallelizable anyway, the
architectural distinction between multithread, multiprocess and
distributed can sometimes boil down to just choosing larger "slices"
to farm out to workers, reducing the per-slice communication overhead.
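
To make that "slice size" knob concrete, here is a minimal
sketch (the generator and its names are mine, purely for
illustration) of carving the index range into work requests;
slice_size is the one parameter you would tune to trade
communication overhead against load-balancing granularity:

    def slices(n_points, slice_size):
        # Yield (start, stop) work requests covering range(n_points).
        # Bigger slices mean less per-slice communication overhead,
        # but coarser-grained load balancing among the workers.
        for start in range(0, n_points, slice_size):
            yield start, min(start + slice_size, n_points)

The sketches further down all farm work out in contiguous
slices of this kind.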

Say for example that your task is to perform some pointwise
computation cpuintensivefunction(x) on each point x of some
huge array (assume without loss of generality the array is
one-dimensional -- the pointwise assumption allows that).
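
For concreteness in the sketches below, assume a stand-in
such as the following (the function body here is hypothetical --
in real use it would be your actual computation, ideally
C-coded):

    def cpuintensivefunction(x):
        # hypothetical stand-in for the real per-point computation
        return x * x * x - 2.0 * x

    bigarray = [float(i) for i in range(10**6)]   # the huge 1-D array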

With a multithreaded approach you might keep the array in memory
and have the main thread farm out work requests to worker threads
via a bounded queue.  You want the queue a bit larger than the
number of worker threads, and you can determine the optimal size
for a work request (could be one item, or maybe two, or, say, 4)
via some benchmarking.  Upon receiving a work request from the
Queue, a worker thread would:
        -- get a local copy of the relevant points from the
           large array,
        -- enter the C-coded computation function which
           -- releases the GIL,
           -- does the computations, getting the new points,
           -- acquires the GIL again,
        -- put back the resulting new points to the same area
           of the large array where the input came from,
then go back to peel one more work request from the Queue.
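
Here is a minimal sketch of that loop, using the standard
threading and queue modules (module names as in current
Python); with the pure-Python stand-in from above it only
illustrates the plumbing -- actual speedup needs the
GIL-releasing C-coded function in the workers:

    import threading, queue

    def worker(work_q, big_array):
        while True:
            req = work_q.get()
            if req is None:                  # sentinel: no more work
                return
            start, stop = req
            local = big_array[start:stop]    # local copy of the points
            # the C-coded function would release the GIL around this:
            results = [cpuintensivefunction(x) for x in local]
            big_array[start:stop] = results  # put the new points back

    def run_threaded(big_array, n_workers=4, slice_size=4):
        # bounded queue, a bit larger than the number of workers
        work_q = queue.Queue(maxsize=n_workers + 2)
        threads = [threading.Thread(target=worker,
                                    args=(work_q, big_array))
                   for _ in range(n_workers)]
        for t in threads:
            t.start()
        for start in range(0, len(big_array), slice_size):
            work_q.put((start, min(start + slice_size, len(big_array))))
        for _ in threads:
            work_q.put(None)                 # one sentinel per worker
        for t in threads:
            t.join()

Since the workers write back to disjoint slices, no further
locking of the big array is needed.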

If you can't release the GIL during the computation, e.g.
because your computation is in Python or anyway requires you
to interact with the interpreter, then multithreading will
give no speedup and should not be used for that purpose.

A similar architecture might work for a single-machine
multiprocess design IF multiple processes can use mmap to read and
write different regions of a shared-memory array at the same
time, without locking (I don't think mmap ensures that on all
platforms, alas).  "Get the next work request" would become
a bit less simple than just peeling an item off a queue, which
makes it likely that a rather larger work-request size might
be optimal.  Depending on what guarantees you can count on for
simultaneous reads and writes from/to pipes or message queues,
those might provide the Queue equivalents.
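
Here is a minimal sketch under that assumption (Unix-like
platform, anonymous mmap shared across os.fork -- again, not
guaranteed everywhere); it pre-partitions the array into one
contiguous region per process, which sidesteps the "next work
request" problem entirely at the cost of static load balancing:

    import mmap, os, struct

    ITEM = struct.Struct("d")               # one C double per point

    def run_multiprocess(points, n_procs=4):
        n = len(points)
        buf = mmap.mmap(-1, n * ITEM.size)  # anonymous, shared on fork
        for i, x in enumerate(points):
            ITEM.pack_into(buf, i * ITEM.size, x)
        pids = []
        for k in range(n_procs):
            start, stop = k * n // n_procs, (k + 1) * n // n_procs
            pid = os.fork()
            if pid == 0:                    # child: only its own region
                for i in range(start, stop):
                    x, = ITEM.unpack_from(buf, i * ITEM.size)
                    ITEM.pack_into(buf, i * ITEM.size,
                                   cpuintensivefunction(x))
                os._exit(0)
            pids.append(pid)
        for pid in pids:
            os.waitpid(pid, 0)              # wait for all the children
        return [ITEM.unpack_from(buf, i * ITEM.size)[0]
                for i in range(n)]

Dynamic assignment would instead need, say, a lock-protected
"next slice" counter kept in the shared map, or a pipe per
worker carrying the work requests.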

Alternatively, wrap the data with a dedicated process which
knows how to respond to requests for "next still-unassigned
slice of work please" and (no return-acknowledgment needed)
"here's the new computed data for the slice at coordinate X".
pyro might be a good mechanism for such a task, and it would
scale from one multi-CPU running multiple processes to a
network (you might want to build in sanity checking, most
particularly for the network case -- if a node goes down,
then after a while without a response from it the slices that
had been assigned to it should be farmed out to others...).
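
Here is a minimal sketch of such a dedicated process, using the
standard library's XML-RPC support (module paths as in today's
Python) as a plain stand-in for pyro; the Coordinator class and
its method names are mine, purely illustrative:

    import threading
    from xmlrpc.server import SimpleXMLRPCServer

    class Coordinator:
        def __init__(self, points, slice_size=64):
            self.points = list(points)
            self.slice_size = slice_size
            self.next_start = 0
            self.lock = threading.Lock()

        def next_slice(self):
            # next still-unassigned slice, or False when none is left
            with self.lock:
                start = self.next_start
                if start >= len(self.points):
                    return False
                stop = min(start + self.slice_size, len(self.points))
                self.next_start = stop
                return start, self.points[start:stop]

        def put_result(self, start, values):
            # no return-acknowledgment needed beyond the RPC reply
            with self.lock:
                self.points[start:start + len(values)] = values
            return True

    if __name__ == "__main__":
        server = SimpleXMLRPCServer(("0.0.0.0", 8000),
                                    logRequests=False)
        server.register_instance(Coordinator(range(100000)))
        server.serve_forever()

Each worker, on the same machine or anywhere on the network,
just loops:

    from xmlrpc.client import ServerProxy

    coord = ServerProxy("http://localhost:8000")  # coordinator host
    while True:
        req = coord.next_slice()
        if not req:
            break
        start, values = req
        coord.put_result(start,
                         [cpuintensivefunction(x) for x in values])

The sanity checking mentioned above would live in next_slice:
remember when each slice was handed out, and hand it out again
if its put_result fails to arrive within some timeout.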


Of course, most parallel-computing cases are far more
intricate than simple albeit CPU-intensive computations
on a pointwise basis, but I hope this very elementary
overview can still help!-)


Alex



