examples of realistic multiprocessing usage?

TomF tomf.sessile at gmail.com
Sun Jan 16 23:39:51 EST 2011


On 2011-01-16 19:16:15 -0800, Dan Stromberg said:

> On Sun, Jan 16, 2011 at 11:05 AM, TomF <tomf.sessile at gmail.com> wrote:
>> I'm trying to multiprocess my python code to take advantage of multiple
>> cores.  I've read the module docs for threading and multiprocessing, and
>> I've done some web searches.  All the examples I've found are too simple:
>> the processes take simple inputs and compute a simple value.  My problem
>> involves lots of processes, complex data structures, and potentially lots of
>> results.  It doesn't map cleanly into a Queue, Pool, Manager or
>> Listener/Client example from the python docs.
>> 
>> Instead of explaining my problem and asking for design suggestions, I'll
>> ask: is there a compendium of realistic Python multiprocessing examples
>> somewhere?  Or an open source project to look at?
> 
> I'm unaware of a big archive of projects that use multiprocessing, but
> maybe one of the free code search engines could help with that.
> 
> It sounds like you're planning to use mutable shared state, which is
> generally best avoided in concurrent programming if at all possible,
> because mutable shared state tends to slow things down quite a bit.
> 
I'm trying to avoid mutable shared state since I've read the cautions
against it.  I think it's possible for each worker to compute changes
and return them to the parent (and have the parent coordinate all
changes) without too much overhead.  So far it looks like
multiprocessing.Pool.apply_async is the best match for what I want.
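
A minimal sketch of that pattern, under assumptions of my own: the
names compute_changes and apply_all, the dict-shaped state, and the
"sum the values" work are all hypothetical stand-ins.  Workers only
read their arguments and return a change; the parent alone mutates
the state.

```python
import multiprocessing


def compute_changes(key, values):
    """Worker (hypothetical): derive a change from its inputs only."""
    return key, sum(values)


def apply_all(pool, state, work):
    """Parent: submit every task, then merge each returned change."""
    pending = [pool.apply_async(compute_changes, (key, values))
               for key, values in work]
    for result in pending:
        key, delta = result.get()  # blocks until that task has finished
        state[key] = state.get(key, 0) + delta  # only the parent mutates
    return state


if __name__ == "__main__":
    work = [("a", [1, 2, 3]), ("b", [4, 5]), ("a", [10])]
    with multiprocessing.Pool(processes=2) as pool:
        print(apply_all(pool, {}, work))
```

Arguments and return values are pickled on their way between
processes, so this probably only pays off when each task does much
more work than it costs to ship its inputs and its changes.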

One difficulty is that there is a queue of work to be done and a queue 
of results to be incorporated back into the parent; there is no 
one-to-one correspondence between the two.  It's not obvious to me how 
to coordinate the queues in a natural way to avoid deadlock or 
starvation.
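
One possible shape for this, sketched under assumptions: the names
process and run are hypothetical, and a task here returns a list of
zero or more integers, some of which become new work.  Submitting via
apply_async does not block, and the parent waits on the result queue
only while at least one task is in flight, so the loop should not be
able to stall waiting on itself.

```python
import multiprocessing
import queue


def process(n):
    """Hypothetical task: returns zero or more results for one input."""
    return [n // 2, n // 3] if n >= 2 else []


def run(pool, initial):
    """Parent loop: feed work, drain results, schedule follow-up work."""
    done = queue.Queue()  # thread-safe; filled by apply_async callbacks
    todo = list(initial)
    in_flight = 0
    results = []
    while todo or in_flight:
        while todo:  # submitting never blocks the parent
            pool.apply_async(process, (todo.pop(),),
                             callback=done.put, error_callback=done.put)
            in_flight += 1
        outcome = done.get()  # safe: at least one task is in flight here
        in_flight -= 1
        if isinstance(outcome, BaseException):
            raise outcome  # surface worker failures instead of hanging
        for item in outcome:
            results.append(item)
            if item >= 2:  # assumed rule: some results are also new work
                todo.append(item)
    return results


if __name__ == "__main__":
    with multiprocessing.Pool(processes=2) as pool:
        print(sorted(run(pool, [4, 6])))
```

The in_flight counter is what ties the two queues together: the loop
ends only when there is no work left to submit and nothing left to
come back.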

> 
> But if you must have mutable shared state that's more complex than a
> basic scalar or homogeneous array, I believe the multiprocessing
> module would have you use a "server process manager".

I've looked into Manager but I don't really understand the trade-offs.
-Tom



