[Python-Dev] multiprocessing vs. distributed processing

James Mills prologic at shortcircuit.net.au
Fri Jan 16 09:56:04 CET 2009


On Fri, Jan 16, 2009 at 6:30 PM, Matthieu Brucher
<matthieu.brucher at gmail.com> wrote:
> 2009/1/16 James Mills <prologic at shortcircuit.net.au>:
>> I've noticed over the past few weeks lots of questions
>> asked about multi-processing (some of them my own).
>
> Funny, I was going to blog about this, but not just for Python.
>
>> For those of you new to multi-processing, perhaps this
>> thread may help you. Some things I want to start off
>> with to point out are:
>>
>> "multiprocessing will not always help you get things done faster."
>
> Of course. There are some programs that are I/O or memory bandwidth
> bound. So if one of those bottlenecks is common to the cores you use,
> you can't benefit from their use.
>
>> "be aware of I/O bound applications vs. CPU bound"
>
> Exactly. We read a lot about Folding@home and SETI@home; they can be
> distributed, as it is more or less "take a chunk, process it somewhere
> and when you're finished, tell me if there is something interesting in it".
> Not a lot of communications between the nodes. Then, there are other
> applications that process a lot of data, they must read data from
> memory, make one computation, read other data, compute a little bit
> (finite difference schemes), and here we are memory bandwidth bound,
> not CPU bound.
>
>> "multiple CPUs (cores) can compute multiple concurrent expressions -
>> not read 2 files concurrently"
>
> Let's say that this is true for the usual computers. Clusters can make
> concurrent reads, as long as the right architecture is behind them.
> Of course, if you only have one hard disk, you are limited.
>
>> "in some cases, you may be after distributed processing rather than
>> multi or parallel processing"
>
> Of course. Clusters can be expensive, their interconnections even
> more. So if your application is made of independent blocks that can
> run on small nodes, without much I/O, you can try distributed
> computing. If you need big nodes with high-speed interconnections, you
> will have to use parallel processing.
>
> These are just my thoughts on the subject, but I think I'm not
> far from the truth. Of course, if I'm proved wrong, I'll be glad to
> hear why.

Thank you Matthieu for your response.
Very good comments on some of the points
I raised. Hopefully those interested in the topic
will learn from this thread.
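
For anyone who wants something concrete, here is a rough sketch of the
"take a chunk, process it somewhere and report back" pattern using the
standard multiprocessing module. The function names and the
sum-of-squares workload are made up purely for illustration; the only
assumption that matters is that each chunk can be processed without
talking to the others.

```python
from multiprocessing import Pool


def split_into_chunks(items, chunk_size):
    """Cut a list into independent chunks of at most chunk_size items."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def process_chunk(chunk):
    """CPU-bound work on one chunk; it needs nothing from the other chunks."""
    return sum(n * n for n in chunk)


def sum_of_squares_serial(items, chunk_size):
    """Baseline: process every chunk in the current process."""
    return sum(map(process_chunk, split_into_chunks(items, chunk_size)))


def sum_of_squares_parallel(items, chunk_size, processes=None):
    """Same computation, with the chunks handed to a pool of worker processes.

    processes=None lets Pool choose a worker count based on the CPU count.
    """
    with Pool(processes) as pool:
        return sum(pool.map(process_chunk, split_into_chunks(items, chunk_size)))
```

Call sum_of_squares_parallel() from under an
if __name__ == "__main__": guard, since some platforms re-import the
main module in each worker. For small inputs the serial version will
usually win, because starting workers and shipping chunks to them has
a cost of its own.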

cheers
James

PS: I assumed you meant to post back to the list and not just me :)


