[Python-ideas] Concurrency Modules

Sven R. Kunze srkunze at mail.de
Sun Jul 26 12:07:05 CEST 2015


Thanks, Nikolaus. Mostly I refer to things Steve brought up in his 
analogies (two recent posts). So, I might have interpreted them the 
wrong way.

On 26.07.2015 02:58, Nikolaus Rath wrote:
> On Jul 25 2015, "Sven R. Kunze" <srkunze-7y4VAllY4QU at public.gmane.org> wrote:
>>                | processes               | threads                    | coroutines
>> startup impact | biggest                 | medium                     | smallest
>> cpu impact     | biggest                 | medium                     | smallest
>> memory impact  | biggest                 | medium                     | smallest
>> purpose        | cpu-bound tasks         | i/o-bound tasks            | ???
> I don't think any of these is correct. Unfortunately, I also don't think
> there even is a correct version, the differences are simply not so
> clear-cut.
I think that has already been discussed. We are just trying to boil it 
down to help people decide which module might be best for them.
> On Unix, Process startup-cost can be high if you do fork() + exec(), but
> if you just fork, it's as cheap as a thread.
Didn't know that. Thanks for clarifying. How do multiprocessing.Pool and 
multiprocessing.Process work in this regard?
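Answering part of my own question: as far as I can tell, 
multiprocessing defaults to fork() on Unix, and since Python 3.4 you 
can pick the start method explicitly via get_context(). A minimal 
sketch (the function name `work` is just for illustration):

    import multiprocessing as mp

    def work():
        print("hello from a child process")

    if __name__ == '__main__':
        # 'fork' (the Unix default) clones the running interpreter, so
        # startup is cheap; 'spawn' launches a fresh interpreter, which
        # is closer to fork() + exec() in cost.
        ctx = mp.get_context('fork')

        p = ctx.Process(target=work)
        p.start()
        p.join()

        # multiprocessing.Pool honours the same choice via the context:
        with ctx.Pool(processes=2) as pool:
            pool.apply(work)
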
> With asyncio, it's not
> clear to me what exactly you'd define as the "startup impact" (the
> creation of a future maybe? Or setting up the event loop?).
The purpose of the survey is to give developers an easy way to decide 
which approach might be suitable for them.
So, the definition of 'startup time' should be roughly equivalent 
across the approaches: what's necessary to get a process up and running 
a piece of code, compared to what's necessary to get asyncio up and 
running the same piece of code.

Steve: "Bakers aren't free, you have to pay for each one (memory, stack 
space), it will take time for each one to learn how your bakery works 
(startup time)"
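
To make that concrete, here is a rough, unscientific sketch of what I 
would measure as 'startup': the time from asking for a worker until it 
has run a trivial piece of code (event-loop setup included on the 
asyncio side, since that is part of getting it 'up and running'):

    import asyncio
    import multiprocessing as mp
    import threading
    import time

    def task():
        pass

    async def atask():    # Python 3.5 syntax; @asyncio.coroutine before
        pass

    if __name__ == '__main__':
        t0 = time.perf_counter()
        p = mp.Process(target=task)
        p.start()
        p.join()
        print('process:  ', time.perf_counter() - t0)

        t0 = time.perf_counter()
        th = threading.Thread(target=task)
        th.start()
        th.join()
        print('thread:   ', time.perf_counter() - t0)

        t0 = time.perf_counter()
        loop = asyncio.get_event_loop()   # include loop setup in the cost
        loop.run_until_complete(atask())
        print('coroutine:', time.perf_counter() - t0)
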
> "CPU impact" as a category doesn't make any sense to me. If you execute
> the same code it's going to take the same amount of (cumulative) CPU
> time, no matter if this code runs in a separate thread, separate
> process, or asynchronously.
From what I understand, switching contexts between threads or processes 
costs CPU time, whereas the event loop's cooperative switching does not 
cost nearly as much.
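
A crude way to see this is to bounce control back and forth N times, 
once with two threads (every hand-off goes through the OS scheduler) 
and once with two coroutines (every hand-off stays inside the event 
loop). A sketch, not a rigorous benchmark:

    import asyncio
    import threading
    import time

    N = 10000

    # Two threads handing control back and forth: every hand-off is an
    # OS-level context switch.
    a, b = threading.Event(), threading.Event()

    def ping():
        for _ in range(N):
            a.wait(); a.clear(); b.set()

    def pong():
        for _ in range(N):
            b.wait(); b.clear(); a.set()

    t1 = threading.Thread(target=ping)
    t2 = threading.Thread(target=pong)
    t0 = time.perf_counter()
    t1.start(); t2.start(); a.set()
    t1.join(); t2.join()
    print('threads:   ', time.perf_counter() - t0)

    # Two coroutines doing the same: every hand-off stays inside the
    # event loop, with no kernel involvement.
    async def aping(x, y):    # Python 3.5 syntax
        for _ in range(N):
            await x.wait()
            x.clear()
            y.set()

    ea, eb = asyncio.Event(), asyncio.Event()
    loop = asyncio.get_event_loop()
    ea.set()
    t0 = time.perf_counter()
    loop.run_until_complete(asyncio.wait([aping(ea, eb), aping(eb, ea)]))
    print('coroutines:', time.perf_counter() - t0)
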
> "memory impact" is probably highest for separate processes, but I don't
> see an obvious difference when using threads vs asyncio. Where did you
> get this from?
I can imagine that when the OS needs to manage threads, it incurs more 
overhead per thread than the Python interpreter does when suspending a 
coroutine. But that could be wrong. Do you have any material on this?
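
I don't have hard numbers either, but the asymmetry is easy to glimpse: 
each OS thread reserves its own stack (commonly megabytes of virtual 
memory), while a suspended coroutine is just a heap object. A rough 
sketch (sys.getsizeof understates the coroutine a bit, since its frame 
is a separate object, but it stays orders of magnitude below a thread 
stack):

    import sys
    import threading

    # 0 means "platform default", typically on the order of megabytes
    # of (virtual) stack reserved per thread.
    print('thread stack size setting:', threading.stack_size())

    async def coro():    # Python 3.5 syntax
        pass

    # A suspended coroutine is an ordinary Python object on the heap.
    c = coro()
    print('coroutine object:', sys.getsizeof(c), 'bytes')
    c.close()            # avoid the "never awaited" warning
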
> As far as purpose is concerned, pretty much the only limitation is that
> asyncio is not suitable for cpu-bound tasks. Any other combination is
> possible and also most appropriate in specific circumstances.
What exactly do you mean by any other combination?

I take from this that asyncio is suitable for heavily i/o-bound tasks, 
threads for mixed cpu/io-bound tasks, and processes for mainly 
cpu-bound tasks.
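
And the approaches do compose: for example, an asyncio program can push 
cpu-bound work into a process pool and keep serving i/o in the 
meantime. A minimal sketch using loop.run_in_executor() (the function 
name `crunch` is just for illustration):

    import asyncio
    from concurrent.futures import ProcessPoolExecutor

    def crunch(n):
        # cpu-bound: runs in a worker process, off the main interpreter
        return sum(i * i for i in range(n))

    async def main(loop, pool):    # Python 3.5 syntax
        # the event loop stays free for i/o while the pool crunches
        result = await loop.run_in_executor(pool, crunch, 10000000)
        print(result)

    if __name__ == '__main__':
        loop = asyncio.get_event_loop()
        with ProcessPoolExecutor() as pool:
            loop.run_until_complete(main(loop, pool))
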

Best,
Sven

