
Thank you, Andrew, for your answer.

@Akira: Measure, profile, and benchmark your projects: the learning curve is steeper, but in the end you'll be able to filter the ideas from the community more easily for your own projects.
A lot of "good" practices are counter-efficient, micro-services among them: if you push the micro-services pattern to the extreme, you'll add latency, because you'll generate more internal traffic for a single HTTP request. That doesn't mean you must have a monolithic daemon, only that you should slice your services pragmatically.

I have a concrete example of an open source product that abuses this pattern, and where I've measured concrete efficiency impacts before and after the introduction of microservices. I can't cite its name because we use it in production, and I want to keep a good relationship with them.


--
Ludovic Gasc (GMLudo)

2015-08-03 3:08 GMT+02:00 Andrew Barnert via Python-ideas <python-ideas@python.org>:
On Aug 2, 2015, at 10:09, Akira Li <4kir4.1i@gmail.com> wrote:
>
> Ludovic Gasc <gmludo@gmail.com> writes:
>
>> 2015-07-29 8:29 GMT+02:00 Sven R. Kunze <srkunze@mail.de>:
>>
>>> Thanks Ludovic.
>>>
>>> On 28.07.2015 22:15, Ludovic Gasc wrote:
>>>
>>> Hello,
>>>
>>> This discussion is pretty interesting to try to list when each
>>> architecture is the most efficient, based on the need.
>>>
>>> However, just a small precision: multiprocessing/multi-worker isn't
>>> incompatible with AsyncIO: you can have an event loop in each process to
>>> combine the "best" of both "worlds".
>>> As usual in IT, it isn't a silver bullet that will cure cancer; however,
>>> at least to my understanding, it should be useful for some business needs
>>> like server daemons.
>>>
>>>
>>> I think that should be clear for everybody using any of these modules. But
>>> you are right to point it out explicitly.
>>
>> Based on my discussions at EuroPython and PyCon US, it's certainly clear
>> to the middle-class management of the Python community, however, not really
>> to the typical Python end-dev: several people tried to troll me by claiming
>> that multiprocessing is more efficient than AsyncIO.
>>
>> To me, it was an opportunity to transform the negative troll attempt into a
>> positive exchange about efficiency, and to understand before trolling ;-)
>> More seriously, I have the feeling that it isn't very clear to everybody,
>> especially to newcomers.
>
> Do you mean those trolls that measure first then make
> conclusions ;)
>
> Could you provide an evidence-based description of the issue such as
> http://www.mailinator.com/tymaPaulMultithreaded.pdf
> but for Python?

The whole point of that post, and of the older von Behrens paper it references, is that a threading-like API can be built that uses explicit cooperative threading and dynamic stacks, and that avoids all of the problems with threads while retaining almost all of the advantages.

That sounds great. Which is probably why it's exactly what Python asyncio does. Just like von Behrens's thread package, it uses an event loop around poll (or something better) to drive a scheduler for coroutines. The only difference is that Python has coroutines natively, unlike Java or C, and with a nice API, so there's no reason to hide that API. (But if you really want to, you can just use gevent without its monkeypatching library, and then you've got an almost exact equivalent.)
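To make that concrete, here's a minimal sketch of the scheduling model (written in the modern async/await syntax, which postdates this thread; the yield-from style of the day is equivalent): one event loop interleaves several coroutines cooperatively, each yielding control at every await, with no OS threads involved.

```python
import asyncio

async def worker(name, delay):
    # Each coroutine yields control at every await, so one event loop
    # interleaves many "threads" of work without OS threads or locks.
    await asyncio.sleep(delay)
    return f"{name} done after {delay}s"

async def main():
    # Run three coroutines concurrently on one loop; total wall time is
    # about 0.3s (the longest delay), not the 0.6s sum -- cooperative,
    # not sequential.
    return await asyncio.gather(
        worker("a", 0.1), worker("b", 0.2), worker("c", 0.3)
    )

print(asyncio.run(main()))
```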

In other words, in the terms used by mailinator, asyncio is exactly the thread package they suggest using instead of an event package. Their evidence shows that something like asyncio can be built for Java, and we don't need evidence that something like asyncio could be built for Python, because Guido already built it. You could compare asyncio with the coroutine API to asyncio with the lower-level callback API (or Twisted with inline callbacks to Twisted with coroutines, etc.), but what would be the point?

Of course multiprocessing vs. asyncio is a completely different question. Now that we have reasonably similar, well-polished APIs for both, people can start running comparisons. But it's pretty easy to predict what they'll find: for some applications, multiprocessing is better; for others, asyncio is better; for others, a simple combination of the two easily beats either alone; for others, it really doesn't make much difference because concurrency isn't even remotely the key issue. The only thing that really matters to anyone is which is better for _their_ application, and that's something you can't extrapolate from a completely different test any better than you can guess it.
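That "simple combination of the two" can be sketched in a few lines (a toy workload, assumed for illustration; modern async/await syntax, which postdates this thread): each worker process runs its own event loop, giving CPU parallelism across processes and I/O concurrency within each one.

```python
import asyncio
import multiprocessing

async def handle(task_id):
    # Stand-in for I/O-bound work (e.g. a socket read) inside one worker.
    await asyncio.sleep(0.01)
    return task_id * 2

def worker(task_ids):
    # Each process runs its own event loop: the loop multiplexes the
    # I/O-bound coroutines, while multiprocessing spreads the chunks
    # across CPU cores.
    async def run_all():
        return await asyncio.gather(*(handle(t) for t in task_ids))
    return asyncio.run(run_all())

if __name__ == "__main__":
    with multiprocessing.Pool(2) as pool:
        chunks = [[0, 1, 2], [3, 4, 5]]
        print(pool.map(worker, chunks))  # -> [[0, 2, 4], [6, 8, 10]]
```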
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/