2015-07-29 8:29 GMT+02:00 Sven R. Kunze <srkunze@mail.de>:
Thanks Ludovic.

On 28.07.2015 22:15, Ludovic Gasc wrote:

This discussion is pretty interesting as an attempt to list when each architecture is the most efficient, based on the need.

However, just a small precision: multiprocess/multiworker isn't mutually exclusive with AsyncIO: you can have an event loop in each process to try to combine the "best" of both worlds.
As usual in IT, it isn't a silver bullet that will cure cancer; however, at least to my understanding, it should be useful for some business needs like server daemons.
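As a rough sketch of that pattern (all names, the chunking, and the sleep duration are invented for the example; the "fork" start method keeps it self-contained but Unix-only):

```python
import asyncio
import multiprocessing

async def fetch(n):
    # Stand-in for an I/O-bound call (network, database, ...)
    await asyncio.sleep(0.01)
    return n * 2

def worker(numbers, queue):
    # One event loop per process: asyncio handles I/O concurrency
    # inside the worker, while multiprocessing provides CPU parallelism.
    loop = asyncio.new_event_loop()
    try:
        results = loop.run_until_complete(
            asyncio.gather(*(fetch(n) for n in numbers)))
    finally:
        loop.close()
    queue.put(results)

def run_workers(chunks):
    # "fork" is Unix-only but lets the children inherit these functions.
    ctx = multiprocessing.get_context("fork")
    queue = ctx.Queue()
    procs = [ctx.Process(target=worker, args=(chunk, queue))
             for chunk in chunks]
    for p in procs:
        p.start()
    results = [queue.get() for _ in procs]  # one result list per worker
    for p in procs:
        p.join()
    return results
```

Each worker drains its own chunk of I/O-bound tasks concurrently on its private loop, and the processes themselves run in parallel.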

I think that should be clear for everybody using any of these modules. But you are right to point it out explicitly.

Based on my discussions at EuroPython and PyCon US, it's certainly clear to the middle management of the Python community; however, not really to the typical Python end developer: several people tried to troll me with claims that multiprocessing is more efficient than AsyncIO.

To me, it was an opportunity to turn the negative troll attempt into a positive exchange about efficiency, and about understanding before trolling ;-)
More seriously, I have the feeling that it isn't very clear to everybody, especially to newcomers.
It isn't a crazy new idea; this design pattern has been implemented for a long time, at least in Nginx: http://www.aosabook.org/en/nginx.html

If you are interested in using this design pattern to build an HTTP server only, you can easily use aiohttp.web+gunicorn: http://aiohttp.readthedocs.org/en/stable/gunicorn.html
If you want to use any AsyncIO server protocol (aiohttp.web, panoramisk, asyncssh, irc3d), you can use API-Hour: http://www.api-hour.io
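For the aiohttp.web+gunicorn combination, the deployment boils down to one command along the lines of the linked documentation (a sketch only: the module name myapp, the bind address and the worker count are assumptions for the example):

```shell
# Serve the aiohttp.web Application object "app" defined in myapp.py,
# with one event loop per worker process (4 workers assumed here).
gunicorn myapp:app --bind 0.0.0.0:8080 \
    --worker-class aiohttp.GunicornWebWorker \
    --workers 4
```

Each gunicorn worker is a separate process running its own event loop, which is exactly the multiworker + AsyncIO combination described above.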

And if you want to implement this design pattern yourself, be my guest: if a Python peon like me could implement API-Hour, everybody on this mailing list can do it.

For communication between workers, I use Redis; however, you have plenty of solutions to do that.
As usual, before selecting a communication mechanism you should benchmark based on your use cases: some results may surprise you.
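For instance, a micro-benchmark of just the serialization step of worker messages can already be surprising (a sketch: the payload and iteration count are invented, and a real benchmark should also exercise the transport itself: Redis, pipes, sockets, ...):

```python
import json
import pickle
import timeit

# Invented sample of a message exchanged between workers
payload = {"user": 42, "events": list(range(100))}

def bench(label, dump, load, n=1000):
    # Time n serialize + deserialize round-trips for one candidate format
    seconds = timeit.timeit(lambda: load(dump(payload)), number=n)
    return label, seconds

results = [
    bench("pickle", pickle.dumps, pickle.loads),
    bench("json", lambda obj: json.dumps(obj).encode(),
          lambda raw: json.loads(raw.decode())),
]
for label, seconds in results:
    print(label, round(seconds, 4))
```

Which format wins depends heavily on the shape of your payloads, which is exactly why measuring your own use case beats folklore.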

I hope not to disappoint you.

Don't worry about that; don't hesitate to "hit", I have a very strong shield against disappointments ;-)
I actually strive not to do that manually for each tiny bit of program

You're right, micro-benchmarks aren't a good approach for deciding the macro architecture of an application.
(assuming there are many places in the code base where a project could benefit from concurrency).

As usual, it depends on your architecture/needs.
If you do a lot more network I/O than CPU work, the time spent waiting on the network argues for more concurrency.
Personally, I use benchmarks for optimizing problematic code.

But if Python were able to do that without me choosing the right, correctly configured approach (to be determined by benchmarks), that would be awesome. As usual, that needs time to evolve.

It should be technically possible; however, I don't believe much in implicit, hidden optimizations for the end developer: it's very complicated to hide the magic, few people have the skills to implement it, and the day you have an issue, you're almost alone.
See PyPy: certainly one day they will provide a good solution for that; however, it isn't trivial to implement, see how much time they have needed.

Over time, I believe more and more in educating developers, helping them to understand the big picture and to use optimizations explicitly: the learning curve is steeper; however, in the end, you have more autonomous developers who will solve more problems and be less afraid to break the standard frame to innovate.

I don't have scientific proof of that; it's only a feeling.
However, again, the two approaches aren't mutually exclusive: each time we get an automagic optimization without side effects, like computed gotos, I will use it.

I found that benchmark-driven improvements do not last forever, unfortunately, and that most of the time nobody is able to keep track of everything. So, as soon as something changes, you need to start anew. That is not acceptable to me.

I fully agree with you: as long as it works, don't break it just for the pleasure of it.
Moreover, instead of trashing your full stack for efficiency reasons (for example, dropping all your Python code to migrate to Go), where you need to relearn everything, you should maybe first find a solution within your actual stack.
At least for me, it was far less complicated to migrate to Python 3, the multiworker pattern and AsyncIO than it would have been to migrate to Go/NodeJS/Erlang/...
Moreover, with a niche language, it's more complicated to find developers and harder to spot impostors:
some people use little-used alternative languages only to try to convince others that they are good developers.
Another solution is to add more servers to handle the load, but that isn't always the solution with the smallest TCO; don't forget to count sysadmin costs and the added complexity of debugging when you have an issue in production.
Btw. that is also the reason why I said recently (in another topic on this list), 'if Python could optimize that without my attention, that would be great'. The simplest solution, and therefore the easiest for all team members to comprehend, is the way to go.

Again, I strongly agree with you; however, given the age of Python and the big performance community we have (PyPy, Numba, Cython, Pyston...), I believe that fewer and fewer automagic solutions without side effects will be found. Not impossible, but harder and harder (I secretly hope that somebody will prove me wrong ;-) )
Maybe we could "steal" some optimizations from other languages?
I don't have the technical level to help with that; I'm more a business-logic dev than a low-level dev.
If that is not efficient enough, that is actually a Python issue. Readability counts most. And fortunately, in most cases that attitude works perfectly with Python. :)

Again and again, I agree with you: the combination of the size of the community (a big toolbox and a lot of developers) and the readability that makes Python newcomer-friendly is clearly a big win-win, at least to me.
The only issue I had was efficiency: with the success of our company, we couldn't let the programming language/framework stop us from quickly building efficient daemons, which is why I quickly dropped PHP and Ruby in the past.
Now, with our new stack, based on the trusted predictions of our fortune-telling telephony service department, we could survive a long time before replacing some Python parts with C or anything else.

Have a nice week-end.

Have a nice week.

PS: Thank you everybody for EuroPython, it was amazing ;-)

Ludovic Gasc (GMLudo)

2015-07-26 23:26 GMT+02:00 Sven R. Kunze <srkunze@mail.de>:
Next update:

Improving Performance by Running Independent Tasks Concurrently - A Survey

               | processes               | threads                    | coroutines
purpose        | cpu-bound tasks         | cpu- & i/o-bound tasks     | i/o-bound tasks
               |                         |                            |
managed by     | os scheduler            | os scheduler + interpreter | customizable event loop
controllable   | no                      | no                         | yes
               |                         |                            |
parallelism    | yes                     | depends (cf. GIL)          | no
switching      | at any time             | after any bytecode         | at user-defined points
shared state   | no                      | yes                        | yes
               |                         |                            |
startup impact | biggest/medium*         | medium                     | smallest
cpu impact**   | biggest                 | medium                     | smallest
memory impact  | biggest                 | medium                     | smallest
               |                         |                            |
pool module    | multiprocessing.Pool    | multiprocessing.dummy.Pool | asyncio.BaseEventLoop
solo module    | multiprocessing.Process | threading.Thread           | ---

*  biggest - if spawn (fork+exec) and always on Windows
   medium  - if fork alone
** due to context switching
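The "switching at user-defined points" row is the subtlest one; a tiny sketch (names invented) shows that a coroutine can only be suspended at an explicit await:

```python
import asyncio

order = []

async def step(name):
    order.append(name + " start")
    # The only place this coroutine can be suspended is at an await:
    await asyncio.sleep(0)
    order.append(name + " end")

async def main():
    # Both coroutines run concurrently in a single thread.
    await asyncio.gather(step("a"), step("b"))

asyncio.run(main())
# order is now ['a start', 'b start', 'a end', 'b end']
```

Between awaits, each coroutine runs without interruption, which is why coroutines need no locks for shared state, unlike threads, which can be preempted after any bytecode.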

On 26.07.2015 14:18, Paul Moore wrote:
Just as a note - even given the various provisos and "it's not that
simple" comments that have been made, I found this table extremely
useful. Like any such high-level summary, I expect to have to take it
with a pinch of salt, but I don't see that as an issue - anyone who
doesn't fully appreciate that there are subtleties, probably wouldn't
read a longer explanation anyway.

So many thanks for taking the time to put this together (and for
continuing to improve it).
You are welcome. :)
+1 on something like this ending up in the Python docs somewhere.
Not sure what the process for this is, but I think the Python gurus will find a way.

Python-ideas mailing list
Code of Conduct: http://python.org/psf/codeofconduct/