[Python-ideas] [Python-Dev] PyParallel: alternate async I/O and GIL removal

Andrew Barnert abarnert at yahoo.com
Sun Nov 17 07:31:37 CET 2013


From: Guido van Rossum <guido at python.org>
Sent: Saturday, November 16, 2013 6:56 PM


>Summarizing my understanding of what you're saying, it seems the "right" way to use IOCP on a multi-core machine is to have one thread per core (barring threads you need for unavoidably blocking stuff) and to let the kernel schedule callbacks on all those threads. As long as the callbacks don't block and events come in at a rate to keep all those cores busy this will be optimal.
>
>But this is almost tautological. It only works if the threads don't communicate with each other or with the main thread (all shared data must be read-only). But heh, if that's all, one process per core works just as well. :-)
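For readers who haven't used IOCP, here is a minimal Python sketch of the thread-per-core completion-port model Guido summarizes above. All names are illustrative, and a `queue.Queue` stands in for `GetQueuedCompletionStatus` (the real API is Windows-only and kernel-driven):

```python
import os
import queue
import threading

# Stand-in for an I/O completion port: the kernel would normally post
# completed-I/O packets here; we simulate that with a thread-safe queue.
completion_port = queue.Queue()
results = []
results_lock = threading.Lock()

def worker():
    # Each worker blocks on the "port"; the kernel (here, the queue)
    # decides which thread wakes up for each completion packet.
    while True:
        packet = completion_port.get()
        if packet is None:  # sentinel: shut down
            return
        callback, data = packet
        with results_lock:
            results.append(callback(data))

# One thread per core, per the IOCP guidance quoted above.
ncores = os.cpu_count() or 1
threads = [threading.Thread(target=worker) for _ in range(ncores)]
for t in threads:
    t.start()

# Simulate ten completed reads being posted to the port.
for i in range(10):
    completion_port.put((lambda d: d * 2, i))

# Post one sentinel per worker, then wait for them to drain the queue.
for _ in threads:
    completion_port.put(None)
for t in threads:
    t.join()
```

Of course, in stock CPython the GIL serializes those callbacks, which is exactly the limitation this whole thread is about.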

I got the same impression from the presentation.

First, I completely agree that the way most Unix-style servers run on Windows is silly, even in the single-threaded case: simulating epoll on top of single-threaded, completion-based GetQueuedCompletionStatus (GQCS), just so you can then simulate a completion-based design on top of your simulated readiness-based epoll, is wasteful and overly complex. But that's a much more minor issue than failing to take advantage of Windows' integration between threading and async I/O, and it's one that many server frameworks have already fixed, and that PyParallel isn't necessary for.

I also agree that using IOCP for a multi-threaded proactor instead of a single-threaded reactor plus dispatcher is a huge win in the kinds of shared-memory threaded apps that you can't write in CPython. From my experience building a streaming video server and an IRC-esque interactive communications server, using a reactor plus dispatcher on Windows means one core completely wasted, 40% less performance from the others, and much lower scalability; emulating a proactor on Unix on top of a reactor and dispatcher is around a 10% performance cost (plus a bit of extra code complexity). So, a threaded proactor wins, unless you really don't care about Windows.
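To make the reactor-versus-proactor distinction concrete, here is a small sketch (illustrative names, built on the stdlib `selectors` module) of what emulating a proactor on top of a readiness-based reactor looks like: you wait for readability, then perform the read yourself in user space and hand the result to a completion callback. That extra user-space hop is the modest overhead described above.

```python
import selectors
import socket

# A readiness-based reactor (select/epoll style) emulating a
# completion-based (proactor) recv.
sel = selectors.DefaultSelector()

def async_recv(sock, nbytes, callback):
    # Proactor-style API: "read up to nbytes, then call me back
    # with the data" -- implemented on top of readiness events.
    def on_readable(s):
        sel.unregister(s)
        callback(s.recv(nbytes))
    sel.register(sock, selectors.EVENT_READ, on_readable)

def run_once():
    # One reactor iteration: dispatch every ready callback.
    for key, _events in sel.select():
        key.data(key.fileobj)

# Demo with a local socket pair.
a, b = socket.socketpair()
received = []
async_recv(b, 1024, received.append)
a.sendall(b"hello")
run_once()
a.close()
b.close()
```

With real IOCP the kernel performs the read into your buffer before waking you, so the `recv` step (and the single dispatching thread) disappears entirely.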

But PyParallel doesn't look like it supports such applications any better than stock CPython does. As soon as you need to send data from one client to other clients, you're no longer in a shared-nothing parallel context. Even in less extreme cases than streaming video or chat, where all you need is, e.g., shared caching of dynamically generated data, I don't see how you'd do it in PyParallel.

If you can build a simple multi-user chat server with PyParallel, and show it using all my cores, that would be a lot more compelling.
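To make the shared-state objection concrete, here is the core of such a chat server in ordinary threaded Python (a sketch with illustrative names; client transports are reduced to plain lists): every message from one client mutates state that all the other connection handlers must see, which is precisely what a shared-nothing model rules out.

```python
import threading

# The shared, mutable client registry that every connection handler
# reads and writes -- the part that doesn't fit shared-nothing.
clients = {}
clients_lock = threading.Lock()

def join(name, transport):
    # Register a client; "transport" is whatever you write output to.
    with clients_lock:
        clients[name] = transport

def broadcast(sender, message):
    # One client's input becomes every other client's output.
    with clients_lock:
        for name, transport in clients.items():
            if name != sender:
                transport.append(f"{sender}: {message}")

join("alice", [])
join("bob", [])
join("carol", [])
broadcast("alice", "hi all")
```

In a process-per-core or shared-nothing design, `clients` would have to be split across workers and every broadcast would become cross-process message passing, which is the overhead the threaded proactor avoids.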


