[Python-ideas] PEP 3156 feedback

Wed Dec 19 17:55:02 CET 2012

On Wed, Dec 19, 2012 at 6:51 AM, Giampaolo Rodolà <g.rodola at gmail.com> wrote:
> 2012/12/18 Guido van Rossum <guido at python.org>
>>
>> On Tue, Dec 18, 2012 at 2:01 AM, Antoine Pitrou <solipsis at pitrou.net> wrote:
>> > Event loop API
>> > --------------
>> >
>> > I would like to say that I prefer Tornado's model: for each primitive
>> > provided by Tornado, you can pass an explicit Loop instance which you
>> > instantiated manually.
>> > There is no module function or policy object hiding this mechanism:
>> > it's simple, explicit and flexible (in other words: if you want a
>> > per-thread event loop, just do it yourself using TLS :-)).
>>
>> It sounds though as if the explicit loop is optional, and still
>> defaults to some global default loop?
>>
>> Having one global loop shared by multiple threads is iffy though. Only
>> one thread should be *running* the loop, otherwise the loop can't be
>> used as a mutual exclusion device. Worse, all primitives for adding
>> and removing callbacks/handlers must be made threadsafe, and then
>> basically the entire event loop becomes full of locks, which seems
>> wrong to me.
>
> The basic idea is to have multiple threads/processes, each running its
> own IO loop.

I understand that, and the Tulip implementation supports this. However,
different frameworks may have different policies (e.g. AFAIK Twisted
only supports one reactor, period, and it is not threadsafe). I don't
want to put requirements in the PEP that *require* compliant
implementations to support the loop-per-thread model. OTOH I do want
compliant implementations to decide on their own policy. I guess the
minimal requirement for a compliant implementation is that callbacks
associated with the same loop are serialized and never executed
concurrently on different threads.
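
To make the loop-per-thread model and the serialization requirement
concrete, here is a minimal sketch. It assumes today's asyncio module
(the stdlib descendant of Tulip) rather than the API as drafted at the
time of this thread; run_loop_in_thread() and main() are hypothetical
names used only for illustration.

```python
import asyncio
import threading


def run_loop_in_thread(results, name):
    # Each thread creates and runs its own loop; no state is shared
    # between loops, so no locks are needed around callback bookkeeping.
    loop = asyncio.new_event_loop()
    seen = []

    def callback(i):
        # Record which thread executed this callback.
        seen.append(threading.get_ident())
        if i == 2:
            loop.stop()

    for i in range(3):
        loop.call_soon(callback, i)
    try:
        loop.run_forever()
    finally:
        loop.close()
    results[name] = seen


def main():
    results = {}
    threads = [
        threading.Thread(target=run_loop_in_thread, args=(results, name))
        for name in ("a", "b")
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Within each loop, every callback runs on the one thread that is
running that loop, which is the serialization property described
above.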

> No locks are required because each IO poller instance will deal with
> its own socket-map / callbacks-queue and no resources are shared.
> In asyncore this was achieved by introducing the "map" parameter.
> Similarly to Tornado, pyftpdlib uses an "ioloop" parameter which can
> be passed to all the classes which will handle the connection (the
> handlers).

Read the description in the PEP of the event loop policy, or the
default implementation in Tulip. It discourages user code from
creating new event loops (since the framework may not support this)
but does not prevent e.g. unit tests from creating a new loop for each
test (even Twisted supports that).
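
As an illustration of the unit-test case, a fresh loop per test might
look like the sketch below. It assumes the modern asyncio API, not the
Tulip code of the time; the LoopPerTest class is hypothetical.

```python
import asyncio
import unittest


class LoopPerTest(unittest.TestCase):
    def setUp(self):
        # A new, private loop for every test; the global default is untouched.
        self.loop = asyncio.new_event_loop()

    def tearDown(self):
        self.loop.close()

    def test_call_soon(self):
        calls = []
        self.loop.call_soon(calls.append, "ran")
        self.loop.call_soon(self.loop.stop)
        self.loop.run_forever()
        self.assertEqual(calls, ["ran"])

    def test_coroutine(self):
        async def answer():
            await asyncio.sleep(0)
            return 42

        self.assertEqual(self.loop.run_until_complete(answer()), 42)
```

The loop is passed around explicitly inside the test, so nothing
leaks from one test into the next.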

> If "ioloop" is provided all the handlers will use that (...and
> register() against it, add_reader() etc..) otherwise the "global"
> ioloop instance will be used (default).
> A dynamic IO poller like this is important because in case the
> connection handlers are forced to block for some reason, you can
> switch from a concurrency model (async / non-blocking) to another
> (multi threads/process) very easily.

Did you see run_in_executor() and wrap_future() in the PEP or in the
Tulip implementation? They make it perfectly simple to run something
in another thread (and the default implementation will use this to
call getaddrinfo(), since the stdlib wrappers for it have no async
version). The two APIs are even capable of using a ProcessPoolExecutor.
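
A rough sketch of how the two calls fit together, written against
today's asyncio rather than the Tulip code under discussion;
blocking_lookup() and main() are hypothetical helper names.

```python
import asyncio
import concurrent.futures
import socket


def blocking_lookup(host, port):
    # A blocking stdlib call with no async version of its own.
    return socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)


async def main():
    loop = asyncio.get_running_loop()

    # Passing None selects the loop's default executor (a thread pool).
    infos = await loop.run_in_executor(None, blocking_lookup, "127.0.0.1", 80)

    # wrap_future() adapts a concurrent.futures.Future so it can be awaited.
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        cf = pool.submit(sum, [1, 2, 3])
        total = await asyncio.wrap_future(cf)

    return infos, total
```

Run it with asyncio.run(main()). Passing a ProcessPoolExecutor as the
first argument to run_in_executor() works the same way, provided the
function and its arguments can be pickled.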

> See:
> http://code.google.com/p/pyftpdlib/issues/detail?id=212#c9
> http://code.google.com/p/pyftpdlib/source/browse/trunk/pyftpdlib/servers.py?spec=svn1137&r=1137

Of course, if all you want is a server that creates a new thread or
process for each connection, PEP 3156 and Tulip are overkill -- in
that case there's no reason not to use the stdlib's SocketServer
class, which has supported this for over a decade. :-)
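
For comparison, a thread-per-connection echo server needs only a few
lines. This sketch assumes the Python 3 spelling of the module,
socketserver; EchoHandler and start_server() are hypothetical names.

```python
import socket
import socketserver
import threading


class EchoHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # Runs in its own thread, so blocking reads and writes are fine.
        for line in self.rfile:
            self.wfile.write(line)


def start_server(host="127.0.0.1", port=0):
    # Port 0 asks the OS for any free port; see server.server_address.
    server = socketserver.ThreadingTCPServer((host, port), EchoHandler)
    server.daemon_threads = True
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

Swapping ThreadingTCPServer for ForkingTCPServer gives a process per
connection on platforms that support fork().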

-- 
--Guido van Rossum (python.org/~guido)