[Python-ideas] PEP 3156 feedback
Antoine Pitrou
solipsis at pitrou.net
Tue Dec 18 11:01:36 CET 2012
Hello,
Here is my own feedback on the in-progress PEP 3156. Please discard it
if it's too early to give feedback :-))
Event loop API
--------------
I would like to say that I prefer Tornado's model: for each primitive
provided by Tornado, you can pass an explicit IOLoop instance which you
instantiated manually.
There is no module function or policy object hiding this mechanism:
it's simple, explicit and flexible (in other words: if you want a
per-thread event loop, just do it yourself using TLS :-)).
There are some requirements I've found useful:
- being able to instantiate multiple loops, either at the same time or
serially (this is especially nice for unit tests; Twisted has to use
a dedicated test runner just because their reactor doesn't support
multiple instances or restarts)
- being able to stop a loop explicitly: having to unregister all
handlers or delayed calls is a PITA in non-trivial situations (for
example you might have multiple protocol instances, each with a bunch
of timers, some perhaps even in third-party libraries; keeping track
of all this is the event loop's job)
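To make the two requirements above concrete, here is a minimal sketch of
multiple loop instances plus an explicit stop(). The EventLoop class and
its method names are purely illustrative, not the PEP's actual API:

```python
# Illustrative sketch only -- not the PEP's API.

class EventLoop:
    def __init__(self):
        self._running = False
        self._callbacks = []

    def call_soon(self, func, *args):
        self._callbacks.append((func, args))

    def run(self):
        self._running = True
        while self._running and self._callbacks:
            func, args = self._callbacks.pop(0)
            func(*args)

    def stop(self):
        # Explicit stop: no need to hunt down and unregister every
        # handler or delayed call first.
        self._running = False

results = []
loop = EventLoop()
loop.call_soon(results.append, "hello")
loop.call_soon(loop.stop)
loop.call_soon(results.append, "never runs")
loop.run()   # stops at stop(), leaving the last callback unrun

# A second, independent instance -- handy for unit tests.
loop2 = EventLoop()
loop2.call_soon(results.append, "second loop")
loop2.run()
```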
* The optional sock_*() methods: how about having different ABCs, e.g.
the EventLoop ABC for basic behaviour, and the NetworkedEventLoop ABC
adding the socket helpers?
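A sketch of that ABC split; the method sets below are my guess at a
plausible partition, not the PEP's settled signatures:

```python
# Illustrative ABC split -- method sets are guesses, not the PEP's.
from abc import ABCMeta, abstractmethod

class EventLoop(metaclass=ABCMeta):
    """Basic behaviour: callbacks and run/stop."""
    @abstractmethod
    def call_soon(self, callback, *args): ...
    @abstractmethod
    def run(self): ...
    @abstractmethod
    def stop(self): ...

class NetworkedEventLoop(EventLoop):
    """Adds the optional sock_*() socket helpers."""
    @abstractmethod
    def sock_recv(self, sock, nbytes): ...
    @abstractmethod
    def sock_sendall(self, sock, data): ...
    @abstractmethod
    def sock_connect(self, sock, address): ...
```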
Protocols and transports
------------------------
We probably want to provide a Protocol base class and encourage people
to inherit from it. It can provide useful functionality (perhaps
write() and writelines() shims? they would also make mocking easier).
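A rough sketch of such a base class, with write() and writelines() shims
forwarding to the transport; the attribute and method names are my own
guesses, not the PEP's:

```python
# Illustrative sketch -- attribute/method names are guesses.

class Protocol:
    def __init__(self, transport):
        self.transport = transport

    # Shims: protocols become easy to mock by swapping .transport
    # for a fake object in tests.
    def write(self, data):
        self.transport.write(data)

    def writelines(self, iterable):
        for data in iterable:
            self.transport.write(data)

    # Event methods for subclasses to override:
    def data_received(self, data):
        pass

    def connection_lost(self, exc):
        pass

# Mocking example with a trivial fake transport:
class FakeTransport:
    def __init__(self):
        self.sent = []
    def write(self, data):
        self.sent.append(data)

fake = FakeTransport()
proto = Protocol(fake)
proto.writelines([b"a", b"b"])
```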
My own opinion about Twisted's API is that the Factory class is often
useless, and adds a cognitive burden. If you need a place to track all
protocols of a given kind (e.g. all connections), you can do it
yourself. Also, the Factory implies that you don't control how exactly
your protocol gets instantiated (unless you override some Factory
method whose name escapes me; it is cumbersome).
So, when creating a client, I would pass it a protocol instance.
When creating a server, I would pass it a protocol class. Here the base
Protocol class comes into play: its __init__() could take the transport
as an argument and set the "transport" attribute with it. Further args
could be optionally passed to the constructor:
    class MyProtocol(Protocol):
        def __init__(self, transport, my_personal_attribute):
            Protocol.__init__(self, transport)
            self.my_personal_attribute = my_personal_attribute
        ...

    def listen(ioloop):
        # Each new connection will instantiate a MyProtocol with "foobar"
        # for my_personal_attribute.
        ioloop.listen_tcp(("0.0.0.0", 8080), MyProtocol, "foobar")
(The hypothetical listen_tcp() is just a name: perhaps it's actually
start_serving(). It should accept any callable, not just a class:
therefore, you can define complex behaviour if you like)
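To show what "any callable" buys you in practice, here is a sketch where
the construction logic lives in a functools.partial instead of a class.
fake_listen_tcp() below merely imitates the calling convention of the
hypothetical listen_tcp()/start_serving():

```python
# Illustrative sketch -- fake_listen_tcp() stands in for the loop.
import functools

class MyProtocol:
    def __init__(self, transport, my_personal_attribute):
        self.transport = transport
        self.my_personal_attribute = my_personal_attribute

def fake_listen_tcp(address, protocol_factory, *args):
    # A real loop would do this once per accepted connection.
    transport = object()  # stand-in for a real transport
    return protocol_factory(transport, *args)

# Extra positional args, as in the listen() example above:
proto = fake_listen_tcp(("0.0.0.0", 8080), MyProtocol, "foobar")

# The same thing with a partial -- no extra-args convention needed:
factory = functools.partial(MyProtocol, my_personal_attribute="foobar")
proto2 = fake_listen_tcp(("0.0.0.0", 8080), factory)
```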
I think the transport / protocol registration must be done early, not in
connection_made(). Sometimes you will want to do things on a protocol
before you know a connection is established, for example queue things
to write on the transport. A use case is a reconnecting TCP client:
the protocol will continue existing at times when the connection is
down.
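A sketch of that reconnecting-client argument: if the protocol gets its
transport early, writes issued while the connection is down can simply
be buffered and flushed once the connection (re)appears. Everything
here is illustrative:

```python
# Illustrative sketch of buffering writes across reconnects.

class QueuingTransport:
    def __init__(self):
        self.connected = False
        self._pending = []
        self.sent = []

    def write(self, data):
        if self.connected:
            self.sent.append(data)
        else:
            # Connection currently down: queue for later.
            self._pending.append(data)

    def connection_established(self):
        # Flush everything queued while we were disconnected.
        self.connected = True
        for data in self._pending:
            self.sent.append(data)
        del self._pending[:]

transport = QueuingTransport()
transport.write(b"queued while down")
transport.connection_established()
transport.write(b"sent directly")
```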
Unconnected protocols need their own base class and API:
data_received()'s signature should be (data, remote_addr) or
(remote_addr, data). Same for write().
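A minimal sketch of such a separate base class, using the
(data, remote_addr) ordering; the names are illustrative only:

```python
# Illustrative sketch of an unconnected-protocol base class.

class DatagramProtocol:
    def __init__(self, transport):
        self.transport = transport

    def data_received(self, data, remote_addr):
        pass  # subclasses override

    def write(self, data, remote_addr):
        # Each write targets an explicit peer address.
        self.transport.write(data, remote_addr)

class FakeDatagramTransport:
    def __init__(self):
        self.sent = []
    def write(self, data, remote_addr):
        self.sent.append((data, remote_addr))

dt = FakeDatagramTransport()
dp = DatagramProtocol(dt)
dp.write(b"ping", ("127.0.0.1", 9999))
```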
* writelines() sounds ambiguous for datagram protocols: does it send
those "lines" as a single datagram, or one separate datagram per
"line"? The equivalent code suggests the latter, but which one makes
more sense?
* connection_lost(): you definitely want to know whether it's you or the
other end who closed the connection. Typically, if the other end
closed the connection, you will have to run some cleanup steps, and
perhaps even log an error somewhere (if the connection was closed
unexpectedly).
Actually, I'm not sure it's useful to call connection_lost() when you
closed the connection yourself: are there any use cases?
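One hedged way to convey who closed the connection: pass None into
connection_lost() when we closed it ourselves, and an exception
instance when the other end dropped it. This is my suggestion, not the
PEP's settled signature:

```python
# Illustrative sketch -- not the PEP's settled signature.

class LoggingProtocol:
    def __init__(self):
        self.log = []

    def connection_lost(self, exc):
        if exc is None:
            # We closed it ourselves: nothing special to do.
            self.log.append("closed locally")
        else:
            # Peer closed or the connection errored: cleanup / log.
            self.log.append("lost: %s" % exc)

p = LoggingProtocol()
p.connection_lost(None)
p.connection_lost(ConnectionResetError("peer reset"))
```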
Regards
Antoine.