[Python-ideas] Async API: some code to review
Guido van Rossum
guido at python.org
Mon Oct 29 19:43:57 CET 2012
On Mon, Oct 29, 2012 at 11:08 AM, Giampaolo Rodolà <g.rodola at gmail.com> wrote:
> 2012/10/29 Guido van Rossum <guido at python.org>
>>
>> I'm most interested in feedback on the design of polling.py and
>> scheduling.py, and to a lesser extent on the design of sockets.py;
>> main.py is just an example of how this style works out in practice.
>
> Follows my comments.
>
> === About polling.py ===
>
> 1 - I think DelayedCall should have a reset() method, other than just cancel().
So, essentially an uncancel()? Why not just re-register in that case?
Or what's your use case? (Right now there's no problem in calling one
of these many times -- it's just that cancellation is permanent.)
> 2 - EventLoopMixin should have a call_every() method other than just
> call_later()
Arguably you can emulate that with a simple loop:
def call_every(secs, func, *args):
while True:
yield from scheduler.sleep(secs)
func(*args)
(Flavor to taste to log exceptions, handle cancellation, automatically
spawn a separate task, etc.)
I can build lots of other useful things out of call_soon() and
call_later() -- but I do need at least those two as "axioms".
> 3 - call_later() and call_every() should also take **kwargs other than
> just *args
I just replied to that in a previous message; there's also a comment
in the code. How important is this really? Are there lots of use cases
that require you to pass keyword args? If it's only on occasion you
can use a lambda. (The *args is a compromise so we don't need a lambda
to wrap every callback. But I want to reserve keyword args for future
extensions to the registration functions.)
> 4 - I think PollsterBase should provide a method to modify() the
> events registered for a certain fd (both poll() and epoll() have such
> a method and it's faster compared to un/registering a fd).
Did you see the concrete implementations? Those where this matters
implicitly uses modify() if the required flags change. I can imagine
more optimizations of the implementations (e.g. delaying
register()/modify() calls until poll() is actually called, to avoid
unnecessary churn) without making the API more complex.
> Feel free to take a look at my scheduler implementation which looks
> quite similar to what you've done in polling.py:
> http://code.google.com/p/pyftpdlib/source/browse/trunk/pyftpdlib/lib/ioloop.py?spec=svn1115&r=1115#85
Thanks, I had seen it previously, I think this also proves that
there's nothing particularly earth-shattering about this design. :-)
I'd love to copy some more of your tricks, e.g. the occasional
re-heapifying. (What usage pattern is this dealing with exactly?) I
should also check that I've taken care of all the various flags and
other details (I recall being quite surprised that with poll(), on
some platforms I need to check for POLLHUP but not on others).
> === About sockets.py ===
>
> 1 - In SocketTransport it seems there's no error handling provisioned
> for send() and recv().
> You should expect these errors
> http://hg.python.org/cpython/file/95931c48a76f/Lib/asyncore.py#l60
> signaling disconnection plus EWOULDBLOCK and EAGAIN for "retry"
Right, I know have been naive about these and have already got a TODO note.
> 2 - SslTransport's send() and recv() methods should suffer the same problem.
Ditto, Antoine told me.
> 3 - I don't fully understand how data transfer works exactly but keep
> in mind that the transport should interact with the pollster.
> What I mean is that generally speaking a connected socket should
> *always* be readable ("r"), even when it's idle, then switch to "rw"
> events when sending data, then get back to "r" when all the data has
> been sent.
> This is *crucial* if you want to achieve high performances/scalability
> and that is why PollsterBase should probably provide a modify()
> method.
> Please take a look at what I've done here:
> http://code.google.com/p/pyftpdlib/source/browse/trunk/pyftpdlib/lib/ioloop.py?spec=svn1115&r=1115#809
Hm. I am not convinced that managing this explicitly from the
transport is the right solution (note that my transports are quite
different from those in Twisted). But I'll keep this in mind -- I
would like to set up a benchmark suite at some point. I will probably
have to implement the server side of HTTP for that purpose, so I can
point e.g. ab at my app.
> === Other considerations ===
>
> This 'yield' / 'yield from' approach is new to me (I'm more of a
> "callback guy") so I can't say I fully understand what's going on just
> by reading the code.
Fair enough. You should probably start by reading Greg Ewing's
tutorial -- it's short and sweet:
http://www.cosc.canterbury.ac.nz/greg.ewing/python/tasks/SimpleScheduler.html
> What I would like to see instead of main.py is a bunch of code samples
> / demos showing how this library is supposed to be used in different
> circumstances.
Agreed, more examples are needed.
> In details I'd like to see at least:
>
> 1 - a client example (connect(), send() a string, recv() a response, close())
Hm, that's all in urlfetch().
> 2 - an echo server example (accept(), recv() string, send() it back(), close()
Yes, that's missing.
> 3 - how to use a different transport (e.g. UDP)?
I haven't looked into this yet. I expect I'll have to write a
different SocketTransport for this (the existing transports are
implicitly stream-oriented) but I know that the scheduler and
eventloop implementation can handle this fine.
> 4 - how to run long running tasks in a thread?
That's implemented. Check out call_in_thread(). Note that you can pass
it an alternate threadpool (executor).
> Also:
>
> 5 - is it possible to use multiple "reactors" in different threads?
Should be possible.
> How? (asyncore for example achieves this by providing a separate
> 'map' argument for both the 'reactor' and the dispatchers)
It works by making the Context class use thread-local storage (TLS).
> I understand you just started with this so I'm probably asking too
> much at this point in time.
> Feel free to consider this a kind of a "long term review".
You have asked many useful questions already. Since you have
implemented a real-world I/O loop yourself, your input is extremely
valuable. Thanks, and keep at it!
--
--Guido van Rossum (python.org/~guido)
More information about the Python-ideas
mailing list