[Python-ideas] Async API: some code to review

Guido van Rossum guido at python.org
Mon Oct 29 23:03:07 CET 2012

On Mon, Oct 29, 2012 at 2:20 PM, Giampaolo Rodolà <g.rodola at gmail.com> wrote:
> 2012/10/29 Guido van Rossum <guido at python.org>:
>> On Mon, Oct 29, 2012 at 11:08 AM, Giampaolo Rodolà <g.rodola at gmail.com> wrote:
>>> 2012/10/29 Guido van Rossum <guido at python.org>
>>> === About polling.py ===
>>> 1 - I think DelayedCall should have a reset() method, other than just cancel().
>> So, essentially an uncancel()? Why not just re-register in that case?
>> Or what's your use case? (Right now there's no problem in calling one
>> of these many times -- it's just that cancellation is permanent.)
> The most common use case is when you want to disconnect the other peer
> after a certain time of inactivity.
> Ideally what you would do is schedule() a idle/timeout function and
> reset() it every time the other peer sends you some data.

Um, ok, I think you are saying that you want to be able to set
timeouts and then "reset" that timeout. This is a much higher-level
thing than canceling the DelayedCall object. (I have no desire to make
DelayedCall have functionality like Twisted's Deferred. It is
something *much* simpler; it's just the API for cancelling a callback
passed to call_later(), and its other uses are similar to this.)

> Not very nice to use a reserved keyword, I agree.
> Perhaps you can keep ruling out kwargs referred to the callback
> function and change the current call_later signature as such:
> -    def call_later(self, when, callback, *args):
> +    def call_later(self, when, callback, *args, errback=None):
> ...or maybe provide a DelayedCall.add_errback() method a-la Twisted.

I really don't want that though! But I'm glad you're not too hell-bent
on supporting callbacks with keyword-only args.

>> should also check that I've taken care of all the various flags and
>> other details (I recall being quite surprised that with poll(), on
>> some platforms I need to check for POLLHUP but not on others).
> Yeah, that's a painful part.
> Try to look here:
> http://code.google.com/p/pyftpdlib/source/browse/trunk/pyftpdlib/lib/ioloop.py?spec=svn1115&r=1115#464
> Instead of handle_close()ing you should add the fd to the list of
> readable ones ("r").
> The call to recv() which will be coming next will then cause the
> socket to close (you have to add the error handling to recv() first
> though).

Aha, are you suggesting that I close the socket when I detect that the
socket is closed? But what if the other side uses shutdown() to close
only one end? Depending on the protocol it might be useful to either
stop reading but keep sending, or vice versa. Maybe I could detect
that both ends are closed and then close the socket. Or are you
suggesting something else?

>>> 3 - I don't fully understand how data transfer works exactly but keep
>>> in mind that the transport should interact with the pollster.
>>> What I mean is that generally speaking a connected socket should
>>> *always* be readable ("r"), even when it's idle, then switch to "rw"
>>> events when sending data, then get back to "r" when all the data has
>>> been sent.
>>> This is *crucial* if you want to achieve high performances/scalability
>>> and that is why PollsterBase should probably provide a modify()
>>> method.
>>> Please take a look at what I've done here:
>>> http://code.google.com/p/pyftpdlib/source/browse/trunk/pyftpdlib/lib/ioloop.py?spec=svn1115&r=1115#809
>> Hm. I am not convinced that managing this explicitly from the
>> transport is the right solution (note that my transports are quite
>> different from those in Twisted). But I'll keep this in mind -- I
>> would like to set up a benchmark suite at some point. I will probably
>> have to implement the server side of HTTP for that purpose, so I can
>> point e.g. ab at my app.
> I think you might want to apply that to something slighlty higher
> level than the mere transport.

(Apply *what*?)

> Something like the equivalent of asynchat.push /
> asynchat.push_with_producer, if you'll ever want to go that far in
> terms of abstraction, or maybe avoid that at all but make it clear in
> the doc that the user should take care of that.

I'm actually not sufficiently familiar with asynchat to comment. I
think it's got quite a different model than what I am trying to set up

> My point is that having a socket registered for both "r" AND "w"
> events when in fact you want only "r" OR "w" is an exponential waste
> of CPU cycles and it should be avoided either by the lib or by the
> user.

One task can only be blocked for reading OR writing. The only way to
have a socket registered for both is if there are separate tasks for
reading and writing, and then presumably that is what you want. (I
have a feeling you haven't fully grokked my HTTP client code yet?)

> "old select() implementation" vs "new select() implementation"
> benchmark shown here reflects exactly this problem which still affects
> base asyncore module:
> https://code.google.com/p/pyftpdlib/issues/detail?id=203#c6

Hm, I am already using epoll or kqueue if available, otherwise poll,
falling back to select only if there's nothing else available (in
practice that's only Windows).

But I will diligently work towards a benchmarkable demo.

> I'll keep following the progress on this and hopefully come up with
> another set of questions and/or random thoughts.


--Guido van Rossum (python.org/~guido)

More information about the Python-ideas mailing list