2012/10/29 Guido van Rossum
On Mon, Oct 29, 2012 at 11:08 AM, Giampaolo Rodolà
wrote: 2012/10/29 Guido van Rossum
=== About polling.py === 1 - I think DelayedCall should have a reset() method, other than just cancel().
So, essentially an uncancel()? Why not just re-register in that case? Or what's your use case? (Right now there's no problem in calling one of these many times -- it's just that cancellation is permanent.)
The most common use case is when you want to disconnect the other peer after a certain time of inactivity. Ideally what you would do is schedule() a idle/timeout function and reset() it every time the other peer sends you some data.
2 - EventLoopMixin should have a call_every() method other than just call_later()
Arguably you can emulate that with a simple loop:
def call_every(secs, func, *args): while True: yield from scheduler.sleep(secs) func(*args)
(Flavor to taste to log exceptions, handle cancellation, automatically spawn a separate task, etc.)
I can build lots of other useful things out of call_soon() and call_later() -- but I do need at least those two as "axioms".
Agreed.
3 - call_later() and call_every() should also take **kwargs other than just *args
I just replied to that in a previous message; there's also a comment in the code. How important is this really? Are there lots of use cases that require you to pass keyword args? If it's only on occasion you can use a lambda. (The *args is a compromise so we don't need a lambda to wrap every callback. But I want to reserve keyword args for future extensions to the registration functions.)
It's not crucial to have kwargs, just nice, but I understand your motives to rule them out, in fact I reserved two kwarg names ('_errback' and '_scheduler') for the same reason. In my experience I learned that passing an extra error handler function (what Twisted calls 'errrback') can be desirable, so that's another thing you might want to consider. In my scheduler implementation I achieved that by passing an _errback keyword parameter, like this:
ioloop.call_later(30, callback, _errback=err_callback)
Not very nice to use a reserved keyword, I agree. Perhaps you can keep ruling out kwargs referred to the callback function and change the current call_later signature as such: - def call_later(self, when, callback, *args): + def call_later(self, when, callback, *args, errback=None): ...or maybe provide a DelayedCall.add_errback() method a-la Twisted.
Thanks, I had seen it previously, I think this also proves that there's nothing particularly earth-shattering about this design. :-) I'd love to copy some more of your tricks,
Sure, go on. It's MIT licensed code.
e.g. the occasional re-heapifying. (What usage pattern is this dealing with exactly?)
It's intended to avoid making the list grow with too many cancelled functions. Imagine this use case: WEEK = 60 x 60 x 24 x 7 for x in xrange(1000000): f = call_later(WEEK, fun) f.cancel() You'll end up having a heap with milions of cancelled items which will be freed after a week. Instead you can keep track of the number of cancelled functions every time cancel() is called and re-heapify the list when that number gets too high: http://code.google.com/p/pyftpdlib/source/browse/trunk/pyftpdlib/lib/ioloop.py?spec=svn1115&r=1115#122
should also check that I've taken care of all the various flags and other details (I recall being quite surprised that with poll(), on some platforms I need to check for POLLHUP but not on others).
Yeah, that's a painful part. Try to look here: http://code.google.com/p/pyftpdlib/source/browse/trunk/pyftpdlib/lib/ioloop.py?spec=svn1115&r=1115#464 Instead of handle_close()ing you should add the fd to the list of readable ones ("r"). The call to recv() which will be coming next will then cause the socket to close (you have to add the error handling to recv() first though).
3 - I don't fully understand how data transfer works exactly but keep in mind that the transport should interact with the pollster. What I mean is that generally speaking a connected socket should *always* be readable ("r"), even when it's idle, then switch to "rw" events when sending data, then get back to "r" when all the data has been sent. This is *crucial* if you want to achieve high performances/scalability and that is why PollsterBase should probably provide a modify() method. Please take a look at what I've done here: http://code.google.com/p/pyftpdlib/source/browse/trunk/pyftpdlib/lib/ioloop.py?spec=svn1115&r=1115#809
Hm. I am not convinced that managing this explicitly from the transport is the right solution (note that my transports are quite different from those in Twisted). But I'll keep this in mind -- I would like to set up a benchmark suite at some point. I will probably have to implement the server side of HTTP for that purpose, so I can point e.g. ab at my app.
I think you might want to apply that to something slighlty higher level than the mere transport. Something like the equivalent of asynchat.push / asynchat.push_with_producer, if you'll ever want to go that far in terms of abstraction, or maybe avoid that at all but make it clear in the doc that the user should take care of that. My point is that having a socket registered for both "r" AND "w" events when in fact you want only "r" OR "w" is an exponential waste of CPU cycles and it should be avoided either by the lib or by the user. "old select() implementation" vs "new select() implementation" benchmark shown here reflects exactly this problem which still affects base asyncore module: https://code.google.com/p/pyftpdlib/issues/detail?id=203#c6 I'll keep following the progress on this and hopefully come up with another set of questions and/or random thoughts. --- Giampaolo https://code.google.com/p/pyftpdlib/ https://code.google.com/p/psutil/ https://code.google.com/p/pysendfile/