On Mon, Oct 29, 2012 at 8:43 PM, Guido van Rossum <guido@python.org> wrote:
On Mon, Oct 29, 2012 at 11:08 AM, Giampaolo RodolĂ <g.rodola@gmail.com> wrote:
2012/10/29 Guido van Rossum <guido@python.org>
I'm most interested in feedback on the design of polling.py and scheduling.py, and to a lesser extent on the design of sockets.py; main.py is just an example of how this style works out in practice.
Follows my comments.
=== About polling.py ===
1 - I think DelayedCall should have a reset() method, other than just cancel().
So, essentially an uncancel()? Why not just re-register in that case? Or what's your use case? (Right now there's no problem in calling one of these many times -- it's just that cancellation is permanent.)
2 - EventLoopMixin should have a call_every() method other than just call_later()
Arguably you can emulate that with a simple loop:
def call_every(secs, func, *args): while True: yield from scheduler.sleep(secs) func(*args)
(Flavor to taste to log exceptions, handle cancellation, automatically spawn a separate task, etc.)
I can build lots of other useful things out of call_soon() and call_later() -- but I do need at least those two as "axioms".
3 - call_later() and call_every() should also take **kwargs other than just *args
I just replied to that in a previous message; there's also a comment in the code. How important is this really? Are there lots of use cases that require you to pass keyword args? If it's only on occasion you can use a lambda. (The *args is a compromise so we don't need a lambda to wrap every callback. But I want to reserve keyword args for future extensions to the registration functions.)
Well, using keyword-only arguments for passing flags can be good point. I can live with *args only. Maybe using **kwargs for call_later family only is good compromise? Really I don't care on add_reader/add_writer, that functions intended to library writers. call_later and call_soon can be used in user code often enough and passing keyword arguments can be convenient.
4 - I think PollsterBase should provide a method to modify() the events registered for a certain fd (both poll() and epoll() have such a method and it's faster compared to un/registering a fd).
Did you see the concrete implementations? Those where this matters implicitly uses modify() if the required flags change. I can imagine more optimizations of the implementations (e.g. delaying register()/modify() calls until poll() is actually called, to avoid unnecessary churn) without making the API more complex.
Feel free to take a look at my scheduler implementation which looks quite similar to what you've done in polling.py: http://code.google.com/p/pyftpdlib/source/browse/trunk/pyftpdlib/lib/ioloop.py?spec=svn1115&r=1115#85
Thanks, I had seen it previously, I think this also proves that there's nothing particularly earth-shattering about this design. :-) I'd love to copy some more of your tricks, e.g. the occasional re-heapifying. (What usage pattern is this dealing with exactly?) I should also check that I've taken care of all the various flags and other details (I recall being quite surprised that with poll(), on some platforms I need to check for POLLHUP but not on others).
=== About sockets.py ===
1 - In SocketTransport it seems there's no error handling provisioned for send() and recv(). You should expect these errors http://hg.python.org/cpython/file/95931c48a76f/Lib/asyncore.py#l60 signaling disconnection plus EWOULDBLOCK and EAGAIN for "retry"
Right, I know have been naive about these and have already got a TODO note.
2 - SslTransport's send() and recv() methods should suffer the same problem.
Ditto, Antoine told me.
3 - I don't fully understand how data transfer works exactly but keep in mind that the transport should interact with the pollster. What I mean is that generally speaking a connected socket should *always* be readable ("r"), even when it's idle, then switch to "rw" events when sending data, then get back to "r" when all the data has been sent. This is *crucial* if you want to achieve high performances/scalability and that is why PollsterBase should probably provide a modify() method. Please take a look at what I've done here: http://code.google.com/p/pyftpdlib/source/browse/trunk/pyftpdlib/lib/ioloop.py?spec=svn1115&r=1115#809
Hm. I am not convinced that managing this explicitly from the transport is the right solution (note that my transports are quite different from those in Twisted). But I'll keep this in mind -- I would like to set up a benchmark suite at some point. I will probably have to implement the server side of HTTP for that purpose, so I can point e.g. ab at my app.
=== Other considerations ===
This 'yield' / 'yield from' approach is new to me (I'm more of a "callback guy") so I can't say I fully understand what's going on just by reading the code.
Fair enough. You should probably start by reading Greg Ewing's tutorial -- it's short and sweet: http://www.cosc.canterbury.ac.nz/greg.ewing/python/tasks/SimpleScheduler.htm...
What I would like to see instead of main.py is a bunch of code samples / demos showing how this library is supposed to be used in different circumstances.
Agreed, more examples are needed.
In details I'd like to see at least:
1 - a client example (connect(), send() a string, recv() a response, close())
Hm, that's all in urlfetch().
2 - an echo server example (accept(), recv() string, send() it back(), close()
Yes, that's missing.
3 - how to use a different transport (e.g. UDP)?
I haven't looked into this yet. I expect I'll have to write a different SocketTransport for this (the existing transports are implicitly stream-oriented) but I know that the scheduler and eventloop implementation can handle this fine.
4 - how to run long running tasks in a thread?
That's implemented. Check out call_in_thread(). Note that you can pass it an alternate threadpool (executor).
Also:
5 - is it possible to use multiple "reactors" in different threads?
Should be possible.
How? (asyncore for example achieves this by providing a separate 'map' argument for both the 'reactor' and the dispatchers)
It works by making the Context class use thread-local storage (TLS).
I understand you just started with this so I'm probably asking too much at this point in time. Feel free to consider this a kind of a "long term review".
You have asked many useful questions already. Since you have implemented a real-world I/O loop yourself, your input is extremely valuable. Thanks, and keep at it!
-- --Guido van Rossum (python.org/~guido) _______________________________________________ Python-ideas mailing list Python-ideas@python.org http://mail.python.org/mailman/listinfo/python-ideas
-- Thanks, Andrew Svetlov