
Hi, below is some feedback on the EventLoop API as implemented in tulip. I am interested in this for an (alternate) dbus interface that I've written for Python that supports evented IO. I'm hoping tulip's EventLoop could be an abstraction as well as a default implementation, which would allow me to support just one event interface. I looked at it from two angles:

1. Does EventLoop provide everything that is needed from a library writer's point of view?
2. Can EventLoop efficiently expose a subset of the functionality of some of the main event loop implementations out there today? (I looked at libuv, libev and Qt.)

First some code pointers...

* https://github.com/geertj/looping - Here I've implemented the EventLoop interface for libuv, libev and Qt. It includes a slightly modified version of tulip's "polling.py" where I've implemented some of the suggestions below. It also adds support for Python 2.6/2.7, as the Python Qt interface (PySide) doesn't support Python 3 yet.
* https://github.com/geertj/python-dbusx - A Python interface for libdbus that supports evented IO using an EventLoop interface. This module also tests all the different loops from "looping" by running D-BUS tests with them (looping itself doesn't have tests yet).

My main points of feedback are below:

* It would be nice to have repeatable timers. Repeatable timers are expected, for example, by libdbus when integrating it with an event loop. Without repeatable timers, I could emulate one by using call_later() and adding a new timer every time the timer fires. That would be an inefficient interface, though, for event loops that natively support repeatable timers. This could possibly be done by adding a "repeat" argument to call_later().
* It would be nice to have a way to call a callback once per loop iteration. An example here is dispatching in libdbus. The easiest way to do this is to call dbus_connection_dispatch() every iteration of the loop (a more complicated way exists to get notifications when the dispatch status changes, but it is edge-triggered and difficult to get right). This could possibly be implemented by adding a "repeat" argument to call_soon().
* A useful semantic for run_once() would be to run the callbacks for readers and writers in the same iteration as when the FD got ready. This allows for the idiom below when expecting a single event to happen on a file descriptor from outside the event loop:

      # handle_read() sets the "ready" flag
      loop.add_reader(fd, handle_read)
      while not ready:
          loop.run_once()

  I use this idiom, for example, in a blocking method_call() method that calls into a D-BUS method. Currently, the handle_read() callback would be called in the iteration *after* the FD became readable. So this would not work unless some more IO becomes available. As far as I can see, libev, libuv and Qt all work like this.
* If remove_reader() / remove_writer() would accept the DelayedCall instance returned by their add_xxx() cousins, that would allow for multiple callbacks per FD. Not all event loops support this (libuv doesn't; libev and Qt do), but for the ones that do, their functionality could be exposed like this. For event loops that don't support it, an exception could be raised when adding multiple callbacks per FD. Support for multiple callbacks per FD could be advertised as a capability.
* After a DelayedCall is cancelled, it would also be very useful to have a second method to enable it again. Having that functionality is more efficient than creating a new event. For example, the D-BUS event loop integration API has specific methods for toggling events on and off that you need to provide.
* (Nitpick) Multiplexing absolute and relative timeouts for the "when" argument in call_later() is a little too smart in my view and can lead to bugs.
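The re-arming emulation described above can be made concrete with a minimal sketch. This is illustrative only, not tulip code: "FakeLoop" is a stand-in for any loop exposing tulip's call_later(delay, callback, *args) signature, and it ignores actual time for brevity.

```python
# Sketch: emulating a repeatable timer on top of one-shot call_later().
# FakeLoop is an illustrative stand-in for a tulip-style event loop.

class FakeLoop:
    """Toy loop: runs queued one-shot timers in FIFO order, ignoring time."""
    def __init__(self):
        self.pending = []

    def call_later(self, delay, callback, *args):
        self.pending.append((delay, callback, args))

    def run_once(self):
        delay, callback, args = self.pending.pop(0)
        callback(*args)

class RepeatingTimer:
    """Fires `callback` every `interval` seconds by re-adding itself."""
    def __init__(self, loop, interval, callback):
        self.loop = loop
        self.interval = interval
        self.callback = callback
        self.cancelled = False
        self.loop.call_later(self.interval, self._fire)

    def _fire(self):
        if self.cancelled:
            return
        self.callback()
        # The inefficiency: a brand-new one-shot timer per tick.
        self.loop.call_later(self.interval, self._fire)

    def cancel(self):
        self.cancelled = True
```

This works against any call_later()-style loop, but every tick allocates and heap-schedules a fresh timer, which is exactly the overhead a native "repeat" argument would avoid.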
With some input, I'd be happy to produce patches. Regards, Geert Jansen

On Mon, Dec 17, 2012 at 3:08 AM, Geert Jansen <geertj@gmail.com> wrote:
below is some feedback on the EventLoop API as implemented in tulip.
Great feedback! I hope you will focus on PEP 3156 (http://www.python.org/dev/peps/pep-3156/) and Tulip v2 next; Tulip v2 isn't written but is quickly taking shape in the 'tulip' subdirectory of the Tulip project.
Nice. The more interop this event loop offers the better. I don't know much about dbus, though, so occasionally my responses may not make any sense -- please be gentle and educate me when my ignorance gets in the way of understanding.
Cool. For me, right now, Python 2 compatibility is a distraction, but I am not against others adding it. I'll be happy to consider small tweaks to the PEP to make this easier. Exception: I'm not about to give up on 'yield from'; but that doesn't seem your focus anyway.
I'm actually glad to see there are so many event loop implementations around. This suggests to me that there's a real demand for this type of functionality, and I'd be real happy if PEP 3156 and Tulip came to improve the interop situation (especially for Python 3.3 and beyond).
I've not used repeatable timers myself but I see them in several other interfaces. I do think they deserve a different method call to set them up, even if the implementation will just be to add a repeat field to the DelayedCall. When I start a timer with a 2 second repeat, does it run now and then 2, 4, 6, ... seconds after, or should the first run be in 2 seconds? Or are these separate parameters? Strawman proposal: it runs in 2 seconds and then every 2 seconds. The API would be event_loop.call_repeatedly(interval, callback, *args), returning a DelayedCall with an interval attribute set to the interval value. (BTW, can someone *please* come up with a better name for DelayedCall? It's tedious and doesn't abbreviate well. But I don't want to name the class 'Callback' since I already use 'callback' for function objects that are used as callbacks.)
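The strawman above (first run after `interval` seconds, then every `interval` seconds) can be sketched with a toy loop. ToyLoop and this DelayedCall are illustrative only, not tulip's actual classes:

```python
import heapq
import time

# Toy sketch of the proposed call_repeatedly(interval, callback, *args):
# first run in `interval` seconds, then every `interval` seconds after.

class DelayedCall:
    def __init__(self, when, callback, args, interval=None):
        self.when = when
        self.callback = callback
        self.args = args
        self.interval = interval       # None means a one-shot call
        self.cancelled = False

    def cancel(self):
        self.cancelled = True

    def __lt__(self, other):           # needed for heapq ordering
        return self.when < other.when

class ToyLoop:
    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self._heap = []

    def call_later(self, delay, callback, *args):
        dc = DelayedCall(self.clock() + delay, callback, args)
        heapq.heappush(self._heap, dc)
        return dc

    def call_repeatedly(self, interval, callback, *args):
        dc = DelayedCall(self.clock() + interval, callback, args,
                         interval=interval)
        heapq.heappush(self._heap, dc)
        return dc

    def run_due(self):
        # Run every timer whose time has come; re-seed repeating ones.
        now = self.clock()
        while self._heap and self._heap[0].when <= now:
            dc = heapq.heappop(self._heap)
            if dc.cancelled:
                continue
            dc.callback(*dc.args)
            if dc.interval is not None:
                dc.when += dc.interval
                heapq.heappush(self._heap, dc)
```

Note the returned DelayedCall carries the proposed `interval` attribute, and a single cancel() suffices to stop all future repetitions.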
Again, I'd rather introduce a new method. What should the semantics be? Is this called just before or after we potentially go to sleep, or at some other point, or at the very top or bottom of run_once()?
* A useful semantic for run_once() would be to run the callbacks for readers and writers in the same iteration as when the FD got ready.
Good catch, I've struggled with this. I ended up not needing to call run_once(), so I've left it out of the PEP. I agree if there's a strong enough use case for it (what's yours?) it should probably be redesigned. Another thing I don't like about it is that a callback that calls call_soon() with itself will starve I/O completely. OTOH that's perhaps no worse than a callback containing an infinite loop; and there's something to say for the semantics that if a callback just schedules another callback as an immediate 'continuation', it's reasonable to run that before even attempting to poll for I/O.
Hm, okay, it seems reasonable to support that. (My original intent with run_once() was to allow mixing multiple event loops -- you'd just call each event loop's run_once() equivalent in a round-robin fashion.) How about the following semantics for run_once():

1. Compute the deadline as the smallest of:
   - the time until the first event in the timer heap, if non-empty
   - 0, if the ready queue is non-empty
   - infinity (*)
2. Poll for I/O with the computed deadline, adding anything that is ready to the ready queue.
3. Run items from the ready queue until it is empty.

(*) Most event loops I've seen use e.g. 30 seconds or 1 hour as infinity, with the idea that if somehow a race condition added something to the ready queue just as we went to sleep, and there's no I/O at all, the system will recover eventually. But I've also heard people worried about power conservation on mobile devices (or laptops) complain about servers that wake up regularly even when there is no work to do. Thoughts? I think I'll leave this out of the PEP, but what should Tulip do?
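The three steps above can be sketched as a function. Everything here is illustrative, not tulip's API: the loop object is assumed to have a `ready` deque of callbacks, a `timer_heap` of entries ordered by `.when`, and a pollster whose poll(timeout) returns the callbacks of FDs that became ready.

```python
import heapq

# Sketch of the proposed run_once() semantics (illustrative names only).

def run_once(loop, now):
    # 1. Compute the poll deadline.
    if loop.ready:
        timeout = 0                    # callbacks pending: don't sleep
    elif loop.timer_heap:
        timeout = max(0.0, loop.timer_heap[0].when - now)
    else:
        timeout = None                 # "infinity": block until I/O arrives

    # 2. Poll for I/O with that deadline; ready FDs join the ready queue.
    for callback in loop.pollster.poll(timeout):
        loop.ready.append(callback)

    # 3. Move expired timers over, then drain the queue, so reader/writer
    #    callbacks run in the same iteration in which the FD became ready.
    while loop.timer_heap and loop.timer_heap[0].when <= now:
        loop.ready.append(heapq.heappop(loop.timer_heap).callback)
    while loop.ready:
        loop.ready.popleft()()
```

Draining the ready queue in step 3 is what gives Geert's `while not ready: loop.run_once()` idiom the desired behavior.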
Hm. The PEP currently states that you can call cancel() on the DelayedCall returned by e.g. add_reader() and it will act as if you called remove_reader(). (Though I haven't implemented this yet -- either there would have to be a cancel callback on the DelayedCall or the effect would be delayed.) But multiple callbacks per FD seems a different issue -- currently add_reader() just replaces the previous callback if one is already set. Since not every event loop can support this, I'm not sure it ought to be in the PEP, and making it optional sounds like a recipe for trouble (a library that depends on this may break subtly or only under pressure). Also, what's the use case? If you really need this you are free to implement a mechanism on top of the standard in user code that dispatches to multiple callbacks -- that sounds like a small amount of work if you really need it, but it sounds like an attractive nuisance to put this in the spec.
Support for multiple callbacks per FD could be advertised as a capability.
I'm not keen on having optional functionality as I explained above. (In fact, I probably will change the PEP to make those APIs that are currently marked as optional required -- it will just depend on the platform which paradigm performs better, but using the transport/protocol abstraction will automatically select the best paradigm).
Really? Doesn't this functionality imply that something (besides user code) is holding on to the DelayedCall after it is cancelled? It seems iffy to have to bend over backwards to support this alternate way of doing something that we can already do, just because (on some platform?) it might shave a microsecond off callback registration.
Agreed; that's why I left it out of the PEP. The v2 implementation will use time.monotonic(),
With some input, I'd be happy to produce patches.
I hope I've given you enough input; it's probably better to discuss the specs first before starting to code. But please do review the tulip v2 code in the tulip subdirectory; if you want to help, I'll be happy to give you commit privileges to that repo, or I'll take patches if you send them. -- --Guido van Rossum (python.org/~guido)

On Mon, 17 Dec 2012 09:47:22 -0800 Guido van Rossum <guido@python.org> wrote:
Does it need to be abbreviated? I don't think users have to spell "DelayedCall" at all (they just call call_later()). That said, some proposals: - Timer (might be mixed up with threading.Timer) - Deadline - RDV (French abbrev. for rendez-vous) Regards Antoine.

On Mon, Dec 17, 2012 at 11:57 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
They save the result in a variable. Naming that variable delayed_call feels awkward. In my code I've called it 'dcall' but that's not great either.
That said, some proposals: - Timer (might be mixed up with threading.Timer)
But often there's no time involved...
- Deadline
Same...
- RDV (French abbrev. for rendez-vous)
Hmmmm. :-) Maybe Callback is okay after all? The local variable can be 'cb'. -- --Guido van Rossum (python.org/~guido)

On Mon, 17 Dec 2012 12:49:46 -0800 Guido van Rossum <guido@python.org> wrote:
Ah, I see you use the same class for add_reader() and friends. I was assuming that, like in Twisted, DelayedCall was only returned by call_later(). Is it useful to return a DelayedCall in add_reader()? Is it so that you can remove the reader? But you already define remove_reader() for that, so I'm not sure what an alternative way to do it brings :-) Regards Antoine.

On Mon, Dec 17, 2012 at 12:56 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
I'm not sure myself. I added it to the PEP (with a question mark) because I use DelayedCalls to represent I/O callbacks internally -- it's handy to have an object that represents a function plus its arguments, and I also have a shortcut for adding such objects to the ready queue (the ready queue *also* stores DelayedCalls). It is probably a mistake offering two ways to cancel an I/O callback; but I'm not sure whether to drop remove_{reader,writer} or whether to drop cancelling the callback. (The latter would mean that add_{reader,writer} should not return anything.) I *think* I'll keep remove_* and drop callback cancellation, because the entity that most likely wants to revoke the callback already has the file descriptor in hand (it comes with the socket, which they need anyway so they can call its recv/send method), but they would have to hold on to the callback object separately. OTOH callback objects might make it possible to have multiple callbacks per FD, which I currently don't support. (See discussion earlier in this thread.) -- --Guido van Rossum (python.org/~guido)

Le 17/12/2012 17:47, Guido van Rossum a écrit :
It seems to me that a DelayedCall is nothing but a frozen, reified function call. That it's a reified thing is already obvious from the fact that it's an object, so how about naming it just "Call"? "Delayed" is actually only one of the possible relations between the object and the actual call - it could also represent a cancelled call, or a cached one, or ... This idea has some implications for the design: in particular, it means that .cancel() should be a method of the EventLoop, not of Call. So Call would only have the attributes 'callback' (I'd prefer 'func' or similar) and 'args', and one method to execute the call. HTH, Ronan Lamy

On Mon, Dec 17, 2012 at 12:33 PM, Ronan Lamy <ronan.lamy@gmail.com> wrote:
Call is not a bad suggestion for the name. Let me mull that over.
Not sure. Cancelling it must set a flag on the object, since the object could be buried deep inside any number of data structures owned by the event loop: e.g. the ready queue, the pollster's readers or writers (dicts mapping FD to DelayedCall), or the timer heap. When you cancel a call you don't immediately remove it from its data structure -- instead, when you get to it naturally (e.g. its time comes up) you notice that it's been cancelled and ignore it. The one place where this is awkward is when it's a FD reader or writer -- it won't come up if the FD doesn't get any new I/O, and it's even possible that the FD is closed. (I don't actually know what epoll(), kqueue() etc. do when one of the FDs is closed, but none of the behaviors I can think of are particularly convenient...) I had thought of giving the DelayedCall a 'cancel callback' that is used if/when it is cancelled, and for readers/writers it could be something that calls remove_reader/writer with the right FD. (Maybe I need multiple cancel-callbacks, in case the same object is used as a callback for multiple queues.) Hm, this gets messy. (Another thing in this area: pyftpdlib's event loop keeps track of how many calls are cancelled, and if a large number are cancelled it reconstructs the heap. The use case is apparently registering lots of callbacks far in the future and then cancelling them all. Not sure how good a use case that is. But I admit that it would be easier if cancelling was a method on the event loop.) PS. Cancelling a future is a different thing. There you still want the callback to be called, you just want it to notice that the operation was cancelled. Same for tasks. -- --Guido van Rossum (python.org/~guido)
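The 'cancel callback' idea floated above can be sketched as follows. All names here are hypothetical, not tulip's: the handler gets a hook that runs on cancellation, so cancelling an I/O handler can remove the FD from the pollster right away instead of lingering on an FD that may never see I/O again.

```python
# Sketch: a handler object with a 'cancel callback' hook (names are
# hypothetical). An add_reader() implementation could return
# Handler(callback, args, cancel_callback=lambda: loop.remove_reader(fd)).

class Handler:
    def __init__(self, callback, args, cancel_callback=None):
        self.callback = callback
        self.args = args
        self.cancelled = False
        self._cancel_callback = cancel_callback

    def cancel(self):
        if self.cancelled:
            return
        self.cancelled = True
        if self._cancel_callback is not None:
            self._cancel_callback()    # e.g. remove the FD from the pollster
            self._cancel_callback = None
```

For timer-heap entries the hook would simply be None, keeping the cheap "notice it's cancelled when its time comes up" behavior.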

On 12/17/12 12:33 PM, Ronan Lamy wrote: How about "Thunk"? Not a more obvious name, but a fun one. http://en.wikipedia.org/wiki/Thunk_(functional_programming) -Sam

On Mon, Dec 17, 2012 at 2:11 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
That's what call_soon_threadsafe() is for. But bugs happen (in either user code or library code). And yes, call_soon_threadsafe() will use a self-pipe on UNIX. (I hope someone else will write the Windows main loop.) -- --Guido van Rossum (python.org/~guido)

On Tue, Dec 18, 2012 at 1:00 AM, Guido van Rossum <guido@python.org> wrote:
I needed a self-pipe on Windows before. See below. With this, the select() based loop might work unmodified on Windows. https://gist.github.com/4325783 Of course it wouldn't be as efficient as an IOCP based loop. Regards, Geert
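The gist linked above isn't reproduced here, but the standard trick it relies on can be sketched: emulate socket.socketpair() on Windows (which lacks it) by connecting two TCP sockets over the loopback interface. This mirrors the idea, not the gist's exact code, and skips the error handling a real version would need.

```python
import socket

# Sketch: socketpair() emulation via a loopback TCP connection.

def socketpair():
    # Listen on an ephemeral loopback port.
    lsock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    lsock.bind(('127.0.0.1', 0))
    lsock.listen(1)
    # A blocking loopback connect completes immediately (the connection
    # sits in the listen backlog until accept()).
    csock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    csock.connect(lsock.getsockname())
    ssock, _ = lsock.accept()
    lsock.close()
    return ssock, csock
```

Either end can then serve as the write side of a "self-pipe" that wakes up a select()-based loop.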

On Mon, Dec 17, 2012 at 11:26 PM, Geert Jansen <geertj@gmail.com> wrote:
Thanks! Before I paste this into Tulip, is there any kind of copyright on this?
Of course it wouldn't be as efficient as an IOCP based loop.
The socket loop is definitely handy on Windows in a pinch. I have plans for an IOCP-based loop based on Richard Oudkerk's 'proactor' branch of Tulip v1, but I don't have a Windows machine to test it on ATM (hopefully that'll change once I am actually at Dropbox). -- --Guido van Rossum (python.org/~guido)

On 18/12/2012 4:59pm, Guido van Rossum wrote:
polling.py in the proactor branch already had an implementation of socketpair() for Windows;-) Also note that on Windows a connecting socket needs to be added to wfds *and* xfds when you do ... = select(rfds, wfds, xfds, timeout) If the connection fails then the handle is reported as being exceptional but *not* writable. It might make sense to have add_connector()/remove_connector() which on Unix is just an alias for add_writer()/remove_writer(). This would be useful if tulip ever has a loop based on WSAPoll() for Windows (Vista and later), since WSAPoll() has an awkward bug concerning asynchronous connects. -- Richard
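The select() quirk Richard describes can be sketched as a helper. The name wait_for_connect() is hypothetical; the point is that the connecting socket goes into both wfds and xfds, because on Windows a *failed* connect is reported as exceptional rather than writable, while on Unix the extra xfds entry is harmless.

```python
import select
import socket

# Sketch: waiting for a pending non-blocking connect with select(),
# portable across the Unix/Windows reporting difference described above.

def wait_for_connect(sock, timeout=None):
    """Return True once the pending connect on `sock` has succeeded."""
    _, wfds, xfds = select.select([], [sock], [sock], timeout)
    if sock in xfds:
        return False                   # Windows: connect failed
    if sock in wfds:
        # Unix reports a failed connect as writable too; SO_ERROR
        # disambiguates success from failure.
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR) == 0
    return False                       # timed out
```

An add_connector() API could register the FD this way internally while remaining a plain alias for add_writer() on Unix.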

On Tue, Dec 18, 2012 at 11:41 AM, Richard Oudkerk <shibturn@gmail.com> wrote:
polling.py in the proactor branch already had an implementation of socketpair() for Windows;-)
D'oh! And it always uses sockets for the "self-pipe". That makes sense.
But SelectProactor in proactor.py doesn't seem to do this.
Can't we do this for all writers? (If we have to make a distinction, so be it, but it seems easy to have latent bugs if some platforms require you to make a different call but others don't care either way.) -- --Guido van Rossum (python.org/~guido)

On Mon, Dec 17, 2012 at 6:47 PM, Guido van Rossum <guido@python.org> wrote:
Correct - my focus right now is on the event loop only. I intend to have a deeper look at the coroutine scheduler as well later (right now I'm using greenlets for that).
That would work (in 2 secs, then 4, 6, ...). This is the Qt QTimer model. Both libev and libuv have a slightly more general timer that take a timeout and a repeat value. When the timeout reaches zero, the timer will fire, and if repeat != 0, it will re-seed the timeout to that value. I haven't seen any real need for such a timer where interval != repeat, and in any case it can pretty cheaply be emulated by adding a new timer on the first expiration only. So your call_repeatedly() call above should be fine.
libev uses the generic term "Watcher", libuv uses "Handle". But their APIs are structured a bit differently from tulip, so I'm not sure if those names would make sense. They support many different types of events (including more esoteric events like process watches, on-fork handlers, and wall-clock timer events). Each event type has its own class, named after the event type, that inherits from "Watcher" or "Handle". When an event is created, you pass it a reference to its loop. You manage the event fully through the event instance (e.g. starting it, setting its callback and other parameters, stopping it). The loop has only a few methods, notably "run" and "run_once". So for example, you'd say:

    loop = Loop()
    timer = Timer(loop)
    timer.start(2.0, callback)
    loop.run()

The advantages of this approach are that naming is easier, and that you have a natural place to put methods that update the event after you've created it. For example, you might want to temporarily suspend a timer or change its interval. I quite liked the fresh approach taken by tulip, so that's why I tried to stay within its design. However, the disadvantage is that modifying events after you've created them is difficult (unless you create one DelayedCall subtype per event type, in which case you're probably better off creating those events through their constructor in the first place).
That is a good question. Both libuv and libev have both options. The one that is called before we go to sleep is called a "Prepare" handler, the one after we come back from sleep a "Check" handler. The libev documentation has some words on check and prepare handlers here: http://pod.tst.eu/http://cvs.schmorp.de/libev/ev.pod#code_ev_prepare_code_an... I am not sure both are needed, but I can't oversee all the consequences.
I think doing this would work, but I again can't fully oversee all the consequences. Let me play with this a little.
I had a look at libuv and libev. They take two different approaches:

* libev uses a ~60 second timeout by default. The reason is subtle. Libev supports a wall-clock time event that fires when a certain wall-clock time has passed. Having a non-infinite timeout allows it to pick up changes to the system time (e.g. by NTP), which would change when the wall-clock timer needs to run.
* libuv does not have a wall-clock timer and uses an infinite timeout.

In my view it would be best for tulip to use an infinite timeout unless at some point a wall-clock timer is added. That will help with power management. Regarding race conditions, I think they should be solved in other ways (e.g. by having a special method that can post callbacks to the loop in a thread-safe way and possibly write to a self-pipe).
Right now I think that cancelling a DelayedCall is not safe. It could busy-loop if the fd is ready.
A not-so-good use case is libraries like libdbus that don't document their assumptions regarding this. For example, I have to provide an "add watch" function that creates a new watch (a watch is just a generic term for an FD event that can be read, write or read|write). I have observed that it only ever sets one read and one write watch per FD. If we go for one reader/writer per FD, then it's probably fine, but it would be nice if code that does install multiple readers/writers per FD would get an exception rather than silently updating the callback. The requirement could be that you need to remove the event before you can add a new event for the same FD.
Not that I can see. At least not for libuv and libev.
According to the libdbus documentation, there is a separate function to toggle an event on/off because that can be implemented without allocating memory. But actually there's one kind-of idiomatic use for this that I've seen quite a few times in libraries. Assume you have a library that defines a connection. Often, you create two events for that connection in the constructor: a "write_event" and a "read_event". The read_event is normally enabled, but gets temporarily disabled when you need to throttle input. The write_event is normally disabled except when you get a short write on output. Just enabling/disabling these events is a bit more friendly to the programmer IMHO than having to cancel and recreate them when needed.
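The enable/disable idiom just described can be sketched with a hypothetical toggleable event object, the kind libdbus's toggled-watch API expects. ToggleEvent and Connection are illustrative names, not any real API.

```python
# Sketch: the read_event/write_event toggling idiom described above.

class ToggleEvent:
    """Stand-in for a loop-owned FD event that can be toggled cheaply,
    without allocating a new event object each time."""
    def __init__(self, enabled=False):
        self.enabled = enabled

    def enable(self):
        self.enabled = True

    def disable(self):
        self.enabled = False

class Connection:
    def __init__(self):
        self.read_event = ToggleEvent(enabled=True)    # normally reading
        self.write_event = ToggleEvent(enabled=False)  # off until needed

    def throttle_input(self):
        self.read_event.disable()      # stop polling for readability

    def resume_input(self):
        self.read_event.enable()

    def on_short_write(self):
        self.write_event.enable()      # poll for writability to flush

    def on_buffer_drained(self):
        self.write_event.disable()
```

With cancel-and-recreate as the only option, each throttle/resume cycle would instead construct and register a fresh event.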
OK great. Let me work on this over the next couple of days and hopefully come up with something. Regards, Geert

On Mon, Dec 17, 2012 at 2:57 PM, Geert Jansen <geertj@gmail.com> wrote:
I'm trying to stick to a somewhat minimalistic design here; repeated timers sound fine; extra complexities seem redundant. (What's next -- built-in support for exponential back-off? :-)
I see. That's a fundamentally different API style, and one I'm less familiar with. DelayedCall isn't meant to be that at all -- it's just meant to be this object that (a) is sortable by time (needed for heapq) and (b) can be cancelled (useful functionality in general). I expect that at least one of the reasons for libuv etc. to do it their way is probably that the languages are different -- Python has keyword arguments to pass options, while C/C++ must use something else. Anyway, Handler sounds like a pretty good name. Let me think it over.
Ah, that's where the desire to cancel and restart a callback comes from.
I wonder how often one needs to modify an event after it's been in use for a while. The mutation API seems mostly useful to separate construction from setting various parameters (to avoid insane overloading of the constructor).
I'm still not convinced that both are needed. However they are easy to add, so if the need really does arise in practical use I am fine with evolving the API that way. Until then, let's stick to KISS.
It's hard to oversee all consequences. But it looks good to me too, so I'll implement it this way. Maybe the Twisted folks have wisdom in this area (though quite often, when pressed, they admit that their APIs are not ideal, and have warts due to backward compatibility :-).
I've not actually ever seen a use case for the wall-clock timer, so I've taken it out.
Right, a self-pipe is already there. I'll stick with infinity in Tulip, but an implementation can of course do what it wants to.
That's because I'm not done implementing it. :-) But the more I think about it the more I don't like calling cancel() on a read/write handler.
That makes sense. If we wanted to be fancy we could have several different APIs: add (must not be set), set (may be set), replace (must be set). But I think just offering the add and remove APIs is nicely minimalistic and lets you do everything else with ease. (I'll make the remove API return True if it did remove something, False otherwise.)
Never mind, this is just due to the difference in API style. I'm going to ignore it unless I get a lot more pushback.
Yeah, not gonna happen in Python. :-)
The methods on the Transport class take care of this at a higher level: pause() and resume() to suspend reading, and the write() method takes care of buffering and so on.
Excellent. Please do check back regularly for additions to the tulip subdirectory! -- --Guido van Rossum (python.org/~guido)

On Tue, Dec 18, 2012 at 10:40 AM, Guido van Rossum <guido@python.org> wrote:
Is DelayedCall a subclass of Future, like Task? If so, FutureCall might work.
If someone really does want a wall-clock timer with a given granularity, it can be handled by adding a repeating timer with that granularity (with the obvious consequences for low power modes).
Perhaps the best bet would be to have the standard API allow multiple callbacks, and emulate that on systems which don't natively support multiple callbacks for a single event? Otherwise, I don't see how an event loop could efficiently expose access to the multiple callback APIs without requiring awkward fallbacks in the code interacting with the event loop. Given that the natural fallback implementation is reasonably clear (i.e. a single callback that calls all of the other callbacks), why force reimplementing that on users rather than event loop authors? Related, the protocol/transport API design may end up needing to consider the gather/scatter problem (i.e. fanning out data from a single transport to multiple consumers, as well as feeding data from multiple producers into a single underlying transport). Actual *implementations* of such tools shouldn't be needed in the standard suite, but at least understanding how you would go about writing multiplexers and demultiplexers can be a good test of a stacked I/O design.
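The fallback Nick suggests (a single callback that calls all of the others) can be sketched. FakeLoop and ReaderMux are illustrative names, not part of any spec: the mux installs one dispatcher per FD on a loop that supports only a single callback per FD.

```python
# Sketch: emulating multiple callbacks per FD on top of a loop that
# supports only one, via a single fan-out dispatcher.

class FakeLoop:
    """Toy single-callback-per-FD loop, standing in for add_reader()."""
    def __init__(self):
        self.readers = {}

    def add_reader(self, fd, callback, *args):
        self.readers[fd] = (callback, args)

    def remove_reader(self, fd):
        self.readers.pop(fd, None)

class ReaderMux:
    def __init__(self, loop):
        self.loop = loop
        self._callbacks = {}           # fd -> list of callbacks

    def add_reader(self, fd, callback):
        cbs = self._callbacks.setdefault(fd, [])
        if not cbs:
            # First callback for this FD: install the one dispatcher.
            self.loop.add_reader(fd, self._dispatch, fd)
        cbs.append(callback)

    def remove_reader(self, fd, callback):
        cbs = self._callbacks[fd]
        cbs.remove(callback)
        if not cbs:
            del self._callbacks[fd]
            self.loop.remove_reader(fd)

    def _dispatch(self, fd):
        # Copy the list so callbacks may add/remove during dispatch.
        for cb in list(self._callbacks.get(fd, [])):
            cb()
```

This is the "natural fallback" in a few dozen lines, which is the argument for pushing it onto event loop authors rather than every user.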
And the main advantage of handling that at a higher level is that suitable buffering designs are going to be transport specific. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Mon, Dec 17, 2012 at 7:20 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On Tue, Dec 18, 2012 at 10:40 AM, Guido van Rossum <guido@python.org> wrote:
[A better name for DelayedCall]
Anyway, Handler sounds like a pretty good name. Let me think it over.
Is DelayedCall a subclass of Future, like Task? If so, FutureCall might work.
No, they're completely unrelated. (I'm even thinking of renaming its cancel() to avoid the confusion.) I still like Handler best. In fact, if I'd thought of Handler before, I wouldn't have asked for a better name. :-) Going once, going twice... [Wall-clock timers]
+1. [Multiple calls per FD]
Hm. AFAIK Twisted doesn't support this either. Antoine, do you know? I didn't see it in the Tornado event loop either.
But what's the use case? I don't think our goal should be to offer APIs for any feature that any event loop might offer. It's not quite a least-common denominator either though -- it's about offering commonly needed functionality, and interoperability. Also, event loop implementations are allowed to offer additional APIs on their implementation. If the need for multiple handlers per FD only exists on those platforms where the platform's event loop supports it, no harm is done if the functionality is only available through a platform-specific API. But still, I don't understand the use case. Possibly it is using file descriptors as a more general signaling mechanism? That sounds pretty platform specific anyway (on Windows, FDs must represent sockets). If someone shows me a real-world use case I may change my mind.
Twisted supports this for writing through its writeSequence(), which appears in Tulip and PEP 3156 as writelines(). (Though IIRC Glyph told me that Twisted rarely uses the platform's scatter/gather primitives, because they are so damn hard to use, and the kernel implementation often just joins the buffers together before passing it to the regular send()...) But regardless, I don't think scatter/gather would use multiple callbacks per FD. I think it would be really hard to benefit from reading into multiple buffers in Python.
And the main advantage of handling that at a higher level is that suitable buffering designs are going to be transport specific.
+1 -- --Guido van Rossum (python.org/~guido)

On Tue, Dec 18, 2012 at 2:01 PM, Guido van Rossum <guido@python.org> wrote:
Sure, but since we know this capability is offered by multiple event loops, it would be good if there was a defined way to go about exposing it.
The most likely use case that comes to mind is monitoring and debugging (i.e. the event loop equivalent of a sys.settrace). Being able to tap into a datastream (e.g. to dump it to a console or pipe it to a monitoring process) can be really powerful, and being able to do it at the Python level means you have this kind of capability even without root access to the machine to run Wireshark. There are other more obscure signal analysis use cases that occur to me, but those could readily be handled with a custom transport implementation that duplicated that data stream, so I don't think there's any reason to worry about those.
Sorry, I wasn't quite clear on what I meant by gather/scatter, and it's more a protocol thing than an event loop thing. Specifically, gather/scatter interfaces are most useful for multiplexed transports. The ones I'm particularly familiar with are traditional telephony transports like E1 links, with 15 time-division-multiplexed channels on the wire (and a signalling timeslot), as well as a few different HF comms protocols. When reading from one of those, you have a demultiplexing component which is reading the serial data coming in on the wire and making it look like 15 distinct data channels from the application's point of view. Similarly, the output multiplexer takes 15 streams of data from the application and interleaves them into the single stream on the wire. The rise of packet switching means that sharing connections like that is increasingly less common, though, so gather/scatter devices are correspondingly less useful in a networking context. The only modern use cases I can think of that someone might want to handle with Python are things like sharing a single USB or classic serial connection amongst multiple data streams. However, I suspect the standard transport and protocol API definitions already proposed should also suffice for the gather/scatter use case, as such a component would largely work like any other protocol-as-transport adapter, with the difference being that there would be a many-to-one relationship between the number of interfaces on the application side and those on the communications side. (Technically, gather/scatter components can also be used the other way around to distribute a single data stream across multiple transports, but that use case is even less likely to come up when programming in Python. Multi-channel HF data comms is the only possibility that really comes to mind.) -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Mon, Dec 17, 2012 at 11:21 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Only if there's a use case.
I can't see how that would work. Once one callback reads the data the other callback won't see it. There's also the issue of ordering. Solving this seems easier by implementing a facade for the event loop that wraps certain callbacks, and installing it using a custom event loop policy. So, I still don't see the use case.
Right, that seems a better way to go about it.
I'm glad you talked yourself out of that objection. :-) -- --Guido van Rossum (python.org/~guido)

On Mon, 17 Dec 2012 20:01:18 -0800 Guido van Rossum <guido@python.org> wrote:
I think neither Twisted nor Tornado support it. add_reader() / add_writer() APIs are not for the end user, they are a building block for the framework to write higher-level abstractions. (although, Tornado being quite low-level, you can end up having to use add_reader() / add_writer() anyway - e.g. for UDP) It also doesn't seem to me to make a lot of sense to allow multiplexing at the event loop level. It is probably a protocol- or transport- level feature (depending on the protocol and transport, obviously :-)). Nick mentions debugging / monitoring, but I don't understand how you do that with a write callback (or a read callback, actually, since reading from a socket will consume the data and make it unavailable for other readers). You really need to do it at a protocol/transport's write()/data_received() level. Regards Antoine.

On Tue, Dec 18, 2012 at 5:29 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
Yeah, monitoring probably falls into the same gather/scatter design model as demultiplexing (receive side) and multi-channel transports (transmit side). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Mon, Dec 17, 2012 at 3:08 AM, Geert Jansen <geertj@gmail.com> wrote:
below is some feedback on the EventLoop API as implemented in tulip.
Great feedback! I hope you will focus on PEP 3156 (http://www.python.org/dev/peps/pep-3156/) and Tulip v2 next; Tulip v2 isn't written but is quickly taking shape in the 'tulip' subdirectory of the Tulip project.
Nice. The more interop this event loop offers the better. I don't know much about dbus, though, so occasionally my responses may not make any sense -- please be gentle and educate me when my ignorance gets in the way of understanding.
Cool. For me, right now, Python 2 compatibility is a distraction, but I am not against others adding it. I'll be happy to consider small tweaks to the PEP to make this easier. Exception: I'm not about to give up on 'yield from'; but that doesn't seem your focus anyway.
I'm actually glad to see there are so many event loop implementations around. This suggests to me that there's a real demand for this type of functionality, and I'd be real happy if PEP 3156 and Tulip came to improve the interop situation (especially for Python 3.3 and beyond).
I've not used repeatable timers myself but I see them in several other interfaces. I do think they deserve a different method call to set them up, even if the implementation will just be to add a repeat field to the DelayedCall. When I start a timer with a 2 second repeat, does it run now and then 2, 4, 6, ... seconds after, or should the first run be in 2 seconds? Or are these separate parameters? Strawman proposal: it runs in 2 seconds and then every 2 seconds. The API would be event_loop.call_repeatedly(interval, callback, *args), returning a DelayedCall with an interval attribute set to the interval value. (BTW, can someone *please* come up with a better name for DelayedCall? It's tedious and doesn't abbreviate well. But I don't want to name the class 'Callback' since I already use 'callback' for function objects that are used as callbacks.)
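For concreteness, the strawman call_repeatedly() (first run after `interval`, then every `interval` seconds) could be emulated on top of call_later() by re-arming on each run. The TinyLoop below, with its virtual clock, is invented purely for this sketch and is not tulip code:

```python
import heapq

class TinyLoop:
    """Toy scheduler with a virtual clock -- made up for this example,
    not tulip code -- just enough to show the emulation."""
    def __init__(self):
        self.when = 0.0     # virtual "now", advanced as timers fire
        self.timers = []    # heap of (deadline, seq, callback)
        self.seq = 0
    def call_later(self, delay, callback):
        self.seq += 1       # tie-breaker so callbacks never get compared
        heapq.heappush(self.timers, (self.when + delay, self.seq, callback))
    def call_repeatedly(self, interval, callback):
        """First run after `interval`, then every `interval` seconds,
        per the strawman above, emulated by re-arming on each run."""
        cancelled = []
        def tick():
            if not cancelled:
                callback()
                self.call_later(interval, tick)   # re-arm: 2, 4, 6, ...
        self.call_later(interval, tick)
        return lambda: cancelled.append(True)     # the "cancel" handle
    def run(self, until):
        while self.timers and self.timers[0][0] <= until:
            self.when, _, cb = heapq.heappop(self.timers)
            cb()

fired = []
loop = TinyLoop()
loop.call_repeatedly(2.0, lambda: fired.append(loop.when))
loop.run(until=7.0)
print(fired)   # -> [2.0, 4.0, 6.0]
```

As noted above, an event loop with native repeating timers would just map call_repeatedly() directly onto them instead of re-arming.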
Again, I'd rather introduce a new method. What should the semantics be? Is this called just before or after we potentially go to sleep, or at some other point, or at the very top or bottom of run_once()?
* A useful semantic for run_once() would be to run the callbacks for readers and writers in the same iteration as when the FD got ready.
Good catch, I've struggled with this. I ended up not needing to call run_once(), so I've left it out of the PEP. I agree if there's a strong enough use case for it (what's yours?) it should probably be redesigned. Another thing I don't like about it is that a callback that calls call_soon() with itself will starve I/O completely. OTOH that's perhaps no worse than a callback containing an infinite loop; and there's something to say for the semantics that if a callback just schedules another callback as an immediate 'continuation', it's reasonable to run that before even attempting to poll for I/O.
Hm, okay, it seems reasonable to support that. (My original intent with run_once() was to allow mixing multiple event loops -- you'd just call each event loop's run_once() equivalent in a round-robin fashion.) How about the following semantics for run_once():
1. compute the deadline as the smallest of:
   - the time until the first event in the timer heap, if non-empty
   - 0 if the ready queue is non-empty
   - Infinity(*)
2. poll for I/O with the computed deadline, adding anything that is ready to the ready queue
3. run items from the ready queue until it is empty
(*) Most event loops I've seen use e.g. 30 seconds or 1 hour as infinity, with the idea that if somehow a race condition added something to the ready queue just as we went to sleep, and there's no I/O at all, the system will recover eventually. But I've also heard people worried about power conservation on mobile devices (or laptops) complain about servers that wake up regularly even when there is no work to do. Thoughts? I think I'll leave this out of the PEP, but what should Tulip do?
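Those three steps can be sketched as follows. This is a toy illustration, not tulip's actual implementation; `ready`, `timers` and `readers` are hypothetical structures standing in for the loop's internals:

```python
import heapq
import select
import time

def run_once(ready, timers, readers):
    """One loop iteration with the proposed semantics (a sketch, not
    tulip code): `ready` is a list of zero-argument callables, `timers`
    a heap of (when, callback) pairs, `readers` a {fd: callback} map."""
    # 1. Compute the poll deadline.
    if ready:
        deadline = 0.0                 # work pending: don't sleep at all
    elif timers:
        deadline = max(0.0, timers[0][0] - time.monotonic())
    else:
        deadline = None                # "infinity": block until I/O
    # 2. Poll for I/O; FDs that became ready are dispatched below,
    #    i.e. in the *same* iteration, as requested.
    if readers:
        rlist, _, _ = select.select(list(readers), [], [], deadline)
        for fd in rlist:
            ready.append(readers[fd])
    elif deadline:
        time.sleep(deadline)           # no FDs: just wait for the timer
    # Expired timers also go onto the ready queue.
    while timers and timers[0][0] <= time.monotonic():
        ready.append(heapq.heappop(timers)[1])
    # 3. Run the ready queue until it is empty.
    while ready:
        ready.pop(0)()
```

Note that a callback which keeps appending a continuation of itself to `ready` does starve I/O under these semantics, exactly the trade-off discussed above.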
Hm. The PEP currently states that you can call cancel() on the DelayedCall returned by e.g. add_reader() and it will act as if you called remove_reader(). (Though I haven't implemented this yet -- either there would have to be a cancel callback on the DelayedCall or the effect would be delayed.) But multiple callbacks per FD seems a different issue -- currently add_reader() just replaces the previous callback if one is already set. Since not every event loop can support this, I'm not sure it ought to be in the PEP, and making it optional sounds like a recipe for trouble (a library that depends on this may break subtly or only under pressure). Also, what's the use case? If you really need this you are free to implement a mechanism on top of the standard in user code that dispatches to multiple callbacks -- that sounds like a small amount of work if you really need it, but it sounds like an attractive nuisance to put this in the spec.
Support for multiple callbacks per FD could be advertised as a capability.
I'm not keen on having optional functionality as I explained above. (In fact, I probably will change the PEP to make those APIs that are currently marked as optional required -- it will just depend on the platform which paradigm performs better, but using the transport/protocol abstraction will automatically select the best paradigm).
Really? Doesn't this functionality imply that something (besides user code) is holding on to the DelayedCall after it is cancelled? It seems iffy to have to bend over backwards to support this alternate way of doing something that we can already do, just because (on some platform?) it might shave a microsecond off callback registration.
Agreed; that's why I left it out of the PEP. The v2 implementation will use time.monotonic().
With some input, I'd be happy to produce patches.
I hope I've given you enough input; it's probably better to discuss the specs first before starting to code. But please do review the tulip v2 code in the tulip subdirectory; if you want to help you I'll be happy to give you commit privileges to that repo, or I'll take patches if you send them. -- --Guido van Rossum (python.org/~guido)

On Mon, 17 Dec 2012 09:47:22 -0800 Guido van Rossum <guido@python.org> wrote:
Does it need to be abbreviated? I don't think users have to spell "DelayedCall" at all (they just call call_later()). That said, some proposals: - Timer (might be mixed up with threading.Timer) - Deadline - RDV (French abbrev. for rendez-vous) Regards Antoine.

On Mon, Dec 17, 2012 at 11:57 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
They save the result in a variable. Naming that variable delayed_call feels awkward. In my code I've called it 'dcall' but that's not great either.
That said, some proposals: - Timer (might be mixed up with threading.Timer)
But often there's no time involved...
- Deadline
Same...
- RDV (French abbrev. for rendez-vous)
Hmmmm. :-) Maybe Callback is okay after all? The local variable can be 'cb'. -- --Guido van Rossum (python.org/~guido)

On Mon, 17 Dec 2012 12:49:46 -0800 Guido van Rossum <guido@python.org> wrote:
Ah, I see you use the same class for add_reader() and friends. I was assuming that, like in Twisted, DelayedCall was only returned by call_later(). Is it useful to return a DelayedCall in add_reader()? Is it so that you can remove the reader? But you already define remove_reader() for that, so I'm not sure what an alternative way to do it brings :-) Regards Antoine.

On Mon, Dec 17, 2012 at 12:56 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
I'm not sure myself. I added it to the PEP (with a question mark) because I use DelayedCalls to represent I/O callbacks internally -- it's handy to have an object that represents a function plus its arguments, and I also have a shortcut for adding such objects to the ready queue (the ready queue *also* stores DelayedCalls). It is probably a mistake offering two ways to cancel an I/O callback; but I'm not sure whether to drop remove_{reader,writer} or whether to drop cancelling the callback. (The latter would mean that add_{reader,writer} should not return anything.) I *think* I'll keep remove_* and drop callback cancellation, because the entity that most likely wants to revoke the callback already has the file descriptor in hand (it comes with the socket, which they need anyway so they can call its recv/send method), but they would have to hold on to the callback object separately. OTOH callback objects might make it possible to have multiple callbacks per FD, which I currently don't support. (See discussion earlier in this thread.) -- --Guido van Rossum (python.org/~guido)

Le 17/12/2012 17:47, Guido van Rossum a écrit :
It seems to me that a DelayedCall is nothing but a frozen, reified function call. That it's a reified thing is already obvious from the fact that it's an object, so how about naming it just "Call"? "Delayed" is actually only one of the possible relations between the object and the actual call - it could also represent a cancelled call, or a cached one, or ... This idea has some implications for the design: in particular, it means that .cancel() should be a method of the EventLoop, not of Call. So Call would only have the attributes 'callback' (I'd prefer 'func' or similar) and 'args', and one method to execute the call. HTH, Ronan Lamy

On Mon, Dec 17, 2012 at 12:33 PM, Ronan Lamy <ronan.lamy@gmail.com> wrote:
Call is not a bad suggestion for the name. Let me mull that over.
Not sure. Cancelling it must set a flag on the object, since the object could be buried deep inside any number of data structures owned by the event loop: e.g. the ready queue, the pollster's readers or writers (dicts mapping FD to DelayedCall), or the timer heap. When you cancel a call you don't immediately remove it from its data structure -- instead, when you get to it naturally (e.g. its time comes up) you notice that it's been cancelled and ignore it. The one place where this is awkward is when it's a FD reader or writer -- it won't come up if the FD doesn't get any new I/O, and it's even possible that the FD is closed. (I don't actually know what epoll(), kqueue() etc. do when one of the FDs is closed, but none of the behaviors I can think of are particularly convenient...) I had thought of giving the DelayedCall a 'cancel callback' that is used if/when it is cancelled, and for readers/writers it could be something that calls remove_reader/writer with the right FD. (Maybe I need multiple cancel-callbacks, in case the same object is used as a callback for multiple queues.) Hm, this gets messy. (Another thing in this area: pyftpdlib's event loop keeps track of how many calls are cancelled, and if a large number are cancelled it reconstructs the heap. The use case is apparently registering lots of callbacks far in the future and then cancelling them all. Not sure how good a use case that is. But I admit that it would be easier if cancelling was a method on the event loop.) PS. Cancelling a future is a different thing. There you still want the callback to be called, you just want it to notice that the operation was cancelled. Same for tasks. -- --Guido van Rossum (python.org/~guido)
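The pyftpdlib trick mentioned above (lazy cancellation plus rebuilding the heap once cancelled entries pile up) is simple to sketch. The class names and the 50-entry threshold here are invented for illustration:

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class DCall:
    """Hypothetical stand-in for DelayedCall: sortable by time, cancellable."""
    when: float
    callback: object = field(compare=False)
    cancelled: bool = field(default=False, compare=False)

class TimerHeap:
    """Sketch of the pyftpdlib trick: cancel() just sets a flag, but once
    cancelled entries dominate the heap it is rebuilt in one pass."""
    def __init__(self):
        self._heap = []
        self._cancelled = 0
    def schedule(self, dcall):
        heapq.heappush(self._heap, dcall)
    def cancel(self, dcall):
        dcall.cancelled = True          # lazy: stays in the heap for now
        self._cancelled += 1
        if self._cancelled > 50 and self._cancelled > len(self._heap) // 2:
            self._heap = [d for d in self._heap if not d.cancelled]
            heapq.heapify(self._heap)
            self._cancelled = 0
    def pop_due(self, now):
        due = []
        while self._heap and self._heap[0].when <= now:
            d = heapq.heappop(self._heap)
            if not d.cancelled:         # cancelled calls skipped when due
                due.append(d)
        return due
```

This keeps cancellation O(1) in the common case while bounding the garbage kept in the heap.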

On 12/17/12 12:33 PM, Ronan Lamy wrote: "Thunk" -- not that it's a more obvious name, but a fun one. http://en.wikipedia.org/wiki/Thunk_(functional_programming) -Sam

On Mon, Dec 17, 2012 at 2:11 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
That's what call_soon_threadsafe() is for. But bugs happen (in either user code or library code). And yes, call_soon_threadsafe() will use a self-pipe on UNIX. (I hope someone else will write the Windows main loop.) -- --Guido van Rossum (python.org/~guido)

On Tue, Dec 18, 2012 at 1:00 AM, Guido van Rossum <guido@python.org> wrote:
I needed a self-pipe on Windows before. See below. With this, the select() based loop might work unmodified on Windows. https://gist.github.com/4325783 Of course it wouldn't be as efficient as an IOCP based loop. Regards, Geert
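The standard way to get a socketpair on Windows is a listening socket on localhost plus a connect/accept; the sketch below is in the spirit of the gist linked above, not a copy of it:

```python
import socket

def socketpair_compat():
    """socket.socketpair() substitute that also works on Windows -- the
    usual trick: listen on localhost, connect, accept. A sketch, not the
    gist itself."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(('127.0.0.1', 0))    # port 0: let the OS pick a free port
    server.listen(1)
    csock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    csock.connect(server.getsockname())
    ssock, _ = server.accept()
    server.close()
    return ssock, csock

# The self-pipe trick: write a byte to wake up a select()-based loop.
ssock, csock = socketpair_compat()
csock.send(b'\0')
print(ssock.recv(1))   # -> b'\x00'
```

With `ssock` registered as a reader in the loop, any thread can call csock.send() to wake the loop up, which is what call_soon_threadsafe() needs.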

On Mon, Dec 17, 2012 at 11:26 PM, Geert Jansen <geertj@gmail.com> wrote:
Thanks! Before I paste this into Tulip, is there any kind of copyright on this?
Of course it wouldn't be as efficient as an IOCP based loop.
The socket loop is definitely handy on Windows in a pinch. I have plans for an IOCP-based loop based on Richard Oudkerk's 'proactor' branch of Tulip v1, but I don't have a Windows machine to test it on ATM (hopefully that'll change once I am actually at Dropbox). -- --Guido van Rossum (python.org/~guido)

On 18/12/2012 4:59pm, Guido van Rossum wrote:
polling.py in the proactor branch already had an implementation of socketpair() for Windows;-) Also note that on Windows a connecting socket needs to be added to wfds *and* xfds when you do ... = select(rfds, wfds, xfds, timeout) If the connection fails then the handle is reported as being exceptional but *not* writable. It might make sense to have add_connector()/remove_connector() which on Unix is just an alias for add_writer()/remove_writer(). This would be useful if tulip ever has a loop based on WSAPoll() for Windows (Vista and later), since WSAPoll() has an awkward bug concerning asynchronous connects. -- Richard

On Tue, Dec 18, 2012 at 11:41 AM, Richard Oudkerk <shibturn@gmail.com> wrote:
polling.py in the proactor branch already had an implementation of socketpair() for Windows;-)
D'oh! And it always uses sockets for the "self-pipe". That makes sense.
But SelectProactor in proactor.py doesn't seem to do this.
Can't we do this for all writers? (If we have to make a distinction, so be it, but it seems easy to have latent bugs if some platforms require you to make a different call but others don't care either way.) -- --Guido van Rossum (python.org/~guido)

On Mon, Dec 17, 2012 at 6:47 PM, Guido van Rossum <guido@python.org> wrote:
Correct - my focus right now is on the event loop only. I intend to have a deeper look at the coroutine scheduler as well later (right now I'm using greenlets for that).
That would work (in 2 secs, then 4, 6, ...). This is the Qt QTimer model. Both libev and libuv have a slightly more general timer that takes a timeout and a repeat value. When the timeout reaches zero, the timer will fire, and if repeat != 0, it will re-seed the timeout to that value. I haven't seen any real need for such a timer where interval != repeat, and in any case it can pretty cheaply be emulated by adding a new timer on the first expiration only. So your call_repeatedly() call above should be fine.
libev uses the generic term "Watcher", libuv uses "Handle". But their APIs are structured a bit differently from tulip so I'm not sure if those names would make sense. They support many different types of events (including more esoteric events like process watches, on-fork handlers, and wall-clock timer events). Each event has its own class that is named after the event type, and that inherits from "Watcher" or "Handle". When an event is created, you pass it a reference to its loop. You manage the event fully through the event instance (e.g. starting it, setting its callback and other parameters, stopping it). The loop has only a few methods, notably "run" and "run_once". So for example, you'd say:

loop = Loop()
timer = Timer(loop)
timer.start(2.0, callback)
loop.run()

The advantages of this approach are that naming is easier, and that you also have a natural place to put methods that update the event after you created it. For example, you might want to temporarily suspend a timer or change its interval. I quite liked the fresh approach taken by tulip so that's why I tried to stay within its design. However, the disadvantage is that modifying events after you've created them is difficult (unless you create one DelayedCall subtype per event, in which case you're probably better off creating those events through their constructor in the first place).
That is a good question. Both libuv and libev have both options. The one that is called before we go to sleep is called a "Prepare" handler, the one after we come back from sleep a "Check" handler. The libev documentation has some words on check and prepare handlers here: http://pod.tst.eu/http://cvs.schmorp.de/libev/ev.pod#code_ev_prepare_code_an... I am not sure both are needed, but I can't oversee all the consequences.
I think doing this would work but I again can't fully oversee all the consequences. Let me play with this a little.
I had a look at libuv and libev. They take two different approaches:
* libev uses a ~60 second timeout by default. The reason is subtle. Libev supports a wall-clock time event that fires when a certain wall-clock time has passed. Having a non-infinite timeout allows it to pick up changes to the system time (e.g. by NTP), which would change when the wall-clock timer needs to run.
* libuv does not have a wall-clock timer and uses an infinite timeout.
In my view it would be best for tulip to use an infinite timeout unless at some point a wall-clock timer is added. That will help with power management. Regarding race conditions, I think they should be solved in other ways (e.g. by having a special method that can post callbacks to the loop in a thread-safe way and possibly write to a self-pipe).
Right now I think that cancelling a DelayedCall is not safe. It could busy-loop if the fd is ready.
A not-so-good use case is libraries like libdbus that don't document their assumptions regarding this. For example, I have to provide an "add watch" function that creates a new watch (a watch is just a generic term for an FD event that can be read, write or read|write). I have observed that it only ever sets one read and one write watch per FD. If we go for one reader/writer per FD, then it's probably fine, but it would be nice if code that does install multiple readers/writers per FD would get an exception rather than silently updating the callback. The requirement could be that you need to remove the event before you can add a new event for the same FD.
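The "raise instead of silently replacing" behavior is easy to express. StrictReaderRegistry is an invented name for illustration, not a tulip class:

```python
class StrictReaderRegistry:
    """Sketch of the proposed semantics: one read callback per FD, and
    adding a second one raises instead of silently replacing the first."""
    def __init__(self):
        self._readers = {}
    def add_reader(self, fd, callback, *args):
        if fd in self._readers:
            raise ValueError('fd %d already has a reader; '
                             'call remove_reader() first' % fd)
        self._readers[fd] = (callback, args)
    def remove_reader(self, fd):
        # True if a reader was removed, False if there was none.
        return self._readers.pop(fd, None) is not None

reg = StrictReaderRegistry()
reg.add_reader(4, print)
try:
    reg.add_reader(4, print)          # second add on the same FD
except ValueError as exc:
    print('rejected:', exc)
assert reg.remove_reader(4) is True
assert reg.remove_reader(4) is False  # nothing left to remove
```

Having remove_reader() return a boolean also matches the "return True if it did remove something" idea discussed later in the thread.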
Not that I can see. At least not for libuv and libev.
According to the libdbus documentation there is a separate function to toggle an event on/off because that can be implemented without allocating memory. But actually there's one kind-of idiomatic use for this that I've seen quite a few times in libraries. Assume you have a library that defines a connection. Often, you create two events for that connection in the constructor: a "write_event" and a "read_event". The read_event is normally enabled, but gets temporarily disabled when you need to throttle input. The write_event is normally disabled except when you get a short write on output. Just enabling/disabling these events is a bit friendlier to the programmer IMHO than having to cancel and recreate them when needed.
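The read_event/write_event idiom looks roughly like this. All names here are hypothetical; tulip's DelayedCall currently has cancel() but no re-enable:

```python
class Event:
    """A toggleable event, as in the libdbus watch API described above
    (a sketch, not a tulip class)."""
    def __init__(self, callback):
        self.callback = callback
        self.enabled = True
    def enable(self):
        self.enabled = True
    def disable(self):
        self.enabled = False

class Connection:
    """Both events are created once in the constructor and then toggled
    for the connection's whole lifetime, never destroyed and recreated."""
    def __init__(self):
        self.read_event = Event(self.on_readable)    # normally enabled
        self.write_event = Event(self.on_writable)   # normally disabled
        self.write_event.disable()
    def on_readable(self):
        pass
    def on_writable(self):
        pass
    def throttle(self):              # input buffer full: stop reading
        self.read_event.disable()
    def write(self, data, sent):     # short write: wait until writable
        if sent < len(data):
            self.write_event.enable()

conn = Connection()
conn.write(b'hello', sent=2)         # only 2 of 5 bytes went out
print(conn.write_event.enabled)      # -> True
```

An enable()/disable() pair on the handle object would let this pattern map directly onto the loop instead of going through cancel-and-recreate.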
OK great. Let me work on this over the next couple of days and hopefully come up with something. Regards, Geert

On Mon, Dec 17, 2012 at 2:57 PM, Geert Jansen <geertj@gmail.com> wrote:
I'm trying to stick to a somewhat minimalistic design here; repeated timers sound fine; extra complexities seem redundant. (What's next -- built-in support for exponential back-off? :-)
I see. That's a fundamentally different API style, and one I'm less familiar with. DelayedCall isn't meant to be that at all -- it's just meant to be this object that (a) is sortable by time (needed for heapq) and (b) can be cancelled (useful functionality in general). I expect that at least one of the reasons for libuv etc. to do it their way is probably that the languages are different -- Python has keyword arguments to pass options, while C/C++ must use something else. Anyway, Handler sounds like a pretty good name. Let me think it over.
Ah, that's where the desire to cancel and restart a callback comes from.
I wonder how often one needs to modify an event after it's been in use for a while. The mutation API seems mostly useful to separate construction from setting various parameters (to avoid insane overloading of the constructor).
I'm still not convinced that both are needed. However they are easy to add, so if the need really does arise in practical use I am fine with evolving the API that way. Until then, let's stick to KISS.
It's hard to oversee all consequences. But it looks good to me too, so I'll implement it this way. Maybe the Twisted folks have wisdom in this area (though quite often, when pressed, they admit that their APIs are not ideal, and have warts due to backward compatibility :-).
I've not actually ever seen a use case for the wall-clock timer, so I've taken it out.
Right, a self-pipe is already there. I'll stick with infinity in Tulip, but an implementation can of course do what it wants to.
That's because I'm not done implementing it. :-) But the more I think about it the more I don't like calling cancel() on a read/write handler.
That makes sense. If we wanted to be fancy we could have several different APIs: add (must not be set), set (may be set), replace (must be set). But I think just offering the add and remove APIs is nicely minimalistic and lets you do everything else with ease. (I'll make the remove API return True if it did remove something, False otherwise.)
Never mind, this is just due to the difference in API style. I'm going to ignore it unless I get a lot more pushback.
Yeah, not gonna happen in Python. :-)
The methods on the Transport class take care of this at a higher level: pause() and resume() to suspend reading, and the write() method takes care of buffering and so on.
Excellent. Please do check back regularly for additions to the tulip subdirectory! -- --Guido van Rossum (python.org/~guido)

On Tue, Dec 18, 2012 at 10:40 AM, Guido van Rossum <guido@python.org> wrote:
Is DelayedCall a subclass of Future, like Task? If so, FutureCall might work.
If someone really does want a wall-clock timer with a given granularity, it can be handled by adding a repeating timer with that granularity (with the obvious consequences for low power modes).
Perhaps the best bet would be to have the standard API allow multiple callbacks, and emulate that on systems which don't natively support multiple callbacks for a single event? Otherwise, I don't see how an event loop could efficiently expose access to the multiple callback APIs without requiring awkward fallbacks in the code interacting with the event loop. Given that the natural fallback implementation is reasonably clear (i.e. a single callback that calls all of the other callbacks), why force reimplementing that on users rather than event loop authors? Related, the protocol/transport API design may end up needing to consider the gather/scatter problem (i.e. fanning out data from a single transport to multiple consumers, as well as feeding data from multiple producers into a single underlying transport). Actual *implementations* of such tools shouldn't be needed in the standard suite, but at least understanding how you would go about writing multiplexers and demultiplexers can be a good test of a stacked I/O design.
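The natural fallback described above (a single registered callback fanning out to all the others) can be sketched like so; `base_add_reader`/`base_remove_reader` are hypothetical names standing in for a loop that only supports one callback per FD:

```python
class MultiReaderLoop:
    """Sketch of the fallback: register ONE dispatcher per FD with the
    underlying single-callback loop and fan out to every registered
    callback. Not tulip code; the base_* callables are assumptions."""
    def __init__(self, base_add_reader, base_remove_reader):
        self._base_add = base_add_reader
        self._base_remove = base_remove_reader
        self._callbacks = {}          # fd -> list of callbacks

    def add_reader(self, fd, callback):
        cbs = self._callbacks.setdefault(fd, [])
        if not cbs:
            # First callback for this fd: install the single dispatcher.
            self._base_add(fd, lambda: [cb() for cb in list(cbs)])
        cbs.append(callback)

    def remove_reader(self, fd, callback):
        cbs = self._callbacks[fd]
        cbs.remove(callback)
        if not cbs:                   # last one gone: drop the dispatcher
            self._base_remove(fd)
            del self._callbacks[fd]

# Simulate the underlying loop with a plain dict of fd -> callback.
installed = {}
loop = MultiReaderLoop(installed.__setitem__, installed.pop)
hits = []
loop.add_reader(3, lambda: hits.append('a'))
loop.add_reader(3, lambda: hits.append('b'))
installed[3]()                        # pretend fd 3 became readable
print(hits)   # -> ['a', 'b']
```

Note this sidesteps the data-consumption problem Guido and Antoine raise below only if the callbacks themselves don't read from the FD, which is part of why the use case is questionable.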
And the main advantage of handling that at a higher level is that suitable buffering designs are going to be transport specific. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Mon, Dec 17, 2012 at 7:20 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On Tue, Dec 18, 2012 at 10:40 AM, Guido van Rossum <guido@python.org> wrote:
[A better name for DelayedCall]
Anyway, Handler sounds like a pretty good name. Let me think it over.
Is DelayedCall a subclass of Future, like Task? If so, FutureCall might work.
No, they're completely unrelated. (I'm even thinking of renaming its cancel() to avoid the confusion.) I still like Handler best. In fact, if I'd thought of Handler before, I wouldn't have asked for a better name. :-) Going once, going twice... [Wall-clock timers]
+1. [Multiple calls per FD]
Hm. AFAIK Twisted doesn't support this either. Antoine, do you know? I didn't see it in the Tornado event loop either.
But what's the use case? I don't think our goal should be to offer APIs for any feature that any event loop might offer. It's not quite a least-common denominator either though -- it's about offering commonly needed functionality, and interoperability. Also, event loop implementations are allowed to offer additional APIs on their implementation. If the need for multiple handlers per FD only exists on those platforms where the platform's event loop supports it, no harm is done if the functionality is only available through a platform-specific API. But still, I don't understand the use case. Possibly it is using file descriptors as a more general signaling mechanism? That sounds pretty platform specific anyway (on Windows, FDs must represent sockets). If someone shows me a real-world use case I may change my mind.
Twisted supports this for writing through its writeSequence(), which appears in Tulip and PEP 3156 as writelines(). (Though IIRC Glyph told me that Twisted rarely uses the platform's scatter/gather primitives, because they are so damn hard to use, and the kernel implementation often just joins the buffers together before passing it to the regular send()...) But regardless, I don't think scatter/gather would use multiple callbacks per FD. I think it would be really hard to benefit from reading into multiple buffers in Python.
And the main advantage of handling that at a higher level is that suitable buffering designs are going to be transport specific.
+1 -- --Guido van Rossum (python.org/~guido)

On Tue, Dec 18, 2012 at 2:01 PM, Guido van Rossum <guido@python.org> wrote:
Sure, but since we know this capability is offered by multiple event loops, it would be good if there was a defined way to go about exposing it.
The most likely use case that comes to mind is monitoring and debugging (i.e. the event loop equivalent of a sys.settrace). Being able to tap into a datastream (e.g. to dump it to a console or pipe it to a monitoring process) can be really powerful, and being able to do it at the Python level means you have this kind of capability even without root access to the machine to run Wireshark. There are other more obscure signal analysis use cases that occur to me, but those could readily be handled with a custom transport implementation that duplicated that data stream, so I don't think there's any reason to worry about those.
Sorry, I wasn't quite clear on what I meant by gather/scatter and it's more a protocol thing than an event loop thing. Specifically, gather/scatter interfaces are most useful for multiplexed transports. The ones I'm particularly familiar with are traditional telephony transports like E1 links, with 15 time-division-multiplexed channels on the wire (and a signalling timeslot), as well as a few different HF comms protocols. When reading from one of those, you have a demultiplexing component which is reading the serial data coming in on the wire and making it look like 15 distinct data channels from the application's point of view. Similarly, the output multiplexer takes 15 streams of data from the application and interleaves them into the single stream on the wire. The rise of packet switching means that sharing connections like that is increasingly less common, though, so gather/scatter devices are correspondingly less useful in a networking context. The only modern use cases I can think of that someone might want to handle with Python are things like sharing a single USB or classic serial connection amongst multiple data streams. However, I suspect the standard transport and protocol API definitions already proposed should also suffice for the gather/scatter use case, as such a component would largely work like any other protocol-as-transport adapter, with the difference being that there would be a many-to-one relationship between the number of interfaces on the application side and those on the communications side. (Technically, gather/scatter components can also be used the other way around to distribute a single data stream across multiple transports, but that use case is even less likely to come up when programming in Python. Multi-channel HF data comms is the only possibility that really comes to mind) -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Mon, Dec 17, 2012 at 11:21 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Only if there's a use case.
I can't see how that would work. Once one callback reads the data the other callback won't see it. There's also the issue of ordering. Solving this seems easier by implementing a facade for the event loop that wraps certain callbacks, and installing it using a custom event loop policy. So, I still don't see the use case.
Right, that seems a better way to go about it.
I'm glad you talked yourself out of that objection. :-) -- --Guido van Rossum (python.org/~guido)

On Mon, 17 Dec 2012 20:01:18 -0800 Guido van Rossum <guido@python.org> wrote:
I think neither Twisted nor Tornado support it. add_reader() / add_writer() APIs are not for the end user, they are a building block for the framework to write higher-level abstractions. (although, Tornado being quite low-level, you can end up having to use add_reader() / add_writer() anyway - e.g. for UDP) It also doesn't seem to me to make a lot of sense to allow multiplexing at the event loop level. It is probably a protocol- or transport- level feature (depending on the protocol and transport, obviously :-)). Nick mentions debugging / monitoring, but I don't understand how you do that with a write callback (or a read callback, actually, since reading from a socket will consume the data and make it unavailable for other readers). You really need to do it at a protocol/transport's write()/data_received() level. Regards Antoine.

On Tue, Dec 18, 2012 at 5:29 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
Yeah, monitoring probably falls into the same gather/scatter design model as demultiplexing (receive side) and multi-channel transports (transmit side). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
participants (9)
- Antoine Pitrou
- Geert Jansen
- Greg Ewing
- Guido van Rossum
- Nick Coghlan
- Richard Oudkerk
- Ronan Lamy
- Sam Rushing
- Terry Reedy