Sending the demultiplexed data through 15 pipes, so that the application is actually dealing with 15 streams of data via ordinary single-callback notifications from the event loop, seems like the more KISS approach in this case…
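A minimal sketch of that pipe-based fan-out (names and framing are mine, not from any proposed API): the demultiplexer writes each frame to a per-channel pipe, and the application just registers a normal per-FD read callback on each read end.

```python
import os

NUM_CHANNELS = 15

# One (read_fd, write_fd) pair per logical channel.
pipes = [os.pipe() for _ in range(NUM_CHANNELS)]

def dispatch(channel, frame):
    """Write one demultiplexed frame to its channel's pipe."""
    _, write_fd = pipes[channel]
    os.write(write_fd, frame)

# The demultiplexer pushes each frame to the right pipe; the event loop
# then watches each pipes[i][0] like any other readable FD.
dispatch(3, b"payload")
assert os.read(pipes[3][0], 1024) == b"payload"
```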
Shane Green
www.umbrellacode.com
805-452-9666 | shane@umbrellacode.com
On Dec 17, 2012, at 11:21 PM, python-ideas-request@python.org wrote:
Send Python-ideas mailing list submissions to
python-ideas@python.org
To subscribe or unsubscribe via the World Wide Web, visit
http://mail.python.org/mailman/listinfo/python-ideas
or, via email, send a message with subject or body 'help' to
python-ideas-request@python.org
You can reach the person managing the list at
python-ideas-owner@python.org
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Python-ideas digest..."
Today's Topics:
1. Re: Graph class (Nick Coghlan)
2. Re: async: feedback on EventLoop API (Guido van Rossum)
3. Re: async: feedback on EventLoop API (Nick Coghlan)
From: Nick Coghlan <ncoghlan@gmail.com>
Subject: Re: [Python-ideas] Graph class
Date: December 17, 2012 7:26:38 PM PST
To: Hannu Krosing <hannu@krosing.net>
Cc: Vinay Sajip <vinay_sajip@yahoo.co.uk>, "python-ideas@python.org" <python-ideas@python.org>
On Mon, Dec 17, 2012 at 9:28 AM, Hannu Krosing <hannu@krosing.net> wrote:
On 12/16/2012 04:41 PM, Guido van Rossum wrote:
I think of graphs and trees as patterns, not data structures.
How do you draw line between what is data structure and what is pattern ?
A rough rule of thumb is that if it's harder to remember the configuration options in the API than it is to just write a purpose-specific function, it's probably better as a pattern that can be tweaked for a given use case than it is as an actual data structure.
More generally, ABCs and magic methods are used to express patterns (like iteration), which may be implemented by various data structures.
A graph library that focused on defining a good abstraction (and adapters) that allowed graph algorithms to be written that worked with multiple existing Python graph data structures could be quite interesting.
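To make that concrete, here is a hypothetical sketch of the kind of abstraction being described (all names invented): an ABC that graph algorithms target, plus an adapter for the common dict-of-lists representation.

```python
from abc import ABC, abstractmethod

class GraphABC(ABC):
    @abstractmethod
    def nodes(self):
        """Iterate over all nodes."""

    @abstractmethod
    def neighbors(self, node):
        """Iterate over nodes adjacent to *node*."""

class DictGraph(GraphABC):
    """Adapter for an existing dict-of-lists graph structure."""
    def __init__(self, adjacency):
        self._adj = adjacency

    def nodes(self):
        return iter(self._adj)

    def neighbors(self, node):
        return iter(self._adj.get(node, ()))

def reachable(graph, start):
    """A graph algorithm written only against the ABC, not the storage."""
    seen, stack = {start}, [start]
    while stack:
        for n in graph.neighbors(stack.pop()):
            if n not in seen:
                seen.add(n)
                stack.append(n)
    return seen

g = DictGraph({"a": ["b"], "b": ["c"], "c": [], "d": []})
assert reachable(g, "a") == {"a", "b", "c"}
```

The same `reachable()` would work unchanged with any other adapter that implements the two abstract methods.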
Cheers,
Nick.
--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
From: Guido van Rossum <guido@python.org>
Subject: Re: [Python-ideas] async: feedback on EventLoop API
Date: December 17, 2012 8:01:18 PM PST
To: Nick Coghlan <ncoghlan@gmail.com>, Antoine Pitrou <solipsis@pitrou.net>
Cc: python-ideas@python.org
On Mon, Dec 17, 2012 at 7:20 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On Tue, Dec 18, 2012 at 10:40 AM, Guido van Rossum <guido@python.org> wrote:
[A better name for DelayedCall]
Anyway, Handler sounds like a pretty good name. Let me think it over.
Is DelayedCall a subclass of Future, like Task? If so, FutureCall might
work.
No, they're completely unrelated. (I'm even thinking of renaming its
cancel() to avoid the confusion.)
I still like Handler best. In fact, if I'd thought of Handler before,
I wouldn't have asked for a better name. :-)
Going once, going twice...
[Wall-clock timers]
If someone really does want a wall-clock timer with a given granularity, it
can be handled by adding a repeating timer with that granularity (with the
obvious consequences for low power modes).
+1.
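A rough sketch of that emulation (entirely hypothetical, not any proposed API): a repeating timer driven by the monotonic clock that reports wall-clock time at each tick, catching up after a suspend.

```python
import time

class RepeatingTimer:
    """Wall-clock ticks of a given granularity, built on a monotonic timer."""
    def __init__(self, granularity, callback):
        self.granularity = granularity
        self.callback = callback
        self.next_due = time.monotonic() + granularity

    def poll(self):
        """Called by the event loop; fires callback(wall_time) when due."""
        now = time.monotonic()
        fired = False
        while now >= self.next_due:        # catch up after sleep/suspend
            self.callback(time.time())     # report wall-clock time
            self.next_due += self.granularity
            fired = True
        return fired

ticks = []
t = RepeatingTimer(0.01, ticks.append)
time.sleep(0.02)
t.poll()
assert ticks  # at least one wall-clock tick was delivered
```

The `while` loop is where the low-power consequence shows up: after a long suspend, the timer delivers the backlog of ticks rather than silently skipping them.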
[Multiple calls per FD]
That makes sense. If we wanted to be fancy we could have several
different APIs: add (must not be set), set (may be set), replace (must
be set). But I think just offering the add and remove APIs is nicely
minimalistic and lets you do everything else with ease. (I'll make the
remove API return True if it did remove something, False otherwise.)
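A toy sketch of that minimal add/remove pair (hypothetical class, not the actual PEP 3156 API): add registers (replacing anything there), and remove reports whether it actually removed something.

```python
class ReaderRegistry:
    """Toy stand-in for an event loop's per-FD reader table."""
    def __init__(self):
        self._readers = {}

    def add_reader(self, fd, callback):
        # "set" semantics: silently replaces any existing callback.
        self._readers[fd] = callback

    def remove_reader(self, fd):
        # True if something was removed, False otherwise.
        return self._readers.pop(fd, None) is not None

reg = ReaderRegistry()
reg.add_reader(7, lambda: None)
assert reg.remove_reader(7) is True
assert reg.remove_reader(7) is False
```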
Perhaps the best bet would be to have the standard API allow multiple
callbacks, and emulate that on systems which don't natively support multiple
callbacks for a single event?
Hm. AFAIK Twisted doesn't support this either. Antoine, do you know? I
didn't see it in the Tornado event loop either.
Otherwise, I don't see how an event loop could efficiently expose access to
the multiple callback APIs without requiring awkward fallbacks in the code
interacting with the event loop. Given that the natural fallback
implementation is reasonably clear (i.e. a single callback that calls all of
the other callbacks), why force reimplementing that on users rather than
event loop authors?
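The fallback Nick describes is small enough to sketch directly (names hypothetical): the event loop registers one callable per FD, and that callable dispatches to however many callbacks were attached.

```python
class MultiCallback:
    """Single registered callback that fans out to multiple callbacks."""
    def __init__(self):
        self._callbacks = []

    def add(self, cb):
        self._callbacks.append(cb)

    def remove(self, cb):
        self._callbacks.remove(cb)

    def __call__(self, *args):
        # Copy so callbacks can add/remove themselves during dispatch.
        for cb in list(self._callbacks):
            cb(*args)

results = []
mux = MultiCallback()
mux.add(lambda fd: results.append(("a", fd)))
mux.add(lambda fd: results.append(("b", fd)))
mux(5)  # what the event loop would invoke when FD 5 is ready
assert results == [("a", 5), ("b", 5)]
```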
But what's the use case?
I don't think our goal should be to offer APIs for any feature that
any event loop might offer. It's not quite a least-common denominator
either though -- it's about offering commonly needed functionality,
and interoperability.
Also, event loop implementations are allowed to offer additional APIs
on their implementation. If the need for multiple handlers per FD only
exists on those platforms where the platform's event loop supports it,
no harm is done if the functionality is only available through a
platform-specific API.
But still, I don't understand the use case. Possibly it is using file
descriptors as a more general signaling mechanism? That sounds pretty
platform specific anyway (on Windows, FDs must represent sockets).
If someone shows me a real-world use case I may change my mind.
Related, the protocol/transport API design may end up needing to consider
the gather/scatter problem (i.e. fanning out data from a single transport to
multiple consumers, as well as feeding data from multiple producers into a
single underlying transport). Actual *implementations* of such tools
shouldn't be needed in the standard suite, but at least understanding how
you would go about writing multiplexers and demultiplexers can be a good
test of a stacked I/O design.
Twisted supports this for writing through its writeSequence(), which
appears in Tulip and PEP 3156 as writelines(). (Though IIRC Glyph told
me that Twisted rarely uses the platform's scatter/gather primitives,
because they are so damn hard to use, and the kernel implementation
often just joins the buffers together before passing it to the regular
send()...)
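In effect, the kernel behaviour Guido describes amounts to a fallback like this (a sketch of the idea, not Twisted's or Tulip's actual code): join the buffers and make a single regular write.

```python
def writelines_fallback(write, buffers):
    """Emulate writeSequence()/writelines() with one plain write()."""
    data = b"".join(buffers)
    if data:
        write(data)
    return len(data)

sent = []
n = writelines_fallback(sent.append, [b"GET ", b"/ ", b"HTTP/1.1\r\n"])
assert sent == [b"GET / HTTP/1.1\r\n"] and n == 16
```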
But regardless, I don't think scatter/gather would use multiple
callbacks per FD.
I think it would be really hard to benefit from reading into multiple
buffers in Python.
Just enabling/disabling these events is a bit more friendly to the
programmer IMHO than having to cancel and recreate them when needed.
The methods on the Transport class take care of this at a higher
level: pause() and resume() to suspend reading, and the write() method
takes care of buffering and so on.
And the main advantage of handling that at a higher level is that suitable
buffering designs are going to be transport specific.
+1
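A hypothetical sketch of that higher-level handling (invented names, not the PEP 3156 Transport API): pause()/resume() just flip a flag the loop checks before delivering reads, and write() buffers whatever the low-level send cannot take.

```python
class SketchTransport:
    def __init__(self, send):
        self._send = send          # low-level non-blocking send
        self._buffer = b""
        self.paused = False

    def pause(self):
        # A real loop would stop polling the FD for reads while paused.
        self.paused = True

    def resume(self):
        self.paused = False

    def write(self, data):
        # Buffer everything; a real transport flushes as the FD allows.
        self._buffer += data
        self._flush()

    def _flush(self):
        if self._buffer:
            sent = self._send(self._buffer)
            self._buffer = self._buffer[sent:]

wire = []
def fake_send(data):
    wire.append(data)
    return len(data)

t = SketchTransport(fake_send)
t.write(b"hello ")
t.write(b"world")
assert b"".join(wire) == b"hello world"
```

The point of the abstraction is exactly what Nick says: the buffering policy lives in the transport, so each transport can pick one that suits its medium.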
--
--Guido van Rossum (python.org/~guido)
From: Nick Coghlan <ncoghlan@gmail.com>
Subject: Re: [Python-ideas] async: feedback on EventLoop API
Date: December 17, 2012 11:21:37 PM PST
To: Guido van Rossum <guido@python.org>
Cc: Antoine Pitrou <solipsis@pitrou.net>, python-ideas@python.org
On Tue, Dec 18, 2012 at 2:01 PM, Guido van Rossum <guido@python.org> wrote:
On Mon, Dec 17, 2012 at 7:20 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Also, event loop implementations are allowed to offer additional APIs
on their implementation. If the need for multiple handlers per FD only
exists on those platforms where the platform's event loop supports it,
no harm is done if the functionality is only available through a
platform-specific API.
Sure, but since we know this capability is offered by multiple event loops, it would be good if there was a defined way to go about exposing it.
But still, I don't understand the use case. Possibly it is using file
descriptors as a more general signaling mechanism? That sounds pretty
platform specific anyway (on Windows, FDs must represent sockets).
If someone shows me a real-world use case I may change my mind.
The most likely use case that comes to mind is monitoring and debugging (i.e. the event loop equivalent of a sys.settrace). Being able to tap into a datastream (e.g. to dump it to a console or pipe it to a monitoring process) can be really powerful, and being able to do it at the Python level means you have this kind of capability even without root access to the machine to run Wireshark.
There are other more obscure signal analysis use cases that occur to me, but those could readily be handled with a custom transport implementation that duplicated that data stream, so I don't think there's any reason to worry about those.
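That monitoring tap is easy to express as a protocol wrapper (a sketch with invented names, loosely following the data_received() convention): it copies each chunk to an observer while forwarding it unchanged.

```python
class TapProtocol:
    """Forward data to the real protocol while copying it to an observer."""
    def __init__(self, wrapped, observer):
        self._wrapped = wrapped      # the real protocol
        self._observer = observer    # e.g. writes to a console or pipe

    def data_received(self, data):
        self._observer(data)                 # the tap sees everything...
        self._wrapped.data_received(data)    # ...and the app is undisturbed

seen, received = [], []

class App:
    def data_received(self, data):
        received.append(data)

tap = TapProtocol(App(), seen.append)
tap.data_received(b"GET / HTTP/1.1\r\n")
assert seen == received == [b"GET / HTTP/1.1\r\n"]
```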
Related, the protocol/transport API design may end up needing to consider
the gather/scatter problem (i.e. fanning out data from a single transport to
multiple consumers, as well as feeding data from multiple producers into a
single underlying transport). Actual *implementations* of such tools
shouldn't be needed in the standard suite, but at least understanding how
you would go about writing multiplexers and demultiplexers can be a good
test of a stacked I/O design.
Twisted supports this for writing through its writeSequence(), which
appears in Tulip and PEP 3156 as writelines(). (Though IIRC Glyph told
me that Twisted rarely uses the platform's scatter/gather primitives,
because they are so damn hard to use, and the kernel implementation
often just joins the buffers together before passing it to the regular
send()...)
But regardless, I don't think scatter/gather would use multiple
callbacks per FD.
I think it would be really hard to benefit from reading into multiple
buffers in Python.
Sorry, I wasn't quite clear on what I meant by gather/scatter, and it's more of a protocol thing than an event loop thing.
Specifically, gather/scatter interfaces are most useful for multiplexed transports. The ones I'm particularly familiar with are traditional telephony transports like E1 links, with 15 time-division-multiplexed channels on the wire (and a signalling timeslot), as well as a few different HF comms protocols. When reading from one of those, you have a demultiplexing component which is reading the serial data coming in on the wire and making it look like 15 distinct data channels from the application's point of view. Similarly, the output multiplexer takes 15 streams of data from the application and interleaves them into the single stream on the wire.
The rise of packet switching means that sharing connections like that is increasingly less common, though, so gather/scatter devices are correspondingly less useful in a networking context. The only modern use cases I can think of that someone might want to handle with Python are things like sharing a single USB or classic serial connection amongst multiple data streams. However, I suspect the standard transport and protocol API definitions already proposed should also suffice for the gather/scatter use case, as such a component would largely work like any other protocol-as-transport adapter, with the difference being that there would be a many-to-one relationship between the number of interfaces on the application side and those on the communications side.
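A toy version of that many-to-one adapter (channel count and the fixed round-robin timeslot framing are assumptions for illustration, not how E1 framing actually looks on the wire in detail): one wire-side protocol that splits interleaved timeslots into per-channel streams.

```python
NUM_CHANNELS = 15

class Demultiplexer:
    """Wire-side protocol: splits interleaved timeslots into channels."""
    def __init__(self, channel_protocols):
        assert len(channel_protocols) == NUM_CHANNELS
        self._channels = channel_protocols
        self._slot = 0

    def data_received(self, data):
        # One byte per timeslot, channels interleaved round-robin.
        for byte in data:
            self._channels[self._slot].data_received(bytes([byte]))
            self._slot = (self._slot + 1) % NUM_CHANNELS

class Collector:
    def __init__(self):
        self.data = b""
    def data_received(self, chunk):
        self.data += chunk

channels = [Collector() for _ in range(NUM_CHANNELS)]
demux = Demultiplexer(channels)
demux.data_received(bytes(range(30)))   # two full frames of 15 timeslots
assert channels[0].data == bytes([0, 15])
assert channels[14].data == bytes([14, 29])
```

The application side sees 15 ordinary protocol instances; only the adapter knows there is a single stream underneath, which is the many-to-one relationship described above.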
(Technically, gather/scatter components can also be used the other way around, to distribute a single data stream across multiple transports, but that use case is even less likely to come up when programming in Python. Multi-channel HF data comms is the only possibility that really comes to mind.)
--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
http://mail.python.org/mailman/listinfo/python-ideas