[Twisted-Python] Async-pep (again)

Hey!

So, some of you might remember my async-pep post a while ago. Some people correctly complained there was no code or text. There's some code and quite a bit of text now. In fact, it even has a PEP number (3153)!

So I'm soliciting feedback again. There's an issue tracker that ideally describes what needs to be done: https://github.com/lvh/async-pep/issues

Specifically:

- #21: https://github.com/lvh/async-pep/issues/21 (a basic implementation of an async api compatible protocol)
- #22: https://github.com/lvh/async-pep/issues/22 (a trivial implementation of a backend using just the stdlib)

If anyone has any thoughts on those, please chime in.

cheers and thanks in advance
lvh

On Wed, Jul 13, 2011 at 02:03:03PM +0200, Laurens Van Houtven wrote:
The idea of Protocols implementing Transports is vaguely gestured at as a Useful Thing, but not much detail is given. I think it would be useful for the final PEP to address that topic more rigorously - partially because it's good to have a firm basis on which to model SOCKS and SSH libraries, but mostly because figuring out how SSL should interact with TCP is going to give people headaches.

Twisted, so far as I can see, just sort of punts and says "Yeah, SSL is just another transport like TCP", but then you have to make the SSL transport support all the same options that the TCP transport supports (socket options? IPv6?), and then what if you want to run SSL over a serial port or a SOCKS connection... AAAAAAAAAAAAA! In practice, it might be simpler because "SSL" means "whatever subset of TCP functionality we can coax OpenSSL into providing" rather than a fully stackable protocol-providing-a-transport.

The thing with Consumers and Producers seems... very abstract. If I'm sitting down to retrieve email via POP3 (to pick a random protocol), 'transports' and 'protocols' are tools that nestle very comfortably in my mental model of the task in front of me; "consumers" and "producers" are not. Are they concepts that should be handled by transport implementors? Protocol implementors? Protocol users? Should they be mapped onto XON/XOFF or RTS/CTS by serial transports?

At least in Twisted, transports and protocols do not exist in a vacuum; they have to be hooked up via the reactor. Will this PEP define a (skeletal) API to be implemented by potential reactors, or is that going to be left entirely unspecified, like WSGI?

On Thu, Jul 14, 2011 at 8:48 AM, Tim Allen <screwtape@froup.com> wrote:
Cool. Can I shove those 2 paragraphs into a ticket or will the copyright monster haunt me?
Yes, Consumers and Producers are about flow control, and most Transports probably are producers.
Entirely unspecified, because different implementations have to do pretty different things.
-- cheers lvh

On Thu, Jul 14, 2011 at 10:05:00AM +0200, Laurens Van Houtven wrote:
Go right ahead! I guess most of these things should be tickets, but I don't have a GitHub account and I'm not particularly looking to register on more websites at the moment.
Having looked at the issues list after sending that message, I see this is basically issue 13, "Why are producers/consumers important, how are they different from protocols/transports?" If your PEP includes producers and consumers (and I note that the current example code doesn't, it just has a "FlowControl" class), you'll want to have an example Protocol that uses producers and consumers in some useful, illustrative fashion.
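For what it's worth, here is a minimal sketch of the kind of illustrative example that might help, assuming producers are things that can be paused and resumed and consumers are things that are written to. The class and method names (Producer, BufferingConsumer, pump, drain, high_water) are made up for illustration and are not taken from the PEP or from Twisted.

```python
class Producer:
    """Hypothetical producer: emits chunks and can be paused or resumed."""

    def __init__(self, chunks):
        self._chunks = list(chunks)
        self.paused = False
        self.consumer = None

    def pause(self):
        self.paused = True

    def resume(self):
        self.paused = False
        self.pump()

    def pump(self):
        # Deliver chunks until we run out or the consumer pauses us.
        while self._chunks and not self.paused:
            self.consumer.write(self._chunks.pop(0))


class BufferingConsumer:
    """Hypothetical consumer: pauses its producer at `high_water` bytes."""

    def __init__(self, producer, high_water):
        self.producer = producer
        self.high_water = high_water
        self.buffer = []
        producer.consumer = self

    def write(self, data):
        self.buffer.append(data)
        if sum(len(b) for b in self.buffer) >= self.high_water:
            self.producer.pause()

    def drain(self):
        # Pretend the buffered bytes went out on the wire, then ask for more.
        sent = b"".join(self.buffer)
        self.buffer = []
        self.producer.resume()
        return sent
```

The point of the sketch is only that flow control is a conversation between two parties: the consumer says "stop" when its buffer fills and "go" once it has drained, which is roughly what XON/XOFF or RTS/CTS do on a serial line.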
I guess the selection of available Transports is up to the hosting event-loop, too - it might be worth noting that in the section on "Transports". Unless, of course, the Transport in question is implemented by another Protocol, in which case it's anybody's guess how you might hook your Protocol up.

I almost think that, for pedagogy's sake, there should be an additional Encapsulator or Framer abstract class that inherits from Protocol but adds a .connectProtocol() method that takes another Protocol instance and hooks itself up as that Protocol's transport. Sure, anyone who understands what's going on should be able to figure it out, but I think an extra class would make it blindingly obvious, and I like APIs that save me from having to think too hard.

While there are still people listening to my half-formed opinions:

- Issue 7 seems to have settled on removing support for half-closing transports. I seem to recall somebody mentioning support for half-close as being one of those weird corner cases that nobody thinks they need until they're trying to figure out why their SSH sessions always die with "broken pipe" errors. While it probably would complicate the documentation to include it, I'd hope that many frameworks that implement this PEP would want to include support for half-closing transports, and it'd be nice if there was a standard API for it instead of everybody adding their own methods with their own semantics. Perhaps there could be a HalfClosableTransport(Transport) ABC that's optional in the same way that, say, DB-API's "connection.rollback()" method is defined but optional.

- For issue 6 ("Scatter/gather IO API doesn't make sense yet"), I can't see much of a use for readv/scatter, because I imagine the benefits come from having a bunch of pre-allocated buffers lying around, and even if the Python VM had such buffers, they probably wouldn't be visible or useful to running Python code. On the other hand, I can easily imagine Python code having a bunch of independently-generated buffers that need to be written in a particular order (framing data and framed data, for example), and being able to avoid ''.join() could be a big win. Again, perhaps this could be an optional extension provided by hosting frameworks that want to implement it - although it should be pretty easy to emulate on top of the regular .write() method.

- You might also want to create an optional Transport method to wrap the sendfile(2) and/or splice(2) functions. Without OS support, it'd just be a convenient way to tell the host framework to do the grunt-work of shunting bytes around; with OS support it should be a good deal more efficient than doing the same operations manually.

Tim.
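To make the Encapsulator/Framer idea a bit more concrete, here is a rough sketch, not a proposal for the PEP's actual API. It assumes a Protocol has connection_made() and data_received() and a transport has write(); connect_protocol() stands in for the .connectProtocol() method suggested above, and the 4-byte length prefix is an arbitrary framing chosen for the example. The write_sequence() method also shows the gather-write emulated on top of plain write().

```python
import struct


class Protocol:
    """Minimal stand-in for the PEP's Protocol; method names are assumptions."""

    transport = None

    def connection_made(self, transport):
        self.transport = transport

    def data_received(self, data):
        raise NotImplementedError


class Framer(Protocol):
    """A Protocol that is also the transport of the Protocol stacked on it.

    Frames are a 4-byte big-endian length followed by the payload.
    """

    def __init__(self):
        self._inner = None
        self._pending = b""

    def connect_protocol(self, inner):
        # Hook ourselves up as the inner protocol's transport.
        self._inner = inner
        inner.connection_made(self)

    # Protocol side: raw bytes arriving from the lower transport.
    def data_received(self, data):
        self._pending += data
        while len(self._pending) >= 4:
            (length,) = struct.unpack(">I", self._pending[:4])
            if len(self._pending) < 4 + length:
                break  # wait for the rest of the frame
            payload = self._pending[4:4 + length]
            self._pending = self._pending[4 + length:]
            self._inner.data_received(payload)

    # Transport side: the inner protocol writing one message.
    def write(self, data):
        self.write_sequence([struct.pack(">I", len(data)), data])

    def write_sequence(self, buffers):
        # Gather-write emulated on the lower transport's plain write(),
        # so the header and payload are never joined into one string.
        for buf in buffers:
            self.transport.write(buf)
```

The inner protocol never learns that its "transport" is really another protocol, which is the whole appeal: the same message-level protocol could sit on a Framer over TCP, over a serial port, or over another Framer.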

On Thu, Jul 14, 2011 at 07:05:38PM +1000, Tim Allen wrote:
- You might also want to create an optional Transport method to wrap the sendfile(2) and/or splice(2) functions.
I suggested this not knowing whether Python would ever grow support for sendfile(), since it seemed like the sort of thing that performance-oriented async-io frameworks might want to set up with ctypes or similar. However, I've just discovered that os.sendfile() will be in Python 3.3: http://docs.python.org/dev/library/os.html#os.sendfile Since your PEP has a 3000-series number anyway, os.sendfile() might potentially be quite useful.
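As a rough illustration of what such an optional method might do underneath, here is a sketch that uses os.sendfile() where the platform provides it and falls back to an ordinary read/write loop otherwise. The send_file() helper and its parameters are hypothetical, and it blocks for simplicity; a real transport would drive this from the event loop instead.

```python
import os


def send_file(out_fd, in_file, count, chunk_size=65536):
    """Copy up to `count` bytes from `in_file` to `out_fd`; return bytes sent.

    Hypothetical helper: tries os.sendfile() first and falls back to a
    manual copy if it is missing or rejects this pair of descriptors.
    """
    sent = 0
    offset = in_file.tell()
    if hasattr(os, "sendfile"):
        try:
            in_fd = in_file.fileno()
            while sent < count:
                n = os.sendfile(out_fd, in_fd, offset + sent, count - sent)
                if n == 0:
                    break  # reached end of file
                sent += n
            in_file.seek(offset + sent)
            return sent
        except OSError:
            # For example, an unsupported descriptor type: carry on by hand
            # from wherever sendfile() got to.
            in_file.seek(offset + sent)
    while sent < count:
        chunk = in_file.read(min(chunk_size, count - sent))
        if not chunk:
            break
        view = memoryview(chunk)
        while view:
            # os.write() may accept fewer bytes than offered.
            view = view[os.write(out_fd, view):]
        sent += len(chunk)
    return sent
```

Either way the caller sees the same behaviour, which is why this seems like a reasonable candidate for an optional extension: frameworks without OS support lose nothing but speed.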

On Jul 14, 2011, at 4:05 AM, Laurens Van Houtven wrote:
I thought the idea was to include an asyncore reactor interface? My assumption was that we'd provide an adapter around the Twisted reactor which would provide some basic functionality, like listenTCP, connectTCP, and callLater. -glyph

On Jul 14, 2011, at 2:48 AM, Tim Allen wrote:
Actually, you might be interested in <http://tm.tl/4854>. This will be in 11.1. TLS _is_ a protocol-that-is-a-transport now (in trunk). This was the case in 11.0, too, but only for the IOCP reactor. We've been smoothing out some interesting quirks that occurred as a result, mostly test-related, but it's looking good for the release; more robust, actually, because it's easier to test the stacked version than to try to trick sockets into returning specific values in C.
The APIs definitely aren't as nice, and that's where I predict the most discussion in the PEP.
Are they concepts that should be handled by transport implementors?
Yes, pretty much always.
Protocol implementors?
Yes, if you need them.
Protocol users?
It depends. Ideally you should be able to rely on the protocol providing a reasonable stream-friendly API. (You probably only care about this if you're writing a proxy.)
Should they be mapped onto XON/XOFF or RTS/CTS by serial transports?
Either or. Probably an option to the serial transport.

participants (4)
- exarkun@twistedmatrix.com
- Glyph Lefkowitz
- Laurens Van Houtven
- Tim Allen