The async API of the future: PEP 3153 (async-pep)
[Hopefully this is the last spin-off thread from "asyncore: included batteries don't fit"]
[LvH]
If there's one takeaway idea from async-pep, it's reusable protocols.
[Guido]
Is there a newer version than what's on http://www.python.org/dev/peps/pep-3153/ ? It seems to be missing any specific proposals, after spending a lot of time giving a rationale and defining some terms. The version on https://github.com/lvh/async-pep doesn't seem to be any more complete.
[LvH]
Correct.
So it's totally unfinished?
If I had to change it today, I'd throw out consumers and producers and just stick to a protocol API.
Do you feel that there should be less talk about rationale?
No, but I feel that there should be some actual specification. I am also looking forward to an actual meaty bit of example code -- ISTR you mentioned you had something, but that it was incomplete, and I can't find the link.
The PEP should probably be a number of PEPs. At first sight, it seems that this number is at least four:
1. Protocol and transport abstractions, making no mention of asynchronous IO (this is what I want 3153 to be, because it's small, manageable, and virtually everyone appears to agree it's a fantastic idea)
But the devil is in the details. *What* specifically are you proposing? How would you write a protocol handler/parser without any reference to I/O? Most protocols are two-way streets -- you read some stuff, and you write some stuff, then you read some more. (HTTP may be the exception here, if you don't keep the connection open.)
It's not that there's *no* reference to IO: it's just that that reference is abstracted away in data_received and the protocol's transport object, just like Twisted's IProtocol.
The words "data_received" don't even occur in the PEP.
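As an illustration of the point about data_received and the transport object, here is a minimal sketch of a protocol that never touches a socket. Only the name data_received comes from the discussion; connection_made, connection_lost, MemoryTransport and EchoProtocol are hypothetical names chosen for this example (loosely modelled on Twisted's IProtocol), not anything PEP 3153 specifies.

```python
class MemoryTransport:
    """Hypothetical transport: records writes instead of doing real I/O."""

    def __init__(self):
        self.written = []
        self.closed = False

    def write(self, data):
        self.written.append(data)

    def close(self):
        self.closed = True


class EchoProtocol:
    """Upper-cases whatever arrives and writes it back via the transport."""

    def connection_made(self, transport):
        # The protocol only ever sees the transport, never a socket or fd.
        self.transport = transport

    def data_received(self, data):
        self.transport.write(data.upper())

    def connection_lost(self, reason):
        self.transport = None


transport = MemoryTransport()
protocol = EchoProtocol()
protocol.connection_made(transport)
protocol.data_received(b"hello")
print(transport.written)
```

Because the protocol talks only to the transport's write method, the same class could in principle be driven by a socket-backed transport, a TLS transport, or a test double like the one above.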
2. A base reactor interface
I agree that this should be a separate PEP. But I do think that in practice there will be dependencies between the different PEPs you are proposing.
Absolutely.
3. A way of structuring callbacks: probably deferreds with a built-in inlineCallbacks for people who want to write synchronous-looking code with explicit yields for asynchronous procedures
Your previous two ideas sound like you're not tied to backward compatibility with Tornado and/or Twisted (not even via an adaptation layer). Given that we're talking Python 3.4 here that's fine with me (though I think we should be careful to offer a path forward for those packages and their users, even if it means making changes to the libraries).
I'm assuming that by previous ideas you mean points 1, 2: protocol interface + reactor interface.
Yes.
I don't see why twisted's IProtocol couldn't grow an adapter for stdlib Protocols. Ditto for Tornado. Similarly, the reactor interface could be *provided* (through a fairly simple translation layer) by different implementations, including twisted.
Right.
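A rough sketch of what such an adapter might look like, under stated assumptions: the camelCase method names (makeConnection, connectionMade, dataReceived, connectionLost) are the ones Twisted's IProtocol uses, while the snake_case "stdlib-style" protocol is purely hypothetical, since no stdlib protocol interface had been specified at this point. Twisted itself is not imported, so this only shows the shape of the translation.

```python
class StdlibStyleProtocol:
    """Stand-in for a hypothetical stdlib protocol with snake_case callbacks."""

    def __init__(self):
        self.events = []

    def connection_made(self, transport):
        self.events.append(("made", transport))

    def data_received(self, data):
        self.events.append(("data", data))

    def connection_lost(self, reason):
        self.events.append(("lost", reason))


class TwistedStyleAdapter:
    """Exposes Twisted-style IProtocol method names on top of the object above."""

    def __init__(self, wrapped):
        self._wrapped = wrapped
        self.transport = None

    def makeConnection(self, transport):
        self.transport = transport
        self.connectionMade()

    def connectionMade(self):
        self._wrapped.connection_made(self.transport)

    def dataReceived(self, data):
        self._wrapped.data_received(data)

    def connectionLost(self, reason):
        self._wrapped.connection_lost(reason)


inner = StdlibStyleProtocol()
adapter = TwistedStyleAdapter(inner)
adapter.makeConnection("fake-transport")  # placeholder object for the demo
adapter.dataReceived(b"ping")
adapter.connectionLost("done")
print(inner.events)
```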
But Twisted Deferred is pretty arcane, and I would much rather not use it as the basis of a forward-looking design. I'd much rather see what we can mooch off PEP 3148 (Futures).
I think this needs to be addressed in a separate mail, since more stuff has been said about deferreds in this thread.
Yes, that's in the thread with subject "The async API of the future: Twisted and Deferreds".
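To make the "synchronous-looking code with explicit yields" idea concrete on top of PEP 3148 futures rather than Deferreds, here is a small sketch. The decorator name inline_callbacks is hypothetical (borrowed in spirit from Twisted's inlineCallbacks); only concurrent.futures.Future and its add_done_callback, set_result and set_exception methods are real stdlib API. It assumes the generator yields Future objects only and ignores cancellation.

```python
import functools
from concurrent.futures import Future


def inline_callbacks(gen_func):
    """Drive a generator that yields Futures; return a Future for its result."""

    @functools.wraps(gen_func)
    def wrapper(*args, **kwargs):
        outcome = Future()
        gen = gen_func(*args, **kwargs)

        def step(value=None, error=None):
            try:
                if error is not None:
                    yielded = gen.throw(error)
                else:
                    yielded = gen.send(value)
            except StopIteration as stop:
                outcome.set_result(stop.value)  # generator returned normally
                return
            except Exception as exc:
                outcome.set_exception(exc)      # generator raised
                return
            # Resume the generator once the yielded future completes.
            yielded.add_done_callback(resume)

        def resume(done_future):
            error = done_future.exception()
            if error is not None:
                step(error=error)
            else:
                step(done_future.result())

        step()
        return outcome

    return wrapper


pending = Future()


@inline_callbacks
def add_one():
    value = yield pending  # reads like blocking code, but does not block
    return value + 1


outcome = add_one()
print(outcome.done())      # the generator is parked at the yield
pending.set_result(41)     # completing the future resumes the generator
print(outcome.result())
```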
4+. Adapting the stdlib tools to use these new things
We at least need to have an idea for how this could be done. We're talking serious rewrites of many of our most fundamental existing synchronous protocol libraries (e.g. httplib, email, possibly even io.TextIOWrapper), most of which have had only scant updates even through the Python 3 transition apart from complications to deal with the bytes/str dichotomy.
I certainly agree that this is a very large amount of work. However, it has obvious huge advantages in terms of code reuse. I'm not sure if I understand the technical barrier though. It should be quite easy to create a blocking API with a protocol implementation that doesn't care; just call data_received with all your data at once, and presto! (Since transports in general don't provide guarantees as to how bytes will arrive, existing Twisted IProtocols have to do this already anyway, and that seems to work fine.)
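The claim that a blocking API can sit on top of a protocol that "doesn't care" can be sketched as follows. LineCollector and run_blocking are hypothetical names for this example; the only assumption taken from the thread is that the protocol is fed through data_received and must cope with bytes arriving in arbitrary chunks, which is why it buffers internally.

```python
import io


class LineCollector:
    """Hypothetical protocol: buffers bytes and collects complete lines."""

    def __init__(self):
        self._buffer = b""
        self.lines = []

    def data_received(self, data):
        self._buffer += data
        # Everything before the last newline is complete; the tail stays buffered.
        *complete, self._buffer = self._buffer.split(b"\n")
        self.lines.extend(complete)


def run_blocking(protocol, stream, chunk_size=None):
    """Drive a protocol from a blocking file-like object."""
    if chunk_size is None:
        protocol.data_received(stream.read())  # all the data at once
        return
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        protocol.data_received(chunk)


payload = b"GET /\nHost: example\npartial"

all_at_once = LineCollector()
run_blocking(all_at_once, io.BytesIO(payload))

chunked = LineCollector()
run_blocking(chunked, io.BytesIO(payload), chunk_size=3)

print(all_at_once.lines == chunked.lines)
```

Both runs see the same two complete lines, and the unterminated tail stays in the buffer either way. This covers the push direction only; it does not address pull-style reading APIs.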
Hmm... I guess that depends on how your legacy code works. As Barry mentioned somewhere, the email package's feedparser() is an attempt at implementing this -- but he sounded like he has doubts that it works as-is in an async environment. However I am more worried about pull-based APIs. Take (as an extreme example) the standard stream API for reading, especially TextIOWrapper. I could see how we could turn the *writing* APIs async easily enough, but I don't see how to do it for the reading end -- you can't seriously propose to read the entire file into the buffer and then satisfy all reads from memory.
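For reference, the push-style interface in the email package looks roughly like this. email.feedparser.FeedParser, with its feed() and close() methods, is real stdlib API; the chunk boundaries and message content are made up for the example. This only demonstrates incremental feeding, and says nothing about whether the parser behaves well inside an event loop, which is the doubt raised above.

```python
from email.feedparser import FeedParser

parser = FeedParser()

# Chunks deliberately split in the middle of a header line and of the body,
# the way data might arrive from a transport.
for chunk in ("Subject: he", "llo\nFrom: a@example.com\n", "\nbody ", "text\n"):
    parser.feed(chunk)

message = parser.close()  # signals end of input and returns the parsed message
print(message["Subject"])
print(repr(message.get_payload()))
```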
Re: forward path for existing asyncore code. I don't remember this being raised as an issue. If anything, it was mentioned in passing, and I think the answer to it was something to the tune of "asyncore's API is broken, fixing it is more important than backwards compat". Essentially I agree with Guido that the important part is an upgrade path to a good third-party library, which is the part about asyncore that REALLY sucks right now.
I have the feeling that the main reason asyncore sucks is that it requires you to subclass its Dispatcher class, which has a rather treacherous interface.
There are at least a few others, but sure, that's an obvious one. Many of the objections I can raise however don't matter if there's already an *existing working solution*. I mean, sure, it can't do SSL, but if you have code that does what you want right now, then obviously SSL isn't actually needed.
I think you mean this as an indication that providing the forward path for existing asyncore apps shouldn't be rocket science, right? Sure, I don't want to worry about that, I just want to make sure that we don't *completely* paint ourselves into the wrong corner when it comes to that.
Regardless, an API upgrade is probably a good idea. I'm not sure if it should go in the first PEP: given the separation I've outlined above (which may be too spread out...), there's no obvious place to put it besides it being a new PEP.
Aren't all your proposals API upgrades?
Sorry, that was incredibly poor wording. I meant something more of an adapter: an upgrade path for existing asyncore code to new and shiny 3153 code.
Yes, now it makes sense.
Re base reactor interface: drawing maximally from the lessons learned in twisted, I think IReactorCore (start, stop, etc), IReactorTime (call later, etc), asynchronous-looking name lookup, fd handling are the important parts.
That actually sounds more concrete than I'd like a reactor interface to be. In the App Engine world, there is a definite need for a reactor, but it cannot talk about file descriptors at all -- all I/O is defined in terms of RPC operations which have their own (several layers of) async management but still need to be plugged in to user code that might want to benefit from other reactor functionality such as scheduling and placing a call at a certain moment in the future.
I have a hard time understanding how that would work well outside of something like GAE. IIUC, that level of abstraction was chosen because it made sense for GAE (and I don't disagree), but I'm not sure it makes sense here.
I think I answered this in the reactors thread -- I propose an I/O object abstraction that is not directly tied to a file descriptor, but for which a concrete implementation can be made to support file descriptors, and another to support App Engine RPC.
In this example, where would eg the select/epoll/whatever calls happen? Is it something that calls the reactor that then in turn calls whatever?
App Engine doesn't have select/epoll/whatever, so it would have a reactor implementation that doesn't use them. But the standard Unix reactor would support file descriptors using select/etc. Please respond in the reactors thread.
call_every can be implemented in terms of call_later on a separate object, so I think it should be (eg twisted.internet.task.LoopingCall). One thing that is apparently forgotten about is event loop integration. The prime way of having two event loops cooperate is *NOT* "run both in parallel", it's "have one call the other". Even though not all loops support this, I think it's important to get this as part of the interface (raise an exception for all I care if it doesn't work).
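The first point, that call_every need not be a reactor primitive, can be sketched like this. ManualReactor is a hypothetical reactor with a hand-advanced clock so the example is deterministic; LoopingCall here is a simplified stand-in inspired by twisted.internet.task.LoopingCall, not a copy of its API.

```python
import heapq
import itertools


class ManualReactor:
    """Hypothetical reactor with a manually advanced clock (no real I/O)."""

    def __init__(self):
        self.now = 0.0
        self._queue = []
        self._counter = itertools.count()  # tie-breaker for equal deadlines

    def call_later(self, delay, func, *args):
        entry = (self.now + delay, next(self._counter), func, args)
        heapq.heappush(self._queue, entry)

    def advance(self, seconds):
        """Move the clock forward, running every call that comes due."""
        deadline = self.now + seconds
        while self._queue and self._queue[0][0] <= deadline:
            when, _, func, args = heapq.heappop(self._queue)
            self.now = when
            func(*args)
        self.now = deadline


class LoopingCall:
    """Repeats func every `interval` seconds using only call_later."""

    def __init__(self, reactor, interval, func):
        self.reactor = reactor
        self.interval = interval
        self.func = func
        self.running = False

    def start(self):
        self.running = True
        self.reactor.call_later(self.interval, self._tick)

    def stop(self):
        self.running = False

    def _tick(self):
        if not self.running:
            return
        self.func()
        if self.running:  # func may have called stop()
            self.reactor.call_later(self.interval, self._tick)


reactor = ManualReactor()
ticks = []
loop = LoopingCall(reactor, 1.0, lambda: ticks.append(reactor.now))
loop.start()
reactor.advance(3.5)
loop.stop()
reactor.advance(5.0)  # nothing more fires after stop()
print(ticks)
```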
This is definitely one of the things we ought to get right. My own thoughts are slightly (perhaps only cosmetically) different again: ideally each event loop would have a primitive operation to tell it to run for a little while, and then some other code could tie several event loops together.
As an API, that's pretty close to Twisted's IReactorCore.iterate, I think. It'd work well enough. The issue is only with event loops that don't cooperate so well.
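A toy version of "each loop can be told to run for a little while, and other code ties them together" might look like the following. TinyLoop, call_soon, has_work and run_together are hypothetical names for this sketch; the iterate method is only loosely analogous to Twisted's IReactorCore.iterate and takes no delay argument here.

```python
from collections import deque


class TinyLoop:
    """Hypothetical event loop exposing a single 'run a little' primitive."""

    def __init__(self):
        self._ready = deque()

    def call_soon(self, func, *args):
        self._ready.append((func, args))

    def has_work(self):
        return bool(self._ready)

    def iterate(self):
        """Run only the callbacks that were ready when iterate() was called."""
        for _ in range(len(self._ready)):
            func, args = self._ready.popleft()
            func(*args)


def run_together(loops):
    """Outer driver: give each loop a turn until none has work left."""
    while any(loop.has_work() for loop in loops):
        for loop in loops:
            loop.iterate()


log = []
a = TinyLoop()
b = TinyLoop()


def ping(n):
    log.append(("a", n))
    if n < 2:
        b.call_soon(pong, n)  # hand work over to the other loop


def pong(n):
    log.append(("b", n))
    a.call_soon(ping, n + 1)


a.call_soon(ping, 0)
run_together([a, b])
print(log)
```

A loop that can only be run via a blocking "run forever" call cannot take part in a scheme like this, which is the non-cooperating case mentioned above.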
Again, a topic for the reactor thread. But I'm really hoping you'll make good on your promise of redoing async-pep, giving some actual specifications and example code, so I can play with it.
-- --Guido van Rossum (python.org/~guido)
On Sat, Oct 13, 2012 at 1:22 AM, Guido van Rossum
[Hopefully this is the last spin-off thread from "asyncore: included batteries don't fit"]
So it's totally unfinished?
At the time, the people I talked to placed significantly more weight in "explain why this is necessary" than "get me something I can play with".
Do you feel that there should be less talk about rationale?
No, but I feel that there should be some actual specification. I am also looking forward to an actual meaty bit of example code -- ISTR you mentioned you had something, but that it was incomplete, and I can't find the link.
Just examples of how it would work, nothing hooked up to real code. My memory of it is more of a drowning-in-politics-and-bikeshedding kind of thing, unfortunately :) Either way, I'm okay with letting bygones be bygones and focus on how we can get this show on the road.
It's not that there's *no* reference to IO: it's just that that reference is abstracted away in data_received and the protocol's transport object, just like Twisted's IProtocol.
The words "data_received" don't even occur in the PEP.
See above. What thread should I reply in about the pull APIs?
I just want to make sure that we don't *completely* paint ourselves into the wrong corner when it comes to that.
I don't think we have to worry about it too much. Any reasonable API I can think of makes this completely doable.
But I'm really hoping you'll make good on your promise of redoing async-pep, giving some actual specifications and example code, so I can play with it.
Takeaways:
- The async API of the future is very important, and too important to be left to chance.
- It requires a lot of very experienced manpower.
- It requires a lot of effort to handle the hashing out of it (as we're doing here) as well as it deserves to be.
I'll take as proactive a role as I can afford to take in this process, but I don't think I can do it by myself. Furthermore, it's a risk nobody wants to take: a repeat performance wouldn't be good for anyone, in particular not for Python nor myself.
I've asked JP Calderone and Itamar Turner-Trauring if they would be interested in carrying this forward professionally, and they have tentatively said yes. JP's already familiar with a large part of the problem space with the implementation of the ssl module. JP and Itamar have worked together for years and have recently set up a consulting firm.
Given that this is emphatically important to Python, I intend to apply for a PSF grant on their behalf to further this goal. Given their experience in the field, I expect this to be a fairly low risk endeavor.
-- --Guido van Rossum (python.org/~guido)
-- cheers lvh
On Sat, Oct 13, 2012 at 10:54 AM, Laurens Van Houtven <_@lvh.cc> wrote:
On Sat, Oct 13, 2012 at 1:22 AM, Guido van Rossum
wrote: [Hopefully this is the last spin-off thread from "asyncore: included batteries don't fit"]
So it's totally unfinished?
At the time, the people I talked to placed significantly more weight in "explain why this is necessary" than "get me something I can play with".
Odd. Were those people experienced in writing / reviewing PEPs?
Do you feel that there should be less talk about rationale?
No, but I feel that there should be some actual specification. I am also looking forward to an actual meaty bit of example code -- ISTR you mentioned you had something, but that it was incomplete, and I can't find the link.
Just examples of how it would work, nothing hooked up to real code. My memory of it is more of a drowning-in-politics-and-bikeshedding kind of thing, unfortunately :) Either way, I'm okay with letting bygones be bygones and focus on how we can get this show on the road.
Shall I just reject PEP 3153 so it doesn't distract people? Of course we can still refer to it when people ask for a rationale for the separation between transports and protocols, but it doesn't seem the PEP itself is going to be finished (correct me if I'm wrong), and as it stands it is not useful as a software specification.
It's not that there's *no* reference to IO: it's just that that reference is abstracted away in data_received and the protocol's transport object, just like Twisted's IProtocol.
The words "data_received" don't even occur in the PEP.
See above.
What thread should I reply in about the pull APIs?
Probably the yield-from thread; or the Twisted/Deferred thread.
I just want to make sure that we don't *completely* paint ourselves into the wrong corner when it comes to that.
I don't think we have to worry about it too much. Any reasonable API I can think of makes this completely doable.
Agreed that we needn't constantly worry about it. It should be enough to have some kind of reality check closer to PEP accept time.
But I'm really hoping you'll make good on your promise of redoing async-pep, giving some actual specifications and example code, so I can play with it.
Takeaways:
- The async API of the future is very important, and too important to be left to chance.
That's why we're discussing it here.
- It requires a lot of very experienced manpower.
It also requires (a certain level of) *agreement* between people with different preferences, since it's no good if the community fragments or the standard solution gets ignored by Twisted and Tornado, for example. Ideally those packages (that is, their Python 3.4 versions) would build on and extend the standard API, and for "boring" stuff (like possibly the event loop) they would just use the standard solution.
- It requires a lot of effort to handle the hashing out of it (as we're doing here) as well as it deserves to be.
Right.
I'll take as proactive a role as I can afford to take in this process, but I don't think I can do it by myself.
I hope I didn't come across as asking you that! I am just hoping that you can give some concrete, working example code showing how to do protocols and transports.
Furthermore, it's a risk nobody wants to take: a repeat performance wouldn't be good for anyone, in particular not for Python nor myself.
A repeat of what? Of the failure of PEP 3153? Don't worry about that. This time around I'm here, and since then I have got a lot of experience implementing and using a solid async library (albeit of a quite different nature than the typical socket-based stuff that most people do).
I've asked JP Calderone and Itamar Turner-Trauring if they would be interested in carrying this forward professionally, and they have tentatively said yes. JP's already familiar with a large part of the problem space with the implementation of the ssl module. JP and Itamar have worked together for years and have recently set up a consulting firm.
Insight into the right way to support SSL would be huge; it is an excellent example of a popular transport that does *not* behave like sockets, even though its abstract conceptual model is similar (a setup phase, followed by two bidirectional byte streams).
Given that this is emphatically important to Python, I intend to apply for a PSF grant on their behalf to further this goal. Given their experience in the field, I expect this to be a fairly low risk endeavor.
Famous last words. :-)
-- --Guido van Rossum (python.org/~guido)
On Sun, Oct 14, 2012 at 4:39 AM, Guido van Rossum
Odd. Were those people experienced in writing / reviewing PEPs?
There were a few. Some of them were. Unfortunately the prevalent reason was politics: "make it clear that you're not just trying to get twisted in the stdlib". Given that that's been suggested both on and off-list, both now and then, I guess that wasn't entirely unreasonable (but not providing things to play with was -- the experience was just so bad I pretty much never got there).
Do you feel that there should be less talk about rationale?
No, but I feel that there should be some actual specification. I am also looking forward to an actual meaty bit of example code -- ISTR you mentioned you had something, but that it was incomplete, and I can't find the link.
Just examples of how it would work, nothing hooked up to real code. My memory of it is more of a drowning-in-politics-and-bikeshedding kind of thing, unfortunately :) Either way, I'm okay with letting bygones be bygones and focus on how we can get this show on the road.
Shall I just reject PEP 3153 so it doesn't distract people? Of course we can still refer to it when people ask for a rationale for the separation between transports and protocols, but it doesn't seem the PEP itself is going to be finished (correct me if I'm wrong), and as it stands it is not useful as a software specification.
I'm not sure that's necessary; these threads show a lot of willpower to get it done (even though that's not enough), and it's pretty easy to edit. You're certainly right that right now it's not a useful software spec; but neither would an empty new PEP be ;) --Guido van Rossum (python.org/~guido)
cheers lvh
On Sat, Oct 13, 2012 at 1:54 PM, Laurens Van Houtven <_@lvh.cc> wrote:
On Sat, Oct 13, 2012 at 1:22 AM, Guido van Rossum
wrote: [Hopefully this is the last spin-off thread from "asyncore: included batteries don't fit"]
So it's totally unfinished?
At the time, the people I talked to placed significantly more weight in "explain why this is necessary" than "get me something I can play with".
Do you feel that there should be less talk about rationale?
No, but I feel that there should be some actual specification. I am also looking forward to an actual meaty bit of example code -- ISTR you mentioned you had something, but that it was incomplete, and I can't find the link.
Just examples of how it would work, nothing hooked up to real code. My memory of it is more of a drowning-in-politics-and-bikeshedding kind of thing, unfortunately :) Either way, I'm okay with letting bygones be bygones and focus on how we can get this show on the road.
It's not that there's *no* reference to IO: it's just that that reference is abstracted away in data_received and the protocol's transport object, just like Twisted's IProtocol.
The words "data_received" don't even occur in the PEP.
See above.
What thread should I reply in about the pull APIs?
I just want to make sure that we don't *completely* paint ourselves into the wrong corner when it comes to that.
I don't think we have to worry about it too much. Any reasonable API I can think of makes this completely doable.
But I'm really hoping you'll make good on your promise of redoing async-pep, giving some actual specifications and example code, so I can play with it.
Takeaways:
- The async API of the future is very important, and too important to be left to chance.
Could not agree more.
- It requires a lot of very experienced manpower.
I'm sitting on the sidelines, wishing I had much of either, because of point number 1.
- It requires a lot of effort to handle the hashing out of it (as we're doing here) as well as it deserves to be.
I'll take as proactive a role as I can afford to take in this process, but I don't think I can do it by myself. Furthermore, it's a risk nobody wants to take: a repeat performance wouldn't be good for anyone, in particular not for Python nor myself.
I've asked JP Calderone and Itamar Turner-Trauring if they would be interested in carrying this forward professionally, and they have tentatively said yes. JP's already familiar with a large part of the problem space with the implementation of the ssl module. JP and Itamar have worked together for years and have recently set up a consulting firm.
Given that this is emphatically important to Python, I intend to apply for a PSF grant on their behalf to further this goal. Given their experience in the field, I expect this to be a fairly low risk endeavor.
I like this idea. There are some problems spare time isn't enough to solve. I can't think of many people as qualified for the task.
-- --Guido van Rossum (python.org/~guido)
-- cheers lvh
-- Read my blog! I depend on your acceptance of my opinion! I am interesting! http://techblog.ironfroggy.com/ Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy
participants (3)
- Calvin Spealman
- Guido van Rossum
- Laurens Van Houtven