On Wed, Oct 31, 2012 at 3:36 PM, Steve Dower Steve.Dower@microsoft.com wrote:
Guido van Rossum wrote: There is only one reason to use 'yield from' and that is for the performance optimisation, which I do acknowledge and did observe in my own benchmarks.
Actually, it is not just optimization. The logic of the scheduler also becomes much simpler.
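A minimal sketch of the difference, not taken from tulip or wattle. The names run_with_yield_from, run_with_yield, leaf, parent_yf and parent_y are made up for illustration, and propagating exceptions from child to parent is left out. With plain yield the driver has to keep its own stack of generators; with yield from the interpreter handles delegation and the driver only ever sees the outermost generator.

```python
def run_with_yield_from(task):
    """Drive a 'yield from' coroutine: only the leaf yields reach the driver."""
    try:
        while True:
            next(task)              # each bare yield is just a pause point
    except StopIteration as stop:
        return stop.value


def run_with_yield(task):
    """Trampoline for plain 'yield': the driver keeps the call stack itself."""
    stack = [task]
    value = None
    while stack:
        try:
            yielded = stack[-1].send(value)
        except StopIteration as stop:
            stack.pop()             # callee finished; hand its result to the caller
            value = stop.value
            continue
        if hasattr(yielded, "send"):
            stack.append(yielded)   # a sub-generator was yielded: start running it
        value = None                # a fresh or resumed generator is sent None
    return value


def leaf():
    yield                           # pretend to wait for something
    return 21


def parent_yf():
    x = yield from leaf()           # delegation handled by the interpreter
    return x * 2


def parent_y():
    x = yield leaf()                # delegation handled by the trampoline
    return x * 2
```

Both run_with_yield_from(parent_yf()) and run_with_yield(parent_y()) return the same result; the second driver simply needs more machinery to get there.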
I know I've been vague about our intended application (deliberately so, to try and keep the discussion neutral), but I'll lay out some details.
Actually I wish you'd written this sooner. I don't know about you, but my brain has a hard time understanding abstractions that are presented without concrete use cases and implementations alongside; OTOH I delight in taking a concrete mess and extracting abstractions from it. (The Twisted guys are also masters at this.)
So far I didn't really "get" the reasons you brought up for some of the complications you introduced (like multiple Future implementations). Now I think I'm glimpsing your reasons.
We're working on adding support for Windows 8 apps (formerly known as Metro) written in Python. These will use the new API (WinRT) which is highly asynchronous - even operations such as opening a file are only* available as an asynchronous function. The intention is to never block on the UI thread.
Interesting. The lack of synchronous wrappers does seem a step back, but is probably useful as a forcing function given the desire to keep the UI responsive at all times.
(* Some synchronous Win32 APIs are still available from C++, but these are actively discouraged and restricted in many ways. Most of Win32 is not usable.)
The model used for these async APIs is future-based: every *Async() function returns a future for a task that is already running. The caller is not allowed to wait on this future - the only option is to attach a callback. C# and VB use their async/await keywords (good 8 min intro video on those: http://www.visualstudiolaunch.com/vs2012vle/Theater?sid=1778) while JavaScript and C++ have multi-line lambda support.
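To make the callback-only model concrete, here is a rough sketch that uses the stdlib concurrent.futures.Future as a stand-in for the WinRT future type. The function read_text_async is hypothetical, and the "operation" completes immediately for the sake of the example. Unlike the WinRT futures described above, the stdlib Future does allow blocking on result(); the sketch deliberately only attaches a callback.

```python
from concurrent.futures import Future


def read_text_async(text):
    """Hypothetical *Async()-style call: returns a future for work already started."""
    fut = Future()
    fut.set_result(text.upper())    # stand-in: the operation finishes right away
    return fut


results = []
fut = read_text_async("hello")
# The caller never waits; it only registers what should happen on completion.
# Inside the callback the future is already done, so result() does not block.
fut.add_done_callback(lambda f: results.append(f.result()))
```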
Erik Meijer introduced me to async/await on Elba two months ago. I was very excited to recognize exactly what I'd done for NDB with @tasklet and yield, supported by the type checking.
For Python, we are aiming for closer to the async/await model (which is also how we chose the names).
If we weren't so reluctant to introduce new keywords in Python we might introduce await as an alias for yield from in the future.
Incidentally, our early designs used yield from exclusively. It was only when we started discovering edge-cases where things broke, as well as the impact on code 'cleanliness', that we switched to yield.
Very interesting. I'd love to see a much longer narrative on this. (You can send it to me directly if you feel it would distract the list or if you feel it's inappropriate to share widely. I'll keep it under my hat as long as you say so.)
There are three aspects of this that work better and result in cleaner code with wattle than with tulip:
- event handlers can be "async-void", such that when the event is raised by the OS/GUI/device/whatever the handler can use asynchronous tasks without blocking the main thread.
I think this is "fire-and-forget"? I.e. you initiate an action and then just let it run until completion without ever checking the result? In tulip you currently do that by wrapping it in a Task and calling its start() method. (BTW I think I'm going to get rid of start() -- creating a Task should just start it.)
In this case, the caller receives a future but ignores it because it does not care about the final result. (We could achieve this under 'yield from' by requiring a decorator, which would then probably prevent other Python code from calling the handler directly. There is very limited opportunity for us to reliably intercept this case.)
Are you saying that this property (you don't wait for the result) is required by the operation rather than an option for the user? I'm only familiar with the latter -- e.g. I can imagine firing off an operation that writes a log entry somewhere but not caring about whether it succeeded -- but I would still make it *possible* to check on the operation if the caller cares (what if it's a very important log message?).
If there's no option for the caller, the API should present itself as a regular function/method and the task-spawning part should be hidden inside it -- I see no need for the caller to know about this.
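Roughly what I have in mind, as a sketch only. The tiny scheduler here (spawn, run_pending, the _ready queue) is a made-up stand-in and not tulip's API; log_message and _write_log are hypothetical too. The point is that log_message is an ordinary function and the task-spawning stays hidden inside it.

```python
from collections import deque

_ready = deque()    # hypothetical stand-in for the event loop's ready queue
written = []        # stand-in for wherever the log entries end up


def spawn(gen):
    """Queue a generator-based task and return it in case someone cares."""
    _ready.append(gen)
    return gen


def run_pending():
    """Step every queued task round-robin until all of them have finished."""
    while _ready:
        task = _ready.popleft()
        try:
            next(task)
        except StopIteration:
            continue                # task finished; do not requeue it
        _ready.append(task)


def _write_log(text):
    yield                           # pretend to wait for the I/O to complete
    written.append(text)


def log_message(text):
    """Regular function: the caller never sees a task or a future."""
    spawn(_write_log(text))
```

Calling log_message("hello") returns immediately with nothing written; the entry shows up once the loop gets to run, e.g. after run_pending().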
What exactly do you mean by "reliably intercept this case"? A concrete example would help.
- the event loop is implemented by the OS. Our Scheduler implementation does not need to provide an event loop, since we can submit() calls to the OS-level loop. This pattern also allows wattle to 'sit on top of' any other event loop, probably including Twisted and 0MQ, though I have not tried it (except with Tcl).
Ok, so what is the API offered by the OS event loop? I really want to make sure that tulip can interface with strange event loops, and this may be the most concrete example so far -- and it may be an important one.
- Future objects can be marshalled directly from Python into Windows, completing the interop story.
What do you mean by marshalled here? Surely not the stdlib marshal module. Do you just mean that Future objects can be recognized by the foreign-function interface and wrapped by / copied into native Windows 8 datatypes?
I understand your event loop understands Futures? All of them? Or only the ones of the specific type that it also returns?
Even with tulip, we would probably still require a decorator for this case so that we can marshal regular generators as iterables (for which there is a specific type).
I can't quite follow you here, probably due to lack of imagination on my part. Can you help me with a (somewhat) concrete example?
Without a decorator, we would probably have to ban both cases to prevent subtly misbehaving programs.
Concrete example?
At least with wattle, the user does not have to do anything different from any of their other @async functions.
This is because you can put type checks inside @async, which sees the function object before it's called, rather than the scheduler, which only sees what it returned, right? That's a trick I use in NDB as well and I think tulip will end up requiring a decorator too -- but it will just "mark" the function rather than wrap it in another one, unless the function is not a generator (in which case it will probably have to wrap it in something that is a generator). I could imagine a debug version of the decorator that added wrappers in all cases though.
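A sketch of the "mark, don't wrap" idea, under my own assumptions: the decorator name mark_task, the _is_task attribute and the drive helper are all invented for illustration and are not what tulip or NDB actually use. Generator functions are only tagged; a plain function gets wrapped in a trivial generator so the scheduler can treat both the same way.

```python
import functools
import inspect


def mark_task(func):
    """Tag generator functions in place; wrap plain functions in a generator."""
    if inspect.isgeneratorfunction(func):
        func._is_task = True        # mark only: no extra call layer is added
        return func

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
        yield                       # unreachable; makes wrapper a generator function

    wrapper._is_task = True
    return wrapper


def drive(gen):
    """Run a generator to completion and return its result."""
    try:
        while True:
            next(gen)
    except StopIteration as stop:
        return stop.value


@mark_task
def gen_style():
    yield
    return 1


@mark_task
def plain_style():
    return 2
```

A debug variant could wrap in every case and add type checks on what gets yielded, at the cost of an extra layer per call.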
Despite this intended application, I have tried to approach this design task independently to produce an API that will work for many cases, especially given the narrow focus on sockets. If people decide to get hung up on "the Microsoft way" or similar rubbish then I will feel vindicated for not mentioning it earlier :-) - it has not had any more influence on wattle than any of my other past experience has.
No worries about that. I agree that we need concrete examples that take us beyond the world of sockets; it's just that sockets are where most of the interest lies (Tornado is a webserver, Twisted is often admired because of its implementations of many internet protocols, people benchmark async frameworks on how many HTTP requests per second they can serve) and I haven't worked with any type of GUI framework in a very long time. (Kudos for trying your way with Tk!)