Re: [Python-ideas] The async API of the future: yield-from
On Fri, Oct 19, 2012 at 10:44 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
If I wrote a library intended for serious use, the end user probably wouldn't write either of those. Instead he would write something like
yield from block(self.queue)
and it would be an implementation detail of the library whereabouts the 'yield' happened and whether it needed to send a value or not.
What's the benefit of having both "yield" and "yield from" as opposed to just "yield"? It seems like an attractive nuisance if "yield" works but doesn't let the function have implementation details and wait for more than one thing or somesuch.
With the existing generator-coroutine decorators (monocle, inlineCallbacks), there is no such trap. "yield foo()" will work no matter how many things foo() will wait for.
My understanding is that the only benefit we get here is nicer tracebacks. I hope there's more.
-- Devin
Devin Jeanpierre wrote:
If I wrote a library intended for serious use, the end user probably wouldn't write either of those. Instead he would write something like
yield from block(self.queue)
What's the benefit of having both "yield" and "yield from" as opposed to just "yield"? It seems like an attractive nuisance if "yield" works but doesn't let the function have implementation details and wait for more than one thing or somesuch.
The documentation would say to use "yield from", and if someone misreads that and just says "yield" instead, it's their own fault. I don't think it's worth the rather large increase in the complexity of the scheduler implementation that would be required to make "yield foo()" do the same thing as "yield from foo()" in all circumstances, just to rescue people who make this kind of mistake.
It's unfortunate that "yield" and "yield from" look so similar. This is one way that cofunctions would help, by making calls to subtasks look very different from yields.
My understanding is that the only benefit we get here is nicer tracebacks. I hope there's more.
You also get a much simpler and much more efficient scheduler implementation.
-- Greg
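To illustrate the simplicity claim, here is a minimal sketch of a round-robin scheduler for the "yield from" style. It is not Greg's actual scheduler; the names `run`, `pause`, `inner`, and `outer` are hypothetical and chosen only for this example. The point it tries to show is that the scheduler only ever holds the outermost generator, and "yield from" carries a bare `yield` up through any number of layers without a Future per layer.

```python
from collections import deque


def run(tasks):
    """Round-robin over generator-based tasks until all have finished."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            # Resume the task; it runs until the next bare yield,
            # no matter how deeply nested that yield is.
            next(task)
        except StopIteration:
            continue  # task finished, drop it
        ready.append(task)


def pause():
    """Hypothetical lowest-level primitive: give up control once."""
    yield


def inner(log, name):
    yield from pause()
    log.append(name + ":inner")
    return 42  # becomes the value of the enclosing 'yield from'


def outer(log, name):
    value = yield from inner(log, name)
    log.append(f"{name}:got {value}")


log = []
run([outer(log, "a"), outer(log, "b")])
```

Adding another layer between `outer` and `pause` needs no change to `run`, which is the sense in which extra logic layers are cheap in this style.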
On Fri, Oct 19, 2012 at 10:27 PM, Devin Jeanpierre <jeanpierreda@gmail.com> wrote:
On Fri, Oct 19, 2012 at 10:44 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
If I wrote a library intended for serious use, the end user probably wouldn't write either of those. Instead he would write something like
yield from block(self.queue)
and it would be an implementation detail of the library whereabouts the 'yield' happened and whether it needed to send a value or not.
What's the benefit of having both "yield" and "yield from" as opposed to just "yield"? It seems like an attractive nuisance if "yield" works but doesn't let the function have implementation details and wait for more than one thing or somesuch.
With the existing generator-coroutine decorators (monocle, inlineCallbacks), there is no such trap. "yield foo()" will work no matter how many things foo() will wait for.
My understanding is that the only benefit we get here is nicer tracebacks. I hope there's more.
It is also *much* faster. In the "yield <future>" style (what I use in NDB) every level that blocks involves the creation of a Future and a bunch of code that sets its result. The scheduler has to do a lot of work to make it work. In Greg's "yield from <generator>" style most of those futures disappear, so adding extra layers of logic is much cheaper. (And believe me, in a real system, like NDB is, you have to add a lot of extra logic layers to make your API easy to use.) As a result Greg's scheduler is much simpler. (In the last week I wrote one to test this hypothesis, so I know.)
I do have one concern, but it can easily be addressed. Users have the tendency to make mistakes. In NDB, a common mistake is leaving out the yield keyword. Fortunately when you do that, nothing works, so you typically find out quickly. The other mistake is even easier to find: writing yield where you shouldn't. The NDB scheduler rejects values that aren't Futures, so this is diagnosed precisely and with a decent stack trace.
In the PEP 380 world, there will be a new type of mistake possible: writing yield instead of yield from. Fortunately the scheduler can easily test for this -- if the result of its calling next() is not None, the user yielded something. In Greg's strict design, you should never yield a value from a coroutine, so that's always an error. Even in a design where values yielded are used as scheduler instructions (albeit only by the lowest levels of the I/O wrappers), we can assume that a value yielded should never be a generator -- so the scheduler can throw back an exception if it receives a generator, and it can even hint to the user "did you mean yield from instead of yield?". The exception thrown in will show exactly the point where the from keyword is missing. (Making the diagnosis of cases like this more robust actually argues for adopting Greg's strict stance.)
--Guido van Rossum (python.org/~guido)
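The check described above could look roughly like the following. This is a sketch under assumptions, not code from NDB or from either scheduler mentioned in the thread: `run`, `sub`, `good`, `buggy`, and `leaky` are hypothetical names, and the `strict` flag is an invented way to switch between the two policies (reject generators only, or reject any non-None value).

```python
import types
from collections import deque


def run(tasks, strict=False):
    """Round-robin scheduler that diagnoses 'yield' used instead of 'yield from'."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            value = next(task)
            if isinstance(value, types.GeneratorType):
                value.close()  # the stray generator never started; discard it
                # Raise inside the task, at the offending yield, so the
                # traceback points at the line where 'from' is missing.
                task.throw(TypeError(
                    "task yielded a generator; "
                    "did you mean 'yield from' instead of 'yield'?"))
            elif strict and value is not None:
                task.throw(TypeError(
                    f"task yielded {value!r}; coroutines must not yield values"))
        except StopIteration:
            continue  # task finished
        ready.append(task)


def sub():
    yield  # suspend once


def good():
    yield from sub()


def buggy():
    yield sub()  # bug: should be 'yield from sub()'


def leaky():
    yield 5  # only an error under the strict policy
```

Calling `run([buggy()])` raises the `TypeError` from inside `buggy`, while `run([leaky()])` passes unless `strict=True` is given.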
Guido van Rossum wrote:
In the PEP 380 world, there will be a new type of mistake possible: writing yield instead of yield from. Fortunately the scheduler can easily test for this -- if the result of its calling next() is not None, the user yielded something.
That will catch some mistakes of that kind, but not all -- it won't catch 'yield foo()' where foo() returns None. One way to fix that would be to require yielding some unique sentinel value. If the yields are all hidden inside primitives called with 'yield from', that could be kept an implementation detail.
-- Greg
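A sentinel-based check along these lines might be sketched as follows. The names `_SUSPEND`, `suspend`, `run`, `returns_none`, `good`, and `buggy` are hypothetical; the thread does not specify an implementation. The idea is that only the library's own primitive ever yields the sentinel, so any other yielded value, including None, is diagnosed.

```python
from collections import deque

# Hypothetical private sentinel, shared only by the scheduler and its primitives.
_SUSPEND = object()


def suspend():
    """Library primitive: the only place that is allowed to yield."""
    yield _SUSPEND


def run(tasks):
    """Round-robin scheduler that accepts nothing but the sentinel."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            value = next(task)
            if value is not _SUSPEND:
                # Raise at the offending yield inside the task.
                task.throw(TypeError(
                    f"task yielded {value!r}; "
                    "did you mean 'yield from' instead of 'yield'?"))
        except StopIteration:
            continue  # task finished
        ready.append(task)


def returns_none():
    return None


def good():
    yield from suspend()


def buggy():
    # Yields None, which an "is not None" check would let through.
    yield returns_none()
```

User code never needs to see `_SUSPEND`, since it reaches the scheduler only through `yield from suspend()`.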
participants (3)
- Devin Jeanpierre
- Greg Ewing
- Guido van Rossum