Re: [Python-Dev] PEP 380 (yield from a subgenerator) comments

At 10:39 PM 3/26/2009 -0500, Guido van Rossum wrote:
Could we at least have some syntax like 'return from yield with 43', to distinguish it from a regular return, clarify that it's returning a value to a yield-from statement, and emphasize that you need a yield-from to call it?

If it doesn't have some sort of variant syntax, the error message for the return exception is going to need to be rather verbose in order to be clear. However, if there is a variant syntax, then an error message like "'return from yield' without 'yield from'" might be clear enough, and we can keep the current error for returning values in generators.

That way, the paired special syntax is clearly identifiable as coroutine/microthread control flow, in a way that's both TOOOWTDI and EIBTI.

One remaining quirk or missing piece: ISTM there needs to be a way to extract the return value without using a yield-from statement. I mean, you could write a utility function like:

    def unyield(geniter):
        try:
            while 1:
                geniter.next()
        except GeneratorReturn as v:
            return v.value

OTOH, I suppose this function is still a trampoline, just one that doesn't actually do anything except return an eventual exit value. I suppose you could do a slightly improved one thus:

    def unyield(geniter, value=None, func=lambda v: v):
        try:
            while 1:
                value = func(geniter.send(value))
        except GeneratorReturn as v:
            return v.value

And drop it into itertools or some such. It's sort of like an all-purpose map/reduce for generators, so that all you need to do is pass in a function to do whatever processing you need (e.g. I/O waiting) on the values yielded.

You could also use another generator's send() method as the function passed in, in which case you'd basically have a pair of coroutines... and whichever returned a value first would end up as the return value of the overall function. That'd probably be pretty useful for the sort of simple (non I/O) coroutines Greg seems to have in mind.

Or, these could just be examples in the PEP, I suppose. They're not terribly difficult to write... but then I might be biased since I've written a ridiculous number of coroutine trampolines for Python generators over the last how-many-ever years Python has had generators.
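For illustration, here is a minimal sketch of the improved unyield() in Python 3 syntax. It assumes the return value arrives on StopIteration.value, which is how generator return values are delivered in Python 3.3 and later, rather than on the GeneratorReturn exception discussed in this message. The countdown() generator is a hypothetical example, not something from the thread.

```python
def unyield(geniter, value=None, func=lambda v: v):
    """Drive geniter to completion and return its final return value.

    Assumption: the return value is delivered on StopIteration.value,
    not via the GeneratorReturn exception proposed in the thread.
    """
    try:
        while True:
            # Pass each yielded value through func and send the result back in.
            value = func(geniter.send(value))
    except StopIteration as stop:
        return stop.value


def countdown(n):
    # Hypothetical example generator: yields n..1, then returns a summary.
    while n > 0:
        yield n
        n -= 1
    return "done"


result = unyield(countdown(3))
```

The first send() passes None, which is what a freshly created generator requires, so the default arguments work for a generator that has not been started yet.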

P.J. Eby wrote:
You don't, though -- yield-from just happens to be a particularly convenient way. I suppose what you really mean is that you can't just use an ordinary call. But generators already have that property, whether they return values or not, and they're already syntactically marked as such by containing 'yield'. I don't see that we need a second syntactic marker.
> If it doesn't have some sort of variant syntax, the error message for the return exception is going to need to be rather verbose
If we're going to treat this as an error at all, I imagine it would say something like "Return value from generator not used." RTM to sort out the details.
> One remaining quirk or missing piece: ISTM there needs to be a way to extract the return value without using a yield-from statement.
I did suggest a for-loop variant for doing this, but Guido warned me not to complicate the PEP any further, so I haven't. A followup PEP for it might be in order.

-- 
Greg

On Fri, Mar 27, 2009 at 1:06 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Because for vanilla generators a return is always a mistake.
That would be fine with me.
I don't think this is a requirement, though I expect there will be a way, since the error will result in some kind of exception, which means you'll be able to catch it by explicitly invoking __next__().

(BTW, at the language summit we reached the conclusion that proposed language changes need to be aimed at Py3k first (3.1 in this case) and backported to 2.x -- 2.7 in this case. You might want to tweak the proposal to apply to 3.x by default.)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

P.J. Eby wrote:
My first thought was to ask why it was not equivalent to say:

    x = yield g
    x = yield from g

This would seem like a more obvious lack of parallelism to pick on wrt. return values.

This unyield() operation seems contrived. Never before have you been able to write a generator that returns a value; why would these suddenly become common practice? The only place a return value seems useful is when refactoring a generator and you need to make up for the loss of a shared scope. What other use is there for a return value?

It would seem unfortunate for it to be considered a runtime error, since this would prevent sharing a generator amongst "yield from" and non-"yield from" use cases. Although, it would be trivial to do:

    class K:
        ...
        def _f(self):
            yield 1
            return 2  # used internally

        def f(self):
            # squelch the runtime error
            yield from self._f()

As Greg has said a number of times, we allow functions to return values with them silently being ignored all the time.

-- 
Scott Dial
scott@scottdial.com
scodial@cs.indiana.edu
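As a runnable sketch of the wrapper idea, the following shows how the class behaves in Python 3.3 and later. Note the assumption: in those versions a value returned from a generator is not a runtime error, so iterating over _f() directly also works and simply drops the value. The variable names are illustrative only.

```python
class K:
    def _f(self):
        yield 1
        return 2  # used internally

    def f(self):
        # yield from receives the return value of _f(); because the result
        # is not assigned to anything, it is silently discarded.
        yield from self._f()


wrapped = list(K().f())    # values seen through the wrapper
direct = list(K()._f())    # values seen by plain iteration
```

In both cases only the yielded value 1 reaches the consumer; the returned 2 is never visible to a plain for-loop or list() call.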

At 03:28 AM 3/27/2009 -0400, Scott Dial wrote:
Because yield-from means you're "inlining" the generator, such that sends go into that generator, rather than into the current generator.
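A small sketch may make the difference concrete. It assumes Python 3.3 or later semantics, and the names inner(), plain_yield(), delegating() and drive() are hypothetical helpers introduced only for this illustration.

```python
def inner():
    received = yield "inner ready"
    return "inner got %r" % (received,)


def plain_yield():
    # The generator object itself is yielded; send() resumes *this* frame.
    x = yield inner()
    return x


def delegating():
    # inner() runs in place; send() is forwarded to it.
    x = yield from inner()
    return x


def drive(gen, to_send):
    # Start the generator, send one value, and report what came out.
    # Assumes the generator finishes right after that single send().
    first = next(gen)
    try:
        gen.send(to_send)
    except StopIteration as stop:
        return first, stop.value


plain_first, plain_result = drive(plain_yield(), 42)
deleg_first, deleg_result = drive(delegating(), 42)
```

With the plain yield, the caller receives an unstarted generator object and the sent value lands in plain_yield() itself. With yield from, the caller receives what inner() yielded and the sent value lands inside inner().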
The use case which these things are being proposed for is to replace most of the stack-management code that's currently needed for coroutine trampolines. In such a case, you're likely using generators to perform long-running asynchronous operations, or else coroutines where two functions are co-operating to produce a result, each with its own control flow.

For example, you might have a generator that yields socket objects to wait for them to be ready to read or write, then returns a line of text read from the socket. You would unyield this if you wanted to write top-level code that was *not* also such a task. Similarly, you might write coroutines where one reads data from a file and sends it to a parser, and then the parser sends data back to a main program.

In either case, an unyield would either be the synchronous top-level loop of the program, or part of the top-level code. Either you need to get the finished top-level object from your parser at the end of its operation, or you are waiting for all your asynchronous I/O tasks to complete.
Has anyone shown a use case for doing so? I might be biased due to previous experience with these things, but I don't see how you write a function where both the yielded values *and* the return value are useful... and if you did, you'd still need some sort of unyield operation. Notice that in both the I/O and coroutine use cases, the point of yielding is primarily *to allow other code to execute*, and possibly pass a value back IN to the generator. The values passed *out* by the generator are usually either ignored, an indicator of what the generator wants to be passed back in, or what sort of event it is waiting for before it's to be resumed. In other words, they're usually not data -- they're just something that gets looped over as the task progresses.
> As Greg has said a number of times, we allow functions to return values with them silently being ignored all the time.
Sure. But right now, the return value of a generator function *is the generator*. And you're free to ignore that, sure. But this is a "second" return value that only goes to a special place with special syntax -- without that syntax, you can't access it.

But in the use cases where you'd actually want to make such a function return a value to begin with, it's because that value is the value you *really* want from the function -- the only reason it's a generator is because it needs to be paused and resumed along the way to getting that return value.

If you're writing a function that yields values for other than control flow reasons, it's probably a bad idea for it to also have a "return" value.... because then you'd need an unyield operation to get at the data.

And it seems to me that people are saying, "but that's no problem, I'll just use yield-from to get the value". But that doesn't *work*, because it turns the function where you use it into another generator! The generators have to *stop* somewhere, in order for you to *use* their return values -- which makes the return feature ONLY relevant to co-routine use cases -- i.e., places where you have trampolines or a top-level loop to handle the yields...

And conversely, if you *have* such a generator, its real return value is the special return value, so you're not going to be able to use it outside the coroutine structure... so "ignoring its return value" doesn't make any sense. You'd have to write a loop over the generator, *just to ignore the value*... which once again is why you'd want an unyield operator of some kind.

That's why special return values should be special: you have to handle them differently in order to receive that return value... and it's monumentally confusing to look at a function with a normal 'return' that never actually "returns" that value.

A lot of the emails that have been written about this are failing to understand the effects of the control-flow proposed by the PEP. IMO, this should be taken as evidence that using a plain "return" statement is in fact confusing, *even to Python-Dev participants who have read the PEP*. We would be much better off with something like "yield return X" or "return from yield with X", as it would highlight this otherwise-obscure and "magical" difference in control flow.
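The point that yield-from turns its caller into another generator can be sketched as follows, assuming Python 3.3 or later semantics where the return value travels on StopIteration.value. The read_line(), caller() and run() names are hypothetical, and the "wait-for-socket" string merely stands in for whatever a real task would yield.

```python
def read_line():
    # Hypothetical task: yields a wait marker, then returns its real result.
    yield "wait-for-socket"
    return "hello\n"


def caller():
    # Using yield from makes caller() a generator as well, so calling it
    # produces a generator object rather than the line of text.
    line = yield from read_line()
    return line.strip()


def run(task):
    # Minimal top-level loop: ignore the yielded markers, keep the result.
    while True:
        try:
            next(task)
        except StopIteration as stop:
            return stop.value


markers = list(caller())   # plain iteration sees only the markers
final = run(caller())      # a top-level loop is needed to get the value
```

The chain of generators has to end at some non-generator code such as run(), which is the role an unyield operation or a trampoline would play.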

P.J. Eby wrote:
How about if 'yield from' returns the generator object, and the return value is accessed with an attribute?

    g = yield from gen
    x = g.__value__

Or

    x = (yield from gen).__value__

Another possibility is to be able to break from a 'yield from' at some point and then continue it to get any final values.

    # yield values of sub generator
    g = yield from gen
    # get remaining unused value of sub generator
    x = g.next()
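The attribute idea can be approximated in user code. This is only a sketch under assumptions: it relies on Python 3.3 or later, where yield from evaluates to the subgenerator's return value, and the ValueCapture class and its `value` attribute are hypothetical names, not the `__value__` attribute suggested above.

```python
class ValueCapture:
    """Hypothetical wrapper: iterate a generator and keep its return value."""

    def __init__(self, gen):
        self.gen = gen
        self.value = None

    def __iter__(self):
        # yield from evaluates to the subgenerator's return value once the
        # subgenerator is exhausted; store it on the wrapper.
        self.value = yield from self.gen


def numbers():
    # Hypothetical example generator with both yields and a return value.
    yield 1
    yield 2
    return 3


g = ValueCapture(numbers())
items = list(g)       # consume the yielded values
captured = g.value    # the return value, available after iteration ends
```

The attribute is only meaningful after the wrapped generator has run to completion; before that it still holds its initial None.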

participants (5)
- Greg Ewing
- Guido van Rossum
- P.J. Eby
- Ron Adam
- Scott Dial