Proto-PEP on a 'yield from' statement

Comments are invited on the following proto-PEP.

PEP: XXX
Title: Syntax for Delegating to a Subgenerator
Version: $Revision$
Last-Modified: $Date$
Author: Gregory Ewing <greg.ewing@canterbury.ac.nz>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 13-Feb-2009
Python-Version: 2.7
Post-History:

Abstract
========

A syntax is proposed to allow a generator to easily delegate part of its operations to another generator, with the subgenerator yielding directly to the delegating generator's caller and receiving values sent to the delegating generator using send(). Additionally, the subgenerator is allowed to return with a value and the value is made available to the delegating generator.

The new syntax also opens up some opportunities for optimisation when one generator re-yields values produced by another.

Proposal
========

The following new expression syntax will be allowed in the body of a generator::

    yield from <expr>

where <expr> is an expression evaluating to an iterator. The effect is to run the iterator to exhaustion, with any values that it yields being passed directly to the caller of the generator containing the ``yield from`` expression (the "delegating generator"), and any values sent to the delegating generator using ``send()`` being sent directly to the iterator. (If the iterator does not have a ``send()`` method, values sent in are ignored.)

The value of the ``yield from`` expression is the first argument to the ``StopIteration`` exception raised by the iterator when it terminates.

Additionally, generators will be allowed to execute a ``return`` statement with a value, and that value will be passed as an argument to the ``StopIteration`` exception.

Formal Semantics
----------------

The statement ::

    result = yield from iterator

is semantically equivalent to ::

    _i = iterator
    try:
        _v = _i.next()
        while 1:
            if hasattr(_i, 'send'):
                _v = _i.send(_v)
            else:
                _v = _i.next()
    except StopIteration, _e:
        _a = _e.args
        if len(_a) > 0:
            result = _a[0]
        else:
            result = None

Rationale
=========

A Python generator is a form of coroutine, but has the limitation that it can only yield to its immediate caller. This means that a piece of code containing a ``yield`` cannot be factored out and put into a separate function in the same way as other code. Performing such a factoring causes the called function to itself become a generator, and it is necessary to explicitly iterate over this second generator and re-yield any values that it produces.

If yielding of values is the only concern, this is not very arduous and can be performed with a loop such as ::

    for v in g:
        yield v

However, if the subgenerator is to receive values sent to the outer generator using ``send()``, it is considerably more complicated. As the formal expansion presented above illustrates, the necessary code is very longwinded, and it is tricky to handle all the corner cases correctly. In this case, the advantages of a specialised syntax should be clear.

The particular syntax proposed has been chosen as suggestive of its meaning, while not introducing any new keywords and clearly standing out as being different from a plain ``yield``.

Furthermore, using a specialised syntax opens up possibilities for optimisation when there is a long chain of generators. Such chains can arise, for instance, when recursively traversing a tree structure. The overhead of passing ``next()`` calls and yielded values down and up the chain can cause what ought to be an O(n) operation to become O(n**2).
A possible strategy is to add a slot to generator objects to hold a generator being delegated to. When a ``next()`` or ``send()`` call is made on the generator, this slot is checked first, and if it is nonempty, the generator that it references is resumed instead. If it raises StopIteration, the slot is cleared and the main generator is resumed.

This would reduce the delegation overhead to a chain of C function calls involving no Python code execution. A possible enhancement would be to traverse the whole chain of generators in a loop and directly resume the one at the end, although the handling of StopIteration is more complicated then.

Alternative Proposals
=====================

Proposals along similar lines have been made before, some using the syntax ``yield *`` instead of ``yield from``. While ``yield *`` is more concise, it could be argued that it looks too similar to an ordinary ``yield`` and the difference might be overlooked when reading code.

To the author's knowledge, previous proposals have focused only on yielding values, and thereby suffered from the criticism that the two-line for-loop they replace is not sufficiently tiresome to write to justify a new syntax. By dealing with sent values as well as yielded ones, this proposal provides considerably more benefit.

Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

-- Greg
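As a concrete illustration of the tree-traversal case the Rationale mentions, here is a minimal sketch written with the proposed syntax; the Node class and the sample tree are invented for the example and are not part of the proto-PEP.

    # Illustrative sketch only: Node and the sample tree are invented.
    # Each recursive level delegates to the one below it with the proposed
    # syntax, instead of re-yielding every value by hand.
    class Node(object):
        def __init__(self, value, children=()):
            self.value = value
            self.children = children

    def walk(node):
        yield node.value
        for child in node.children:
            yield from walk(child)   # proposed syntax; today this would be
                                     # 'for v in walk(child): yield v'

    tree = Node(1, [Node(2, [Node(4)]), Node(3)])
    print(list(walk(tree)))          # -> [1, 2, 4, 3]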

Greg Ewing <greg.ewing@...> writes:
What is the use of it? The problem I can see is that in normal iteration forms (e.g. a "for" loop), the argument to StopIteration is ignored. Therefore, a generator executing such a return statement and expecting the caller to use the return value wouldn't be usable in normal iteration contexts.
There seems to be at least a "yield" statement missing in this snippet. Also, why doesn't it call iter() first? Does it mean one couldn't write e.g. "yield from my_list"? Besides, the idea of getting the "result" from the /inner/ generator goes against the current semantics of "result = yield value", where the result comes from the /outer/ calling routine.

Regards, Antoine.

Antoine Pitrou wrote:
How is this different from an ordinary function returning a value that is ignored by the caller? It's up to the caller to decide whether to use the return value. If it wants the return value, then it has to either use a 'yield from' or catch the StopIteration itself and extract the value.
There seems to be at least a "yield" statement missing in this snippet.
You're right, there should be some yields there. I'll post a fixed version.
Also, why doesn't it call iter() first? Does it mean one couldn't write e.g. "yield from my_list"?
I'm not sure about that. Since you can't send anything to a list iterator, there's not a huge advantage over 'for x in my_list: yield x', but I suppose it's a logical and useful thing to be able to do.
I'm going to add some more material to the Rationale section that will hopefully make it clearer why I want it to work this way. -- Greg

There were some bugs in the expansion code. Here is a corrected version.

    result = yield from iterable

expands to

    _i = iter(iterable)
    try:
        _v = yield _i.next()
        while 1:
            if hasattr(_i, 'send'):
                _v = yield _i.send(_v)
            else:
                _v = yield _i.next()
    except StopIteration, _e:
        _a = _e.args
        if len(_a) > 0:
            result = _a[0]
        else:
            result = None

-- Greg
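To see concretely what the corrected expansion does, here is a hedged sketch that inlines it into a delegating generator and drives it with send(); the averager subgenerator and the driver loop are invented for illustration. The spelling is modernised (next(_i), "except ... as") so it runs as-is on an interpreter where a generator may return a value, which is what the proposal asks for.

    def averager():
        # Invented subgenerator: receives numbers via send() and finishes
        # with a value, as the proposal allows.
        total, count = 0.0, 0
        while True:
            value = yield            # values arrive via the delegator's send()
            if value is None:
                break
            total += value
            count += 1
        return total / count         # the 'return with a value' the proposal adds

    def report(results):
        # Delegating generator with the corrected expansion written inline,
        # i.e. what 'result = yield from averager()' is meant to do.
        _i = averager()
        try:
            _v = yield next(_i)
            while 1:
                if hasattr(_i, 'send'):
                    _v = yield _i.send(_v)
                else:
                    _v = yield next(_i)
        except StopIteration as _e:
            result = _e.args[0] if _e.args else None
        results.append(result)

    results = []
    r = report(results)
    next(r)                          # prime; the averager's first yield reaches us here
    for x in (10, 20, 30):
        r.send(x)                    # sent values pass straight through to the averager
    try:
        r.send(None)                 # ask the averager to finish
    except StopIteration:
        pass
    print(results)                   # -> [20.0]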

I would think that, in addition to forwarding values sent to the delegating generator on to the subgenerator, exceptions thrown at the delegating generator should also be forwarded to the subgenerator. If the subgenerator does not handle the exception, then it should be re-raised in the delegating generator. Also, the subgenerator's close method should be called by the delegating generator. Thus, the new expansion code would look something like:

    _i = iter(iterable)
    try:
        _value_to_yield = next(_i)
        while True:
            try:
                _value_to_send = yield _value_to_yield
            except Exception as _throw_from_caller:
                if hasattr(_i, 'throw'):
                    _value_to_yield = _i.throw(_throw_from_caller)
                else:
                    raise
            else:
                if hasattr(_i, 'send'):
                    _value_to_yield = _i.send(_value_to_send)
                else:
                    _value_to_yield = next(_i)
    except StopIteration as _exc_from_i:
        _a = _exc_from_i.args
        if len(_a) > 0:
            result = _a[0]
        else:
            result = None
    finally:
        if hasattr(_i, 'close'):
            _i.close()

I'm also against using return as the syntax for a final value from the subgenerator. I've accidentally used return inside generators many times and appreciate getting an error for this. I would be OK with removing the ability to return a final value from the subgenerator. Or create a new syntax that won't be used accidentally. Perhaps: finally return expr?

-bruce frederiksen

Greg Ewing wrote:
Comments are invited on the following proto-PEP.
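As an aside on the throw-forwarding point above, here is a small invented sketch of the difference it makes. With the naive pass-through loop, an exception thrown at the delegating generator lands at its own yield and never reaches the subgenerator's handler; with the forwarding in the expansion above, it would be delivered inside the subgenerator instead.

    def sub():
        # Invented subgenerator that wants to handle ValueError itself.
        while True:
            try:
                yield "working"
            except ValueError:
                yield "sub handled it"

    def outer_plain():
        # Naive pass-through: a throw() on this generator is raised at the
        # 'yield v' below and never reaches sub()'s except clause.
        for v in sub():
            yield v

    g = outer_plain()
    print(next(g))                   # "working"
    try:
        print(g.throw(ValueError))
    except ValueError:
        print("not forwarded")       # what happens without forwarding;
                                     # with forwarding, "sub handled it"
                                     # would come back instead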

Bruce Frederiksen wrote:
Urg, I'd forgotten about that feature! You're quite right.
Also, the subgenerator close method should be called by the delegating generator.
Right again. I'll add these to the next revision, thanks.
I'm also against using return as the syntax for a final value from the subgenerator.
Can you look at what I said in the last revision about "Generators as Threads" and tell me whether you still feel that way? -- Greg

Greg Ewing wrote:
I don't really understand the "Generators as Threads" section. You say that a function call, such as:

    y = f(x)

could be translated into "an equivalent" generator call:

    y = yield from g(x)

But the yield from form causes g(x) to send output to the caller, which f(x) doesn't do. It seems like I would either want one or the other: either yield from g(x) to send g's output to my caller, or y = sum(g(x)) to get a final answer myself from the values generated by g(x).

On the other hand, if you're thinking that g(x) is going to be taking values from my caller (passed on to it through send) and producing a final answer, then we have a problem, because g(x) will be using a yield expression to accept the values, but the yield expression also produces results which will be sent back to my caller. These results going back are probably not what I want.

This is why I think that it's important to separate the aspects of sending and receiving values to/from generators. That's why I proposed receive, rather than the yield expression, to accept values in the generator. I would propose deprecating the yield expression. I would also propose changing send to only send a value into the generator and not return a result. Then you could get a sum of your input values by:

    y = sum(receive)

without generating bogus values back to your caller.

I don't know if this helps, or if I've completely missed the point that you were trying to make? ...

-bruce frederiksen

Bruce Frederiksen wrote:
But the yield from form causes g(x) to send output to the caller, which f(x) doesn't do.
In the usage I'm talking about there, you're not interested in the values being yielded. You're using yields without arguments as a way of suspending the thread. So you're not calling g() for the purpose of yielding values. You're calling it for the side effects it produces, and/or the value it returns using a return statement -- the same reasons you were calling f() in the non-thread version.

There are also cases where you do want to use the yielded values. For example, if you have a function acting as a consumer, and a generator acting as a producer. The producer may want to spread its computation over several functions, but all the produced values should still go to the consumer.

The same consideration applies if you're using send() to push values in the other direction. In that case, the outer function is the producer and the generator is the consumer. Whenever the consumer wants to get another value, it does a yield -- and the value should come from the producer, however deeply nested the yield call is.

There are, of course, cases where this is not what you want. But in those cases, you don't use a 'yield from' expression -- you use a for-loop or explicit next() and send() calls to do whatever you want to do with the values being passed in and out.
There may be merit in that, but it's a separate issue, outside the scope of this PEP. And as has been pointed out, if such a change is ever made, it will carry over naturally into the semantics of 'yield from'. -- Greg
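A small invented sketch of the producer/consumer case described above: the producer spreads its work across helper generators, but everything it produces still arrives at the single consumer, because each helper delegates with the proposed syntax.

    def produce_header():
        yield "# report"

    def produce_body(items):
        for item in items:
            yield "line: %s" % item

    def producer(items):
        # The producer is split across helper generators, but all of the
        # produced values still go to whoever is iterating over it.
        yield from produce_header()      # proposed syntax
        yield from produce_body(items)

    def consumer(gen):
        for line in gen:
            print(line)

    consumer(producer(["a", "b"]))
    # prints:
    #   # report
    #   line: a
    #   line: b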

I recommend dropping the notion of forwarding from the proposal. The idea is use-case challenged, complicated, and should not be hidden behind new syntax. Would hate for this to become a Trojan horse proposal when most folks just want a fast iterator pass-through mechanism:

    def threesomes(vars):
        for var in vars:
            yield from itertools.repeat(var, n)

Raymond

Raymond Hettinger wrote:
I don't think it's conceptually complicated -- it just seems that way when you write out the Python code necessary to implement it *without* new syntax. The essential concept is that, while the subgenerator is running, everything behaves as though it were talking directly to whatever is calling the outer generator. If you leave out some of the ways that the caller can interact with a generator, such as send(), throw() and close(), then it doesn't behave exactly that way, and I think that would actually make it more complicated to understand.

I should perhaps point out that the way I would implement all this would *not* be by emitting bytecode equivalent to the expansion in the PEP. It would be more along the lines of the suggested optimisation, and all the next(), send(), throw() etc. calls would go more or less directly to the subgenerator until it terminates. Done that way, I expect the implementation would actually be fairly simple and straightforward.
Would hate for this to become a Trojan horse proposal when most folks just want a fast iterator pass-through mechanism:
You can use it that way if you want, without having to think about any of the other complications. -- Greg

Raymond Hettinger wrote: pass-through mechanism?

I agree that 98% of the time the simple pass-through mechanism is all that will be required of this new feature. And I agree that this alone is sufficient motivation to want to see this feature added.

But I have done quite a bit of work with nested generators and end up having to use itertools.chain, which also doesn't support the full generator behavior. Specifically, in my case, I needed itertools.chain to close the subgenerator so that finally clauses in the subgenerator get run when they should on Jython and IronPython. I put in a request for this and was turned down. I found an alternative way to do it, but it's somewhat ugly:

    class chain_context(object):
        def __init__(self, outer_it):
            self.outer_it = outer_iterable(outer_it)
        def __enter__(self):
            return itertools.chain.from_iterable(self.outer_it)
        def __exit__(self, type, value, tb):
            self.outer_it.close()

    class outer_iterable(object):
        def __init__(self, outer_it):
            self.outer_it = iter(outer_it)
            self.inner_it = None
        def __iter__(self):
            return self
        def close(self):
            if hasattr(self.inner_it, '__exit__'):
                self.inner_it.__exit__(None, None, None)
            elif hasattr(self.inner_it, 'close'):
                self.inner_it.close()
            if hasattr(self.outer_it, 'close'):
                self.outer_it.close()
        def next(self):
            ans = self.outer_it.next()
            if hasattr(ans, '__enter__'):
                self.inner_it = ans
                return ans.__enter__()
            ans = iter(ans)
            self.inner_it = ans
            return ans

and then use as:

    with chain_context(gen(x) for x in iterable) as it:
        for y in it:
            ...

So from my own experience, I would strongly argue that the new yield from should at least honor the generator close method. Perhaps some people here have never run Python with a different garbage collector that doesn't immediately reclaim garbage objects, so they don't understand the need for this. Jython and IronPython are both just coming out with their 2.5 support; so expect to hear more of these complaints in the not too distant future from that crowd...

But I am baffled why the Python community adopts these extra methods on generators and then refuses to support them anywhere else (for loops, itertools). Is this a case of "well, I didn't vote for them, so I'm not going to play ball"? If that's the case, then perhaps send and throw should be retracted. I know that close is necessary when you move away from the reference counting collector, so I'll fight to keep that, as well as fight to get the rest of Python to play ball with it. I haven't seen a need for send or throw myself. I've played a lot with send and it always seems to get too complicated, so I wouldn't fight for that one. I can imagine possible uses for throw, but haven't hit them yet myself in actual practice; so I'd only fight somewhat for throw.

If send/throw were mistakes, let's document that and urge people not to use them, make a plan for deprecating them and removing them from the language, and figure out what the right answers are. But if send/throw/close were not mistakes and are done deals, then let's support them! In all of these cases, adding full support for send/throw/close does not require that you use any of them. It does not prevent using simple iterators rather than full-blown generators. It does not diminish in any way the current capabilities of these other language features. It simply supports and allows the use of send/throw/close when needed. Otherwise, why did we put send/throw/close into the language in the first place?
I would dearly love to see the for statement fully support close and throw, since that's where you use generators 99% of the time. Maybe this one needs different syntax to not break existing code. I'm not very good with clever syntax, so you may be able to improve on these:

    for i from gen(x):
    for i finally in gen(x):
    for i in gen(x) closing throwing:
    for i in final gen(x):
    for gen(x) yielding i:
    for gen(x) as i:

The idea is that close should be called when the for loop terminates (for any reason), and uncaught exceptions in the for body should be sent to the generator using throw, and then only propagated outside of the for statement if they are not handled by throw. And, yes, the for statement should not do these things if a simple iterator is used rather than a generator.

If you wanted to support the send method too, then maybe something like:

    for gen1(x) | gen2(y) as i:

where the values yielded by gen1 are sent to gen2 with send, and then the values yielded by gen2 are bound to i. If this were adopted, I would also recommend that if gen2 were a function rather than a generator, then the function be called on each value yielded by gen1 and the results of the function bound to i. Then

    for gen(x) | fun as i:

would be like:

    for map(fun, gen(x)) as i:

Of course, this leads to simply using map rather than | to combine generators, by making map use send if passed a generator as its first argument:

    for map(gen2(y), gen1(x)) as i:

But this doesn't scale as well syntactically when you want to chain several generators together:

    for map(gen3(z), map(gen2(y), gen1(x))) as i:

vs

    for gen1(x) | gen2(y) | gen3(z) as i:

Unfortunately, the way that send is currently defined, gen2 can't skip values to act as a filter or generate multiple values for one value sent in. To do this would require that the operations of getting another value sent in and yielding values be separated, rather than combined as they are for send. One way to do this is to use callbacks for getting another value. This could be done using the current next semantics by simply treating the callback as an iterator and passing it as another parameter to the generator:

    for gen2(y, gen1(x)) as i:

This is exactly what's currently being done by the itertools functions. But this also doesn't scale well syntactically when stacking up several generators. A better way would be to allow send and next to raise a new NextValue exception when the generator wants another value sent in. Then a new receive expression would be used in the generator to get the value.
This would act like an iterator within the generator:

    def filter(pred):
        for var in receive:
            if pred(var):
                yield var

which would be used like this down at the basic iterator level:

    it = filter(some_pred)
    for x in some_iterable:
        try:
            value = it.send(x)
            while True:
                process(value)
                value = next(it)
        except NextValue:
            pass

and this would be done automatically by the new for statement:

    for some_iterable | filter(some_pred) as value:
        process(value)

This also allows generators to generate multiple values for each value received:

    def repeat(n):
        for var in receive:
            for i in range(n):
                yield var

    for some_iterable | repeat(3) as value:
        process(value)

With the new yield from syntax, your threesomes example becomes:

    def threesomes():
        yield from receive | repeat(3)

Or even just:

    def threesomes():
        return repeat(3)

Other functions can be done in this style too:

    def map(fn):
        for var in receive:
            yield fn(var)

So stacking these all up is much more readable syntactically:

    for gen1(x) | filter(some_pred) | map(add_1) | threesomes() as i:

You have to admit that this is much more readable than:

    for threesomes(map(add_1, filter(some_pred, gen1(x)))) as i:

-bruce frederiksen

On Thu, Feb 12, 2009 at 10:27 PM, Bruce Frederiksen <dangyogi@gmail.com> wrote:
One of the advantages of full support of generators in a new 'yield from' is future proofing. Given the statement:

    The effect is to run the iterator to exhaustion, during which time it behaves as though it were communicating directly with the caller of the generator containing the ``yield from`` expression (the "delegating generator").

If, hypothetically, Python were to add some new feature to generators then those would automatically work in this context. Every place in current code which implements this kind of mechanism is not future proof.

I didn't follow all the variations on the for loop, but regarding send, it seems to me that a natural case is this:

    for x in foo:
        bar = process(x)
        foo.send(bar)

which sends the value bar to the generator and the value that comes back is used in the next iteration of the loop. I know that what I wrote doesn't do that, so what I really mean is something like this but easier to write:

    try:
        x = foo.next()
        while True:
            bar = process(x)
            x = foo.send(bar)
    except StopIteration:
        pass

and the syntax that occurs to me is:

    for x in foo:
        bar = process(x)
        continue bar

As to chaining generators, I don't see that as a for loop-specific feature. If that's useful in a for, then it's useful outside and should stand on its own. (And I withhold judgment on that as I don't yet see a benefit to the new syntax vs. what's available today.)
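A runnable version of the driving loop sketched above, with an invented generator standing in for foo: it yields a running total, and the value sent back in becomes the next increment, which is the send() interaction being described (spelled with next() rather than foo.next()).

    def accumulate():
        # Invented stand-in for 'foo': yields a running total and treats the
        # value sent back in as the next increment.
        total = 0
        for _ in range(3):
            increment = yield total
            total += increment

    def process(x):
        return x + 1                 # invented stand-in for real work

    foo = accumulate()
    try:
        x = next(foo)                # 0
        while True:
            bar = process(x)
            x = foo.send(bar)        # 1, then 3, then StopIteration
    except StopIteration:
        pass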

Bruce Leban wrote:
The current generators would become "old-style generators" and a "new-style generator" would be added. The new-style generator is the same as the old-style generators w.r.t. next/throw/close and yield *statements*. But new-style generators separate the operations of getting values into the generator vs getting values out of the generator. Thus, the yield *expression* is not allowed in new-style generators and is replaced by some kind of marker (reserved word, special identifier, return from builtin function, ??) that is used to receive values into the generator. I'll call this simply receive here for now.

The presence of receive is what marks the generator as a new-style generator. Receive looks like an iterator and can be used and passed around as an iterator to anything requiring an iterator within the generator. It does not return values to the caller like yield expressions do.

Commensurate with receive, the send method is changed in new-style generators. It still provides a value to the generator, but no longer returns anything. This covers the use case where you want to interact with the generator, as you've indicated above. Thus, a new-style generator would work just like you show in your first example, which is more intuitive than the current definition of send as passing a value both directions. So there would not be a need to change the continue statement.
Agreed.

Also, a new method, called using, is added to new-style generators to provide an iterator to be used as its receive object. This takes an iterator, attaches it to the generator, and returns the generator so that you can do for i in gen(x).using(iterable). This covers the use case where you have all of the input values ahead of time.

And then, as a little extra syntactic sugar, the | operator would be overloaded on generators and iterators to call this using method:

    class iterator:
        ...
        def __or__(self, gen_b):
            return gen_b.using(self)

Thus, when chaining generators together, you can use either:

    for gen1(x) | gen2(y) | gen3(z) as i:

or

    for gen3(z).using(gen2(y).using(gen1(x))) as i:

This also introduces a "new-style" for statement that properly honors the generator interface (calls close and throw like you'd expect) vs the "old-style" for statement that doesn't. The reason for the different syntax is that there may be code out there that uses a generator in a for statement with a break in it and then wants to continue with the generator in a subsequent for statement:

    g = gen(x)
    for i in g:     # doesn't close g
        ...
        if cond: break
    for i in g:     # process the rest of g's elements
        ...

This could be done with the new-style for statement as:

    g = gen(x)
    for somelib.notclosing(g) as i:
        ...
        if cond: break
    for g as i:
        ...

Comments? Should this be part of the yield from PEP?

-bruce frederiksen

On Thu, Feb 12, 2009 at 8:57 PM, Bruce Frederiksen <dangyogi@gmail.com> wrote:
I'm also against using return as the syntax for a final value from the subgenerator.
Thirded, for two reasons:

- A "yield x" expression has completely different semantics from "yield from x"; that's a bad idea given how similar they look.
- Returning a value by stuffing it in the StopIteration abuses the exception mechanism.

Without a compelling, concrete example I'm -1 on the return part; +1 for the rest.

George

George Sakkis wrote:
- A "yield x" expression has completely different semantics from "yield from x"; that's a bad idea given how similar they look.
If that's a concern, I would take it as an indication that 'yield from' is perhaps not the best syntax to use, and maybe it should be something completely new, such as:

    y = delegate f(args)

But then you lose the connection with generators that the word 'yield' gives you.
- Returning a value by stuffing it in the StopIteration abuses the exception mechanism.
I don't see why. StopIteration is already being used as an out-of-band return value to signal the end of iteration. Attaching further information to that return value doesn't seem an unreasonable thing to do. In any case, that's an implementation detail. There are other ways that the desired result could be achieved -- the desired result being the appearance of the return value as the value of the 'yield from' expression. -- Greg
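A tiny invented sketch of the point that StopIteration is already an out-of-band signal that can carry extra information: here a plain (non-generator) iterator attaches a final value to the StopIteration it raises, and a caller that cares about it picks it out of args, while an ordinary for-loop would simply ignore it (the modern __next__ spelling is used).

    class CountingIterator(object):
        # Invented example: iterate over some values, and report how many
        # were produced as the argument of the final StopIteration.
        def __init__(self, values):
            self._it = iter(values)
            self._count = 0
        def __iter__(self):
            return self
        def __next__(self):
            try:
                value = next(self._it)
            except StopIteration:
                raise StopIteration(self._count)
            self._count += 1
            return value

    it = CountingIterator("ab")
    print(next(it), next(it))        # a b
    try:
        next(it)
    except StopIteration as e:
        print(e.args)                # (2,)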

George Sakkis wrote:
In thinking about this some more, what I think makes more sense is to simply return the final value from close rather than attaching it to StopIteration. This still leaves open the syntax to use for this inside the generator. Perhaps: return finally some_value -bruce frederiksen

Participants (6): Antoine Pitrou, Bruce Frederiksen, Bruce Leban, George Sakkis, Greg Ewing, Raymond Hettinger