PEP 380 (yield from a subgenerator) comments

I really like the PEP - it's a solid extension of the ideas introduced by PEP 342. The two changes I would suggest are that the PEP be made more explicit regarding the fact that the try/finally block encloses only the yield expression itself (i.e. no other parts of the containing statement), and that the caching comment be updated with a list of specific semantic elements that the caching should not affect.

For the first part, I would prefer it if the example were changed to use capitals for the variant non-keyword parts of the statement:

    RESULT = yield from EXPR

and that it formally expanded to:

    _i = iter(EXPR)
    try:
        _u = _i.next()
        while 1:
            try:
                _v = yield _u
            except Exception, _e:
                _m = getattr(_i, 'throw', None)
                if _m is not None:
                    _u = _m(_e)
                else:
                    raise
            else:
                if _v is None:
                    _u = _i.next()
                else:
                    _u = _i.send(_v)
    except StopIteration, _e:
        _expr_result = _e.value
    finally:
        _m = getattr(_i, 'close', None)
        if _m is not None:
            _m()
    RESULT = _expr_result

I believe writing it that way would make it clearer that the scope of the try/finally block doesn't include the assignment part of the statement.

For the second part, the specific semantics that I believe should be noted as not changing even if an implementation chooses to cache the bound methods are these:

- The "send" and "throw" methods of the subiterator should not be retrieved if those methods are never called on the delegating generator
- If "send" is called on the delegating generator and the subiterator has no "send" method, then an appropriate "AttributeError" should be raised in the delegating generator
- If retrieving the "next", "send" or "throw" methods from the subiterator results in an exception then the subiterator's "close" method (if it has one) should still be called

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
---------------------------------------------------------------
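A small illustration of the finalization point above (the generator names are invented for the example, and since it uses the proposed syntax it only runs on an implementation of the PEP): the outer finally clause in the expansion is what guarantees the subiterator is closed along with the delegating generator.

```python
closed = []

def sub():
    try:
        yield 1
        yield 2
    finally:
        closed.append('sub')   # runs when sub() is finalized

def outer():
    yield from sub()

g = outer()
next(g)        # outer() is now delegating; sub() is suspended at "yield 1"
g.close()      # the expansion's finally clause closes sub() as well
```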

Nick Coghlan <ncoghlan <at> gmail.com> writes:
And that it formally expanded to:
<snip 20 lines of code with multiple try/except/finally clauses and various conditionals>

Do we really want to add a syntactic feature which has such a complicated expansion? I fear it will make code using "yield from" much more difficult to understand and audit.

Antoine Pitrou wrote:
Nick Coghlan <ncoghlan <at> gmail.com> writes:
And that it formally expanded to:
<snip 20 lines of code with multiple try/except/finally clauses and various conditionals>
Do we really want to add a syntactic feature which has such a complicated expansion? I fear it will make code using "yield from" much more difficult to understand and audit.
Yes, I think we do. The previous argument against explicit syntactic support for invoking subiterators was that it was trivial to do so by iterating over the subiterator and yielding each item in turn. With the additional generator features introduced by PEP 342, that is no longer the case: as described in Greg's PEP, simple iteration doesn't support send() and throw() correctly. The gymnastics needed to support send() and throw() actually aren't that complex when you break them down, but they aren't trivial either.

Whether different people find code using "yield from" difficult to understand will have more to do with their grasp of the concepts of cooperative multitasking in general than with the underlying trickery involved in allowing truly nested generators.

Here's an annotated version of the expansion that will hopefully make things clearer:

    # Create the subiterator
    _i = iter(EXPR)
    # The outer try block serves two purposes:
    #   - retrieve the expression result from the StopIteration instance
    #   - ensure _i.close() is called if it exists
    try:
        # Get the first value to be yielded
        _u = _i.next()
        while 1:
            # The inner try block allows exceptions passed in via the
            # generator's throw() method to be passed to the subiterator
            try:
                _v = yield _u
            except Exception, _e:
                # An exception was thrown into this generator. If the
                # subiterator has a throw() method, then we pass the
                # exception down. Otherwise, we propagate the exception
                # in the current generator.
                # Note that SystemExit and GeneratorExit are never passed
                # down. For those, we rely on the close() call in the
                # outer finally block.
                _m = getattr(_i, 'throw', None)
                if _m is not None:
                    # throw() will either yield a new value,
                    # raise StopIteration or reraise the
                    # original exception
                    _u = _m(_e)
                else:
                    raise
            else:
                if _v is None:
                    # Get the next subiterator value
                    _u = _i.next()
                else:
                    # A value was passed in using send(), so attempt to
                    # pass it down to the subiterator.
                    # AttributeError will be raised if the subiterator
                    # doesn't provide a send() method
                    _u = _i.send(_v)
    except StopIteration, _e:
        # The subiterator ended; get the expression result
        _expr_result = _e.value
    finally:
        # Ensure close() is called if it exists
        _m = getattr(_i, 'close', None)
        if _m is not None:
            _m()
    RESULT = _expr_result

On further reflection (and after reading a couple more posts on python-ideas relating to this PEP), I have two more questions/concerns:

1. The inner try/except is completely pointless if the subiterator doesn't have a throw() method. Would it make sense to have two versions of the inner loop (with and without the try block) and choose which one to use based on whether or not the subiterator has a throw() method? (Probably not, since this PEP is mainly about generators as cooperative pseudo-threads, and in such situations all the iterators involved are likely to be generators and hence have throw() methods. However, I think the question is at least worth thinking about.)

2. Due to a couple of bug reports against 2.5, contextlib.GeneratorContextManager now takes extra care when handling exceptions to avoid accidentally suppressing explicitly thrown in StopIteration instances. However, the current expansion in PEP 380 doesn't check whether the StopIteration caught by the outer try statement was one that was originally thrown into the generator, rather than an indicator that the subiterator naturally reached the end of its execution.
That isn't a difficult behaviour to eliminate, but it does require a slight change to the semantic definition of the new expression:

    _i = iter(EXPR)
    _thrown_exc = None
    try:
        _u = _i.next()
        while 1:
            try:
                _v = yield _u
            except Exception, _e:
                _thrown_exc = _e
                _m = getattr(_i, 'throw', None)
                if _m is not None:
                    _u = _m(_e)
                else:
                    raise
            else:
                if _v is None:
                    _u = _i.next()
                else:
                    _u = _i.send(_v)
    except StopIteration, _e:
        if _e is _thrown_exc:
            # Don't suppress StopIteration if it was
            # thrown in from outside the generator
            raise
        _expr_result = _e.value
    finally:
        _m = getattr(_i, 'close', None)
        if _m is not None:
            _m()
    RESULT = _expr_result

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
---------------------------------------------------------------

Nick Coghlan <ncoghlan <at> gmail.com> writes:
Whether different people find code using "yield from" difficult to understand will have more to do with their grasp of the concepts of cooperative multitasking in general than with the underlying trickery involved in allowing truly nested generators.
I don't agree. Cooperative multitasking looks, to me, quite orthogonal to the complexity brought by this new statement. You can perfectly well grasp the concepts of cooperative multitasking without finding the semantics of this new statement easy to understand and remember. Hiding so many special cases behind a one-line statement does not help, IMO. And providing a commented version of the expansion does not really help either: it does not make the expansion easier to remember and replay in the case where you have to debug something involving such a "yield from" statement.

(Remember, by the way, that a third-party package like greenlets already provides cooperative multitasking without any syntax addition, and that libraries like Twisted already have their own generator-based solution for cooperative multitasking, which AFAIR no one has demonstrated would be improved by the new statement. I'm not sure where the urgency is, and I don't see any compelling use case.)

Antoine Pitrou wrote:
Do we really want to add a syntactic feature which has such a complicated expansion? I fear it will make code using "yield from" much more difficult to understand and audit.
As I've said before, I don't think the feature itself is difficult to understand. You're not meant to learn about it by reading the expansion -- that's only there to pin down all the details for language lawyers. For humans, almost all the important information is contained in one paragraph near the top: "When the iterator is another generator, the effect is the same as if the body of the subgenerator were inlined at the point of the ``yield from`` expression. Furthermore, the subgenerator is allowed to execute a ``return`` statement with a value, and that value becomes the value of the ``yield from`` expression." Armed with this perspective, do you still think there will be difficulty in understanding or auditing code? -- Greg
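A concrete rendering of that paragraph (the example names are invented here, not taken from the PEP, and since it uses the proposed syntax it only runs on an implementation of the PEP): send() passes straight through to the subgenerator, and the subgenerator's return value becomes the value of the yield from expression.

```python
def accumulate():
    # Subgenerator: sum sent-in values until None arrives, then use the
    # proposed "return with a value" to hand the total back to the caller.
    total = 0
    while True:
        value = yield
        if value is None:
            return total
        total += value

def reporter(results):
    # Delegating generator: the subgenerator's return value becomes
    # the value of the "yield from" expression.
    while True:
        result = yield from accumulate()
        results.append(result)

results = []
r = reporter(results)
next(r)                      # prime: now suspended inside accumulate()
for v in (1, 2, 3, None):
    r.send(v)                # values go straight through to accumulate()
```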

Greg Ewing <greg.ewing <at> canterbury.ac.nz> writes:
"When the iterator is another generator, the effect is the same as if the body of the subgenerator were inlined at the point of the ``yield from`` expression. Furthermore, the subgenerator is allowed to execute a ``return`` statement with a value, and that value becomes the value of the ``yield from`` expression."
If it's really enough to understand and debug all corner cases of using "yield from", then fair enough. (I still don't like the PEP and feel it's much too specialized for a new syntactic feature. The language should try to be obvious rather than clever, IMO)

Antoine Pitrou wrote:
If it's really enough to understand and debug all corner cases of using "yield from", then fair enough.
In the case where the subiterator is another generator and isn't shared, it's intended to be a precise and complete specification. That covers the vast majority of the use cases I have in mind. Most of the complexities arise from trying to pin down what happens when the subiterator isn't a generator, or is being shared by other code. I don't know how the specification could be made any simpler for those cases while still being complete. Even so, the intention is that if you understand the semantics in the generator case, the behaviour in the other cases should be something reasonable and unsurprising. I certainly don't expect users to memorize either the expansion or the full text of the English explanation. -- Greg

On Sat, Mar 21, 2009 at 2:54 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Antoine Pitrou wrote:
Do we really want to add a syntactic feature which has such a complicated expansion? I fear it will make code using "yield from" much more difficult to understand and audit.
As I've said before, I don't think the feature itself is difficult to understand. You're not meant to learn about it by reading the expansion -- that's only there to pin down all the details for language lawyers.
For humans, almost all the important information is contained in one paragraph near the top:
"When the iterator is another generator, the effect is the same as if the body of the subgenerator were inlined at the point of the ``yield from`` expression. Furthermore, the subgenerator is allowed to execute a ``return`` statement with a value, and that value becomes the value of the ``yield from`` expression."
Armed with this perspective, do you still think there will be difficulty in understanding or auditing code?
Well, hmm... I've been out of the loop due to other commitments (sorry), but I really don't like to have things whose semantics is defined in terms of code inlining -- even if you don't mean that as the formal semantics but just as a mnemonic hint. It causes all sorts of confusion about scopes.

What happened to the first-order approximation

    "yield from X" means roughly the same as "for _x in X: yield _x"

? The more specialized semantics in some cases can probably be put off until later in the document.

FWIW I am okay with the notion that if the immediate subiterator returns a value, that value becomes the value of the yield-from expression. Suitable semantics that make this effect pass through multiple layers of sub-iterators are fine too. But the exact semantics in the light of try/except or try/finally blocks on the stack are incredibly (perhaps impossibly) tricky to get right -- and it probably doesn't matter all that much what exactly happens, as long as it's specified in sufficient detail that different implementations behave the same way (apart from obvious GC differences, alas).

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
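The first-order approximation can be written in current Python; a small sketch (names are illustrative only) of where it matches the proposal and where it falls short:

```python
def sub():
    yield 1
    yield 2

def approx():
    # First-order approximation of "yield from sub()"
    for _x in sub():
        yield _x

values = list(approx())     # plain iteration behaves identically

g = approx()
next(g)
echoed = g.send('hello')    # 'hello' is NOT forwarded to sub(); the
                            # for loop simply discards it and moves on
```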

Guido van Rossum wrote:
I really don't like to have things whose semantics is defined in terms of code inlining -- even if you don't mean that as the formal semantics but just as a mnemonic hint.
Think about it the other way around, then. Take any chunk of code containing a yield, factor it out into a separate function (using whatever techniques you would normally use when performing such a refactoring to deal with references to variables in the surrounding scope) and call it using yield-from. The result should be the same as the original unfactored code. That's the fundamental reason behind all of this -- to make such refactorings possible in a straightforward way.
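As a concrete sketch of that refactoring (names invented for the example; the proposed syntax means it requires an implementation of the PEP to actually run):

```python
# The unfactored generator...
def original():
    yield 'a'
    yield 'b'
    yield 'c'

# ...and the same code with the tail factored out and delegated to.
def tail():
    yield 'b'
    yield 'c'

def refactored():
    yield 'a'
    yield from tail()
```

The result of iterating over refactored() should be indistinguishable from iterating over original().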
What happened to the first-order approximation
"yield from X" means roughly the same as "for _x in X: yield x"
Everybody's reaction to that when it's been suggested before has been "that's trivial, why bother?" So I've been trying to present it in a way that doesn't make it appear so trivial. Also, my understanding is that a PEP is not meant to be a tutorial for naive users, but a document for communicating ideas between core Python developers, who are presumably savvy enough not to need such watered-down material. But I'll be happy to add a paragraph about this at the beginning if you think it would help.
But the exact semantics in the light of try/except or try/finally blocks on the stack are incredibly (perhaps impossibly) tricky to get right -- and it probably doesn't matter all that much what exactly happens as long as it's specified in sufficient detail that different implementations behave the same way (apart from obvious GC differences, alas).
This is part of the reason I've been emphasising the inlining principle. When pondering what should happen in such cases, I've been able to think to myself "What would happen if the subgenerator were inlined?" Most of the time that makes the answer fairly obvious, at least in the case where the subiterator is another generator. Then it's a matter of generalising it to other iterators. -- Greg

On Tue, Mar 24, 2009 at 4:02 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Guido van Rossum wrote:
I really don't like to have things whose semantics is defined in terms of code inlining -- even if you don't mean that as the formal semantics but just as a mnemonic hint.
Think about it the other way around, then. Take any chunk of code containing a yield, factor it out into a separate function (using whatever techniques you would normally use when performing such a refactoring to deal with references to variables in the surrounding scope) and call it using yield-from. The result should be the same as the original unfactored code.
The way I think of it, that refactoring has nothing to do with yield-from. It's not just variable references -- I used "scope" as a shorthand for everything that can be done within a function body, including control flow: try/except/finally, continue/break/raise/return.
That's the fundamental reason behind all of this -- to make such refactorings possible in a straightforward way.
Well, it solves one particular detail.
What happened to the first-order approximation
"yield from X" means roughly the same as "for _x in X: yield x"
Everybody's reaction to that when it's been suggested before has been "that's trivial, why bother?" So I've been trying to present it in a way that doesn't make it appear so trivial.
Maybe you're confusing motivation with explanation? That feedback seems to tell me that the *motivation* needs more work; but IMO the *explanation* should start with this simple model and then expand upon the edge cases.
Also, my understanding is that a PEP is not meant to be a tutorial for naive users, but a document for communicating ideas between core Python developers, who are presumably savvy enough not to need such watered-down material.
Not quite. PEPs aren't *just* for core developers -- they are also for communicating to (savvy) developers outside the core group. A good PEP needs to summarize both the motivation and specification concisely so prospective readers can quickly determine what it is about, and whether they care.
But I'll be happy to add a paragraph about this at the beginning if you think it would help.
But the exact semantics in the light of try/except or try/finally blocks on the stack are incredibly (perhaps impossibly) tricky to get right -- and it probably doesn't matter all that much what exactly happens as long as it's specified in sufficient detail that different implementations behave the same way (apart from obvious GC differences, alas).
This is part of the reason I've been emphasising the inlining principle. When pondering what should happen in such cases, I've been able to think to myself "What would happen if the subgenerator were inlined?" Most of the time that makes the answer fairly obvious, at least in the case where the subiterator is another generator. Then it's a matter of generalising it to other iterators.
This is a good way of thinking about use cases, because it helps deciding how the edge cases should be specified that the simplest model (my one-liner above) doesn't answer in a useful way. But it should not be confused with an explanation of the semantics. -- --Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum wrote:
The way I think of it, that refactoring has nothing to do with yield-from.
I'm not sure what you mean by that. Currently it's *impossible* to factor out code containing a yield. Providing a way to do that is what led me to invent this particular version of yield-from in the first place. I wanted a way of writing suspendable functions that can call each other easily. (You may remember I originally wanted to call it "call".) Then I noticed that it would also happen to provide the functionality of earlier "yield from" suggestions, so I adopted that name. But for me, factorability has always been the fundamental idea, and the equivalence, in one particular restricted situation, to a for loop containing a yield is just a nice bonus. That's what I've tried to get across in the PEP, and it's the reason I've presented things in the way I have.
It's not just variable references -- I used "scope" as a shorthand for everything that can be done within a function body, including control flow: try/except/finally, continue/break/raise/return.
Same answer applies -- use the usual techniques. When I talk about inlining, I mean inlining the *functionality* of the code, not its literal text. I'm leaving the reader to imagine performing the necessary transformations to preserve the semantics.
Maybe you're confusing motivation with explanation? That feedback seems to tell me that the *motivation* needs more work; but IMO the *explanation* should start with this simple model and then expand upon the edge cases.
Perhaps what I should do is add a Motivation section before the Proposal and move some of the material from the beginning of the Rationale section there. -- Greg

At 06:03 PM 3/25/2009 +1200, Greg Ewing wrote:
I wanted a way of writing suspendable functions that can call each other easily. (You may remember I originally wanted to call it "call".) Then I noticed that it would also happen to provide the functionality of earlier "yield from" suggestions, so I adopted that name.
I still don't see what you gain from making this syntax, vs. putting something like this in the stdlib (rough sketch):

    import sys

    class Task(object):
        def __init__(self, geniter):
            self.stack = [geniter]

        def __iter__(self):
            return self

        def send(self, value=None):
            if not self.stack:
                raise RuntimeError("Can't resume completed task")
            return self._step(value)

        next = send

        def _step(self, value=None, exc_info=()):
            while self.stack:
                try:
                    it = self.stack[-1]
                    if exc_info:
                        try:
                            rv = it.throw(*exc_info)
                        finally:
                            exc_info = ()
                    elif value is not None:
                        rv = it.send(value)
                    else:
                        rv = it.next()
                except:
                    value = None
                    exc_info = sys.exc_info()
                    if exc_info[0] is StopIteration:
                        exc_info = ()  # not really an error
                        self.pop()
                else:
                    value, exc_info = yield_to(rv, self)
            else:
                if exc_info:
                    raise exc_info[0], exc_info[1], exc_info[2]
                else:
                    return value

        def throw(self, *exc_info):
            if not self.stack:
                raise RuntimeError("Can't resume completed task")
            return self._step(None, exc_info)

        def push(self, geniter):
            self.stack.append(geniter)
            return None, ()

        def pop(self, value=None):
            if self.stack:
                it = self.stack.pop()
                if hasattr(it, 'close'):
                    try:
                        it.close()
                    except:
                        return None, sys.exc_info()
            return value, ()

        @classmethod
        def factory(cls, func):
            def decorated(*args, **kw):
                return cls(func(*args, **kw))
            return decorated

    def yield_to(rv, task):
        # This could/should be a generic function, to allow yielding to
        # deferreds, sockets, timers, and other custom objects
        if hasattr(rv, 'next'):
            return task.push(rv)
        elif isinstance(rv, Return):
            return task.pop(rv.value)
        else:
            return rv, ()

    class Return(object):
        def __init__(self, value=None):
            self.value = value

    @Task.factory
    def sample_task(arg1, another_arg):
        # blah blah
        something = (yield subtask(...))
        yield Return(result)

    def subtask(...):
        ...
        yield Return(myvalue)

The trampoline (the _step() method) handles the co-operative aspects, and modifying the yield_to() function allows you to define how yielded values are processed.
By default, they're sent back into the generator that yields them, but you can pass a Return() to terminate the generator and pass the value up to the calling generator. Yielding another generator, on the other hand, "calls" that generator within the current task, and the same rules apply. Is there some reason why this won't do what you want, and can't be modified to do so? If so, that should be part of the PEP, as IMO it otherwise lacks motivation for a language feature vs. say, a stdlib module. If 'yield_to' is a generic function or at least supports registration of some kind, a feature like this would be interoperable with a wide variety of frameworks -- you could register deferreds and delayed calls and IO objects from Twisted, for example. So it's not like the feature would be creating an entire new framework of its own. Rather, it'd be a front-end to whatever framework (or no framework) you're using.
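A much-reduced variant of the sketch above (the names are invented here, and there is no throw()/close() handling, unlike the full sketch) shows the core of the trampoline pattern in runnable form:

```python
class Return(object):
    def __init__(self, value=None):
        self.value = value

def run(task):
    # Drive a stack of generators to completion: yielding a generator
    # "calls" it, and yielding Return(v) "returns" v to the caller.
    # (No throw()/close() handling, unlike the full sketch.)
    stack = [task]
    value = None
    while stack:
        try:
            rv = stack[-1].send(value)
        except StopIteration:
            stack.pop()
            continue
        if hasattr(rv, 'send'):       # a subgenerator: push and start it
            stack.append(rv)
            value = None
        elif isinstance(rv, Return):  # explicit return: pop, pass value up
            stack.pop()
            value = rv.value
        else:                         # ordinary value: echo it back in
            value = rv
    return value

def subtask():
    yield Return(42)

def task():
    result = yield subtask()    # "calls" subtask; receives its Return value
    yield Return(result + 1)
```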

ISTR that the motivation for adding new syntax is that the best you can do using a trampoline library is still pretty cumbersome to use when you have to write a lot of tasks and subtasks, and when using tasks is just a tool for getting things done rather than an end goal in itself. I agree that the motivation and the comparison should be added to the PEP (perhaps moving the trampoline sample *implementation* to a reference or an appendix, since it is only the appearance of the trampoline-*using* code that matters). --Guido On Wed, Mar 25, 2009 at 7:26 AM, P.J. Eby <pje@telecommunity.com> wrote:
At 06:03 PM 3/25/2009 +1200, Greg Ewing wrote:
I wanted a way of writing suspendable functions that can call each other easily. (You may remember I originally wanted to call it "call".) Then I noticed that it would also happen to provide the functionality of earlier "yield from" suggestions, so I adopted that name.
I still don't see what you gain from making this syntax, vs. putting something like this in the stdlib (rough sketch):
<snip rough sketch of a Task/trampoline class, quoted in full from the previous message>
Is there some reason why this won't do what you want, and can't be modified to do so? If so, that should be part of the PEP, as IMO it otherwise lacks motivation for a language feature vs. say, a stdlib module. If 'yield_to' is a generic function or at least supports registration of some kind, a feature like this would be interoperable with a wide variety of frameworks -- you could register deferreds and delayed calls and IO objects from Twisted, for example. So it's not like the feature would be creating an entire new framework of its own. Rather, it'd be a front-end to whatever framework (or no framework) you're using.
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org
-- --Guido van Rossum (home page: http://www.python.org/~guido/)

On Tue, Mar 24, 2009 at 11:03 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Guido van Rossum wrote:
The way I think of it, that refactoring has nothing to do with yield-from.
I'm not sure what you mean by that. Currently it's *impossible* to factor out code containing a yield.
That's stating it a little too strongly. Phillip has shown how with judicious use of decorators and helper classes you can get a reasonable approximation, and I think Twisted uses something like this, so it's not just theory. I think the best you can do without new syntax though is still pretty cumbersome and brittle, which is why I have encouraged your PEP.
Providing a way to do that is what led me to invent this particular version of yield-from in the first place.
I wanted a way of writing suspendable functions that can call each other easily. (You may remember I originally wanted to call it "call".) Then I noticed that it would also happen to provide the functionality of earlier "yield from" suggestions, so I adopted that name.
But for me, factorability has always been the fundamental idea, and the equivalence, in one particular restricted situation, to a for loop containing a yield is just a nice bonus.
That's what I've tried to get across in the PEP, and it's the reason I've presented things in the way I have.
That's all good. I just don't think that a presentation in terms of code in-lining is a good idea. That's not how we explain functions either. We don't say "the function call means the same as when we wrote the body of the function in-line here." It's perhaps a game with words, but it's important to me not to give that impression, since some languages *do* work that way (e.g. macro languages and Algol-60), but Python *doesn't*.
It's not just variable references -- I used "scope" as a shorthand for everything that can be done within a function body, including control flow: try/except/finally, continue/break/raise/return.
Same answer applies -- use the usual techniques.
When I talk about inlining, I mean inlining the *functionality* of the code, not its literal text. I'm leaving the reader to imagine performing the necessary transformations to preserve the semantics.
Yeah, so I'm asking you to use a different word, since "inlining" to me has pretty strong connotations of textual substitution.
Maybe you're confusing motivation with explanation? That feedback seems to tell me that the *motivation* needs more work; but IMO the *explanation* should start with this simple model and then expand upon the edge cases.
Perhaps what I should do is add a Motivation section before the Proposal and move some of the material from the beginning of the Rationale section there.
Yeah, I think it can easily be saved by changing the PEP without changing the specs of the proposal. -- --Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum wrote:
That's all good. I just don't think that a presentation in terms of code in-lining is a good idea.
I was trying to describe it in a way that would give some insight into *why* the various aspects of the formal definition are the way they are. The inlining concept seemed like an elegant way of doing that. However, I've since realized that it's not quite as unambiguous as I thought it was when a return value is involved. I'll see if I can find another approach.
some languages *do* work that way (e.g. macro languages and Algol-60),
Algol-60 doesn't actually work that way; they just used a similar trick to define certain aspects of the semantics (although in that case I agree there were better ways they could have defined it).

I'm asking you to use a different word, since "inlining" to me has pretty strong connotations of textual substitution.
That's not what it usually means, as far as I can see. When you declare a function 'inline' in C, you're not asking for a blind textual substitution. Rather, you're asking the compiler to generate whatever code is needed to get the same effect as an actual call. -- Greg

Guido van Rossum <guido <at> python.org> writes:
That's stating it a little too strongly. Phillip has shown how with judicious use of decorators and helper classes you can get a reasonable approximation, and I think Twisted uses something like this, so it's not just theory. I think the best you can do without new syntax though is still pretty cumbersome and brittle, which is why I have encouraged your PEP.
It remains to be seen whether Twisted and other libraries (Kamaelia?) can benefit from this PEP. There seems to be a misunderstanding as to how generators are used in Twisted. There isn't a global "trampoline" to schedule generators around. Instead, generators are wrapped with a decorator (*) which collects each yielded value (it's a Deferred object) and attaches to it a callback which resumes (using send()) the execution of the generator whenever the Deferred finally gets its value. The wrapped generator, in turn, looks like a normal Deferred-returning function to outside code. Therefore, there is no nesting problem and "yield from" doesn't seem to be useful here. This has been confirmed to me by a Twisted developer on IRC (he pointed out, however, a streaming XML parser where "yield from" could save a couple of lines of code). (*) inlineCallbacks: http://twistedmatrix.com/documents/8.2.0/api/twisted.internet.defer.html#inl... http://enthusiasm.cozy.org/archives/2009/03/python-twisteds-inlinecallbacks Regards Antoine.
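The pattern described can be illustrated with a toy stand-in. This is NOT Twisted's implementation or API -- Deferred, add_callback and inline_callbacks are simplified names invented for this sketch of the resume-on-callback idea:

```python
class Deferred(object):
    # Toy stand-in for a Deferred: a value that arrives later.
    def __init__(self):
        self._callbacks = []
        self._fired = False
        self._result = None

    def add_callback(self, cb):
        # Run immediately if the value is already here, else queue it
        if self._fired:
            cb(self._result)
        else:
            self._callbacks.append(cb)

    def callback(self, result):
        # Fire the Deferred: deliver the value to all queued callbacks
        self._fired = True
        self._result = result
        for cb in self._callbacks:
            cb(result)

def inline_callbacks(func):
    # Wrap a generator function: each yielded Deferred gets a callback
    # that resumes the generator (via send()) once the value arrives.
    def wrapper(*args, **kw):
        gen = func(*args, **kw)
        done = Deferred()
        def step(value=None):
            try:
                d = gen.send(value)
            except StopIteration:
                done.callback(value)
                return
            d.add_callback(step)
        step()
        return done        # outside code sees an ordinary Deferred
    return wrapper

results = []
d = Deferred()

@inline_callbacks
def fetch():
    data = yield d         # suspends until d fires
    results.append(data + '!')

fetch()                    # starts the generator; it parks on d
d.callback('hello')        # firing d resumes fetch() with the value
```

No nesting of generators is involved, which is why the thread notes that "yield from" adds little here.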

At 10:56 AM 3/26/2009 +0000, Antoine Pitrou wrote:
Guido van Rossum <guido <at> python.org> writes:
That's stating it a little too strongly. Phillip has shown how with judicious use of decorators and helper classes you can get a reasonable approximation, and I think Twisted uses something like this, so it's not just theory. I think the best you can do without new syntax though is still pretty cumbersome and brittle, which is why I have encouraged your PEP.
It remains to be seen whether Twisted and other libraries (Kamaelia?) can benefit from this PEP.
They don't get any new features, and would require (possibly significant) changes in order to be able to take advantage of the syntax. And they *still* wouldn't be able to do away with their trampolines -- the new trampolines would just be able to avoid the need for a generator stack, if they previously had one to begin with. From your description, it sounds like Twisted's version of this doesn't even use a stack. (Note: by "trampoline" I mean, "thing that processes yielded values and manages the resumption of the generator", which need not be global. The example trampoline I posted earlier is also implemented as a decorator, and could be trivially extended via a lookup table to handle deferreds, delayed calls, or whatever else you wanted it to support as yield targets.)

On Thu, Mar 26, 2009 at 10:19 AM, P.J. Eby <pje@telecommunity.com> wrote:
At 10:56 AM 3/26/2009 +0000, Antoine Pitrou wrote:
Guido van Rossum <guido <at> python.org> writes:
That's stating it a little too strongly. Phillip has shown how with judicious use of decorators and helper classes you can get a reasonable approximation, and I think Twisted uses something like this, so it's not just theory. I think the best you can do without new syntax though is still pretty cumbersome and brittle, which is why I have encouraged your PEP.
It remains to be seen whether Twisted and other libraries (Kamaelia?) can benefit from this PEP.
They don't get any new features, and would require (possibly significant) changes in order to be able to take advantage of the syntax.
Well, yes, if you want to maintain backwards compatibility there wouldn't be any advantage. The point of the new syntax is clearly that (eventually) they can stop having their own wrappers, decorators and helpers (for this purpose).
And they *still* wouldn't be able to do away with their trampolines -- the new trampolines would just be able to avoid the need for a generator stack, if they previously had one to begin with. From your description, it sounds like Twisted's version of this doesn't even use a stack.
Whether you need a trampoline or not depends on other needs of a framework. There is some clear low-hanging fruit for Greg's proposal where no trampoline or helpers are needed -- but where currently refactoring complex code containing many yield statements is cumbersome due to the need to write each "subroutine" call as "for x in subroutine(): yield x" -- being able to replace this with "yield from subroutine()" is a conceptual advantage to me that is not proportional to the number of characters saved.
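Guido's point can be made concrete with a small sketch (written in Python 3, where ``yield from`` was ultimately adopted): the manual re-yield loop and the proposed spelling produce identical streams.

```python
def chunk(n):
    for i in range(n):
        yield i

def manual(ns):
    for n in ns:
        for x in chunk(n):   # the cumbersome re-yield loop
            yield x

def with_yield_from(ns):
    for n in ns:
        yield from chunk(n)  # the proposed spelling

assert list(manual([2, 3])) == list(with_yield_from([2, 3])) == [0, 1, 0, 1, 2]
```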
(Note: by "trampoline" I mean, "thing that processes yielded values and manages the resumption of the generator", which need not be global. The example trampoline I posted earlier is also implemented as a decorator, and could be trivially extended via a lookup table to handle deferreds, delayed calls, or whatever else you wanted it to support as yield targets.)
-- --Guido van Rossum (home page: http://www.python.org/~guido/)

P.J. Eby wrote:
And they *still* wouldn't be able to do away with their trampolines --
It's not really about doing away with trampolines anyway. You still need at least one trampoline-like thing at the top. What you do away with is the need for creating special objects to yield, and the attendant syntactic clumsiness and inefficiencies. -- Greg

Greg Ewing <greg.ewing <at> canterbury.ac.nz> writes:
It's not really about doing away with trampolines anyway. You still need at least one trampoline-like thing at the top. What you do away with is the need for creating special objects to yield, and the attendant syntactic clumsiness and inefficiencies.
No you don't, not in the Twisted case. The fact that useful library routines return Deferred objects to which you add callbacks and errbacks is probably not going away, because it's a fundamental building block in Twisted, not a convenience for scheduling generators. As a matter of fact, the people whom this PEP is supposed to benefit haven't expressed a lot of enthusiasm right now. That's why it looks so academic. Regards Antoine.

Draft 10 of the PEP. Removed the outer try-finally from the expansion and fixed it to re-raise GeneratorExit if the throw call raises StopIteration. -- Greg

PEP: XXX
Title: Syntax for Delegating to a Subgenerator
Version: $Revision$
Last-Modified: $Date$
Author: Gregory Ewing <greg.ewing@canterbury.ac.nz>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 13-Feb-2009
Python-Version: 2.7
Post-History:

Abstract
========

A syntax is proposed for a generator to delegate part of its operations to another generator. This allows a section of code containing 'yield' to be factored out and placed in another generator. Additionally, the subgenerator is allowed to return with a value, and the value is made available to the delegating generator.

The new syntax also opens up some opportunities for optimisation when one generator re-yields values produced by another.

Motivation
==========

A Python generator is a form of coroutine, but has the limitation that it can only yield to its immediate caller. This means that a piece of code containing a ``yield`` cannot be factored out and put into a separate function in the same way as other code. Performing such a factoring causes the called function to itself become a generator, and it is necessary to explicitly iterate over this second generator and re-yield any values that it produces.

If yielding of values is the only concern, this can be performed without much difficulty using a loop such as ::

    for v in g:
        yield v

However, if the subgenerator is to interact properly with the caller in the case of calls to ``send()``, ``throw()`` and ``close()``, things become considerably more difficult. As will be seen later, the necessary code is very complicated, and it is tricky to handle all the corner cases correctly.

A new syntax will be proposed to address this issue.
In the simplest use cases, it will be equivalent to the above for-loop, but it will also handle the full range of generator behaviour, and allow generator code to be refactored in a simple and straightforward way.

Proposal
========

The following new expression syntax will be allowed in the body of a generator: ::

    yield from <expr>

where <expr> is an expression evaluating to an iterable, from which an iterator is extracted. The iterator is run to exhaustion, during which time it yields and receives values directly to or from the caller of the generator containing the ``yield from`` expression (the "delegating generator").

Furthermore, when the iterator is another generator, the subgenerator is allowed to execute a ``return`` statement with a value, and that value becomes the value of the ``yield from`` expression.

In general, the semantics can be described in terms of the iterator protocol as follows:

* Any values that the iterator yields are passed directly to the caller.

* Any values sent to the delegating generator using ``send()`` are passed directly to the iterator. If the sent value is None, the iterator's ``next()`` method is called. If the sent value is not None, the iterator's ``send()`` method is called. Any exception resulting from attempting to call ``next`` or ``send`` is raised in the delegating generator.

* Exceptions passed to the ``throw()`` method of the delegating generator are forwarded to the ``throw()`` method of the iterator. If the iterator does not have a ``throw()`` method, its ``close()`` method is called if it has one, then the thrown-in exception is raised in the delegating generator. Any exception resulting from attempting to call these methods (apart from one case noted below) is raised in the delegating generator.

* The value of the ``yield from`` expression is the first argument to the ``StopIteration`` exception raised by the iterator when it terminates.

* ``return expr`` in a generator causes ``StopIteration(expr)`` to be raised.
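These send()/next() forwarding rules can be exercised directly in Python 3, which implements the proposal:

```python
def accumulator():
    total = 0
    while True:
        value = yield total
        if value is None:        # a sent None arrives via next()
            return total
        total += value

def delegator(results):
    results.append((yield from accumulator()))

log = []
gen = delegator(log)
assert next(gen) == 0          # primes the subgenerator
assert gen.send(5) == 5        # sent values go straight to the subiterator
assert gen.send(7) == 12
try:
    gen.send(None)             # None -> subiterator's next(); it returns
except StopIteration:
    pass
assert log == [12]             # the return value of the yield from expression
```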
Fine Details
------------

The implicit GeneratorExit resulting from closing the delegating generator is treated as though it were passed in using ``throw()``. An iterator having a ``throw()`` method is expected to recognize this as a request to finalize itself.

If a call to the iterator's ``throw()`` method raises a StopIteration exception, and it is *not* the same exception object that was thrown in, and the original exception was not GeneratorExit, then the value of the new exception is returned as the value of the ``yield from`` expression and the delegating generator is resumed.

Enhancements to StopIteration
-----------------------------

For convenience, the ``StopIteration`` exception will be given a ``value`` attribute that holds its first argument, or None if there are no arguments.

Formal Semantics
----------------

1. The statement ::

       RESULT = yield from EXPR

   is semantically equivalent to ::

       _i = iter(EXPR)
       try:
           _y = _i.next()
       except StopIteration, _e:
           _r = _e.value
       else:
           while 1:
               try:
                   _s = yield _y
               except:
                   _m = getattr(_i, 'throw', None)
                   if _m is not None:
                       _x = sys.exc_info()
                       try:
                           _y = _m(*_x)
                       except StopIteration, _e:
                           if _e is _x[1] or isinstance(_x[1], GeneratorExit):
                               raise
                           else:
                               _r = _e.value
                               break
                   else:
                       _m = getattr(_i, 'close', None)
                       if _m is not None:
                           _m()
                       raise
               else:
                   try:
                       if _s is None:
                           _y = _i.next()
                       else:
                           _y = _i.send(_s)
                   except StopIteration, _e:
                       _r = _e.value
                       break
       RESULT = _r

   except that implementations are free to cache bound methods for the 'next', 'send' and 'throw' methods of the iterator upon first use.

2. In a generator, the statement ::

       return value

   is semantically equivalent to ::

       raise StopIteration(value)

   except that, as currently, the exception cannot be caught by ``except`` clauses within the returning generator.

3.
The StopIteration exception behaves as though defined thusly: ::

       class StopIteration(Exception):

           def __init__(self, *args):
               if len(args) > 0:
                   self.value = args[0]
               else:
                   self.value = None
               Exception.__init__(self, *args)

Rationale
=========

The Refactoring Principle
-------------------------

The rationale behind most of the semantics presented above stems from the desire to be able to refactor generator code. It should be possible to take a section of code containing one or more ``yield`` expressions, move it into a separate function (using the usual techniques to deal with references to variables in the surrounding scope, etc.), and call the new function using a ``yield from`` expression.

The behaviour of the resulting compound generator should be, as far as possible, exactly the same as the original unfactored generator in all situations, including calls to ``next()``, ``send()``, ``throw()`` and ``close()``.

The semantics in cases of subiterators other than generators has been chosen as a reasonable generalization of the generator case.

Finalization
------------

There was some debate as to whether explicitly finalizing the delegating generator by calling its ``close()`` method while it is suspended at a ``yield from`` should also finalize the subiterator. An argument against doing so is that it would result in premature finalization of the subiterator if references to it exist elsewhere.

Consideration of non-refcounting Python implementations led to the decision that this explicit finalization should be performed, so that explicitly closing a factored generator has the same effect as doing so to an unfactored one in all Python implementations.

The assumption made is that, in the majority of use cases, the subiterator will not be shared. The rare case of a shared subiterator can be accommodated by means of a wrapper that blocks ``throw()`` and ``close()`` calls, or by using a means other than ``yield from`` to call the subiterator.
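The ``return``/``StopIteration.value`` correspondence specified above is observable in Python 3, which adopted the proposal:

```python
def answer():
    return 42
    yield        # unreachable; makes this a generator function

g = answer()
try:
    next(g)
except StopIteration as e:
    assert e.value == 42     # the value attribute proposed above
    assert e.args == (42,)   # ...which is just the first argument

def delegate():
    result = yield from answer()   # captures the subgenerator's return value
    yield result                   # expose it so we can observe it

assert next(delegate()) == 42
```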
Generators as Threads
---------------------

A motivation for generators being able to return values concerns the use of generators to implement lightweight threads. When using generators in that way, it is reasonable to want to spread the computation performed by the lightweight thread over many functions. One would like to be able to call a subgenerator as though it were an ordinary function, passing it parameters and receiving a returned value.

Using the proposed syntax, a statement such as ::

    y = f(x)

where f is an ordinary function, can be transformed into a delegation call ::

    y = yield from g(x)

where g is a generator. One can reason about the behaviour of the resulting code by thinking of g as an ordinary function that can be suspended using a ``yield`` statement.

When using generators as threads in this way, typically one is not interested in the values being passed in or out of the yields. However, there are use cases for this as well, where the thread is seen as a producer or consumer of items. The ``yield from`` expression allows the logic of the thread to be spread over as many functions as desired, with the production or consumption of items occurring in any subfunction, and the items are automatically routed to or from their ultimate source or destination.

Concerning ``throw()`` and ``close()``, it is reasonable to expect that if an exception is thrown into the thread from outside, it should first be raised in the innermost generator where the thread is suspended, and propagate outwards from there; and that if the thread is terminated from outside by calling ``close()``, the chain of active generators should be finalised from the innermost outwards.

Syntax
------

The particular syntax proposed has been chosen as suggestive of its meaning, while not introducing any new keywords and clearly standing out as being different from a plain ``yield``.
Optimisations
-------------

Using a specialised syntax opens up possibilities for optimisation when there is a long chain of generators. Such chains can arise, for instance, when recursively traversing a tree structure. The overhead of passing ``next()`` calls and yielded values down and up the chain can cause what ought to be an O(n) operation to become, in the worst case, O(n\*\*2).

A possible strategy is to add a slot to generator objects to hold a generator being delegated to. When a ``next()`` or ``send()`` call is made on the generator, this slot is checked first, and if it is nonempty, the generator that it references is resumed instead. If it raises StopIteration, the slot is cleared and the main generator is resumed.

This would reduce the delegation overhead to a chain of C function calls involving no Python code execution. A possible enhancement would be to traverse the whole chain of generators in a loop and directly resume the one at the end, although the handling of StopIteration is more complicated then.

Use of StopIteration to return values
-------------------------------------

There are a variety of ways that the return value from the generator could be passed back. Some alternatives include storing it as an attribute of the generator-iterator object, or returning it as the value of the ``close()`` call to the subgenerator. However, the proposed mechanism is attractive for a couple of reasons:

* Using the StopIteration exception makes it easy for other kinds of iterators to participate in the protocol without having to grow an extra attribute or a close() method.

* It simplifies the implementation, because the point at which the return value from the subgenerator becomes available is the same point at which StopIteration is raised. Delaying until any later time would require storing the return value somewhere.
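The delegation chains this optimisation targets arise naturally from recursion; a minimal sketch (each level of the countdown adds one ``yield from`` frame, so an unoptimised implementation walks the whole chain on every next() call):

```python
def countdown_nested(n):
    """Builds a delegation chain n generators deep."""
    yield n
    if n:
        yield from countdown_nested(n - 1)

# The values still come out flat, but by the time 0 is yielded,
# n + 1 generator frames are stacked between producer and consumer.
assert list(countdown_nested(3)) == [3, 2, 1, 0]
```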
Criticisms
==========

Under this proposal, the value of a ``yield from`` expression would be derived in a very different way from that of an ordinary ``yield`` expression. This suggests that some other syntax not containing the word ``yield`` might be more appropriate, but no acceptable alternative has so far been proposed.

It has been suggested that some mechanism other than ``return`` in the subgenerator should be used to establish the value returned by the ``yield from`` expression. However, this would interfere with the goal of being able to think of the subgenerator as a suspendable function, since it would not be able to return values in the same way as other functions.

The use of an argument to StopIteration to pass the return value has been criticised as an "abuse of exceptions", without any concrete justification of this claim. In any case, this is only one suggested implementation; another mechanism could be used without losing any essential features of the proposal.

It has been suggested that a different exception, such as GeneratorReturn, should be used instead of StopIteration to return a value. However, no convincing practical reason for this has been put forward, and the addition of a ``value`` attribute to StopIteration mitigates any difficulties in extracting a return value from a StopIteration exception that may or may not have one. Also, using a different exception would mean that, unlike ordinary functions, 'return' without a value in a generator would not be equivalent to 'return None'.

Alternative Proposals
=====================

Proposals along similar lines have been made before, some using the syntax ``yield *`` instead of ``yield from``. While ``yield *`` is more concise, it could be argued that it looks too similar to an ordinary ``yield`` and the difference might be overlooked when reading code.
To the author's knowledge, previous proposals have focused only on yielding values, and thereby suffered from the criticism that the two-line for-loop they replace is not sufficiently tiresome to write to justify a new syntax. By dealing with the full generator protocol, this proposal provides considerably more benefit.

Additional Material
===================

Some examples of the use of the proposed syntax are available, and also a prototype implementation based on the first optimisation outlined above.

`Examples and Implementation`_

.. _Examples and Implementation: http://www.cosc.canterbury.ac.nz/greg.ewing/python/yield-from/

Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

On Fri, Mar 27, 2009 at 5:56 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
Greg Ewing <greg.ewing <at> canterbury.ac.nz> writes:
It's not really about doing away with trampolines anyway. You still need at least one trampoline-like thing at the top. What you do away with is the need for creating special objects to yield, and the attendant syntactic clumsiness and inefficiencies.
No you don't, not in the Twisted case. The fact that useful library routines return Deferred objects to which you add callbacks and errbacks is probably not going away, because it's a fundamental building block in Twisted, not a convenience for scheduling generators.
As a matter of fact, the people whom this PEP is supposed to benefit haven't expressed a lot of enthusiasm right now. That's why it looks so academic.
Regards
Antoine.
That's because most of us who might like this have been patently avoiding this thread. I like the syntax, I'm iffy on the exception other than StopIteration (but I lost track of what was decided on this) and I would like to see this go in. I think this is going to be a power user feature for a while, but I would like to see the libraries that will come of this; I think this does enhance things. Also, I know David Beazley did a tutorial here at PyCon on implementing coroutines and I'd be interested to see what he thinks of this as well. I'll see if I can get his opinion. -jesse

2009/3/27 Jesse Noller <jnoller@gmail.com>:
That's because most of us who might like this have been patently avoiding this thread.
I like the syntax, I'm iffy on the exception other than StopIteration (but I lost track of what was decided on this) and I would like to see this go in. I think this is going to be a power user feature for a while, but I would like to see the libraries that will come of this; I think this does enhance things.
Agreed on all counts. Paul.

On Fri, Mar 27, 2009 at 1:33 PM, Jesse Noller <jnoller@gmail.com> wrote:
Antoine Pitrou:
As a matter of fact, the people whom this PEP is supposed to benefit haven't expressed a lot of enthusiasm right now. That's why it looks so academic. That's because most of us who might like this have been patently avoiding this thread.
I have been avoiding this thread too - even if I have implemented my own trampoline like everybody else here - because I had nothing to say that was not said already here. But just to add a data point, let me say that I agree with Eby. I am +0 on the syntax, but please keep the hidden logic simple and absolutely do NOT add confusion between yield and return. Use yield Return(value) or raise SomeException(value), as you like. The important thing for me is to have a trampoline in the standard library, not the syntax. Michele Simionato

Michele Simionato wrote:
On Fri, Mar 27, 2009 at 1:33 PM, Jesse Noller <jnoller@gmail.com> wrote:
Antoine Pitrou:
As a matter of fact, the people whom this PEP is supposed to benefit haven't expressed a lot of enthusiasm right now. That's why it looks so academic. That's because most of us who might like this have been patently avoiding this thread.
I have been avoiding this thread too - even if I have implemented my own trampoline like everybody else here - because I had nothing to say that was not said already here. But just to add a data point, let me say that I agree with Eby. I am +0 on the syntax, but please keep the hidden logic simple and absolutely do NOT add confusion between yield and return. Use yield Return(value) or raise SomeException(value), as you like.
I still think raise is out due to the fact that it would trigger subsequent except clauses. Guido has (sensibly) ruled out raising StopIteration and complaining if it has a value in old code, since there is too much code which raises StopIteration *without* performing such a check.

If those two points are accepted as valid, then that leaves the two options as being:

1. Add a new GeneratorReturn exception that will escape from existing code that only traps StopIteration. The only real downside of this is that either "return" and "return None" will mean different things in generators (unlike functions) or else "return None" will need to be special-cased to raise StopIteration in the calling code rather than raising GeneratorReturn(None). The latter approach is probably preferable if this option is chosen - any code for dealing with "generators as coroutines" is going to have to deal with the possibility of bare returns and falling off the end of the function anyway, so the special case really wouldn't be that special.

2. In addition to the "yield from" syntax for delegating to a subgenerator, also add new syntax for returning values from subgenerators so that the basic "return X" can continue to trigger SyntaxError.

Since option 2 would most likely lead to a bikeshed discussion of epic proportions, I'm currently a fan of option 1 ;)

Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
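Nick's option 1 can be sketched as follows. ``GeneratorReturn`` here is hypothetical (Python ultimately kept StopIteration for generator returns); the point is simply that an exception distinct from StopIteration escapes old code that only traps StopIteration:

```python
class GeneratorReturn(Exception):
    """Hypothetical exception for option 1; not part of Python."""
    def __init__(self, value):
        self.value = value
        super().__init__(value)

def old_style_consumer(it):
    """Pre-PEP code that only knows about StopIteration."""
    results = []
    while True:
        try:
            results.append(next(it))
        except StopIteration:
            return results

def gen():
    yield 1
    raise GeneratorReturn(2)   # stand-in for ``return 2`` under option 1

try:
    old_style_consumer(gen())
    escaped = None
except GeneratorReturn as e:
    escaped = e.value          # the new exception is not swallowed

assert escaped == 2
```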

On Sat, Mar 28, 2009 at 4:34 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
I still think raise is out due to the fact that it would trigger subsequent except clauses. Guido has (sensibly) ruled out raising StopIteration and complaining if it has a value in old code, since there is too much code which raises StopIteration *without* performing such a check.
If those two points are accepted as valid, then that leaves the two options as being:
1. Add a new GeneratorReturn exception that will escape from existing code that only traps StopIteration. The only real downside of this is that either "return" and "return None" will mean different things in generators (unlike functions) or else "return None" will need to be special cased to raise StopIteration in the calling code rather than raising GeneratorReturn(None). The latter approach is probably preferable if this option is chosen - any code for dealing with "generators as coroutines" is going to have to deal with the possibility of bare returns and falling off the end of the function anyway, so the special case really wouldn't be that special.
It seems so indeed.
2. In addition to the "yield from" syntax for delegating to a subgenerator, also add new syntax for returning values from subgenerators so that the basic "return X" can continue to trigger SyntaxError.
Since option 2 would most likely lead to a bikeshed discussion of epic proportions, I'm currently a fan of option 1 ;)
Me too. It also seems option 2 doesn't help us decide what it should do: I still think that raising StopIteration(value) would be misleading to vanilla users of the generators. -- --Guido van Rossum (home page: http://www.python.org/~guido/)

Antoine Pitrou wrote:
There seems to be a misunderstanding as to how generators are used in Twisted. There isn't a global "trampoline" to schedule generators around. Instead, generators are wrapped with a decorator (*) which collects each yielded value (it's a Deferred object) and attaches to it a callback which resumes (using send()) the execution of the generator whenever the Deferred finally gets its value.
This sounds like an architecture that was developed to work around the lack of anything like yield-from in the language. You can't expect to improve something like that by stuffing yield-from into the existing framework, because the point of yield-from is to render the framework itself unnecessary. To take full advantage of it, you need to step back and re-design the whole thing in a different way. In the case of Twisted, I expect the new design would look a lot like my generator scheduling example. -- Greg

At 04:08 PM 3/27/2009 +1300, Greg Ewing wrote:
You can't expect to improve something like that by stuffing yield-from into the existing framework, because the point of yield-from is to render the framework itself unnecessary.
But it doesn't. You still need *something* that processes the yielded values, since practical frameworks have various things to yield "to" - i/o, time, mouse clicks, whatever. Correctly dealing with the call stack part is tedious to implement, sure, but it's not really the focal point of a microthreading framework. Usually, you need to have some way to control which microthreads are actually to be executing, vs. the ones that are waiting for a particular time, an I/O operation, or some other sort of event. None of that stuff goes away just by taking care of the call stack.

on 2009-03-27 05:17 P.J. Eby said the following:
At 04:08 PM 3/27/2009 +1300, Greg Ewing wrote:
You can't expect to improve something like that by stuffing yield-from into the existing framework, because the point of yield-from is to render the framework itself unnecessary.
But it doesn't. You still need *something* that processes the yielded values, since practical frameworks have various things to yield "to" - i/o, time, mouse clicks, whatever. Correctly dealing with the call stack part is tedious to implement, sure, but it's not really the focal point of a microthreading framework.
I can chime in here with a use case, if an unusual one. I implemented just such a framework based on generator syntax for my thesis work to model the behaviour of software agents as a collection of interacting activities (microprocesses). The top layer is based on Twisted (similar to its _inlineCallbacks) and different schedulers decide on what to do with yielded values. This is really very similar to Phillip Eby's code, the main difference being that one uses a generic function yield_to as an extension point and the other one uses (subclasses of) Deferreds. You can handle the call stack in the Deferred-based case just as clumsily as in the other :-). And in my system, the "call stack" (i.e. the hierarchy of active microprocesses) and how it can be manipulated by the agent is actually the interesting part.
Usually, you need to have some way to control which microthreads are actually to be executing, vs. the ones that are waiting for a particular time, an I/O operation, or some other sort of event. None of that stuff goes away just by taking care of the call stack.
Yes. However, the valuable addition that an explicit yield from syntax would provide for my use case is a way to explicitly distinguish between subgenerators just for the sake of refactoring code vs. sub-"processes". I could remove quite some duplication from my current code. Additionally, as noted in the PEP, it would open the path for optimisations of the refactoring cases. I also think that a separation of handling the generator call stack and handling yielded values improves the situation for scheduling/trampoline authors conceptually. Just my 0.02€ cheers, stefan

Greg Ewing wrote:
Guido van Rossum wrote:
I really don't like to have things whose semantics is defined in terms of code inlining -- even if you don't mean that as the formal semantics but just as a mnemonic hint.
Think about it the other way around, then. Take any chunk of code containing a yield, factor it out into a separate function (using whatever techniques you would normally use when performing such a refactoring to deal with references to variables in the surrounding scope) and call it using yield-from. The result should be the same as the original unfactored code.
That's the fundamental reason behind all of this -- to make such refactorings possible in a straightforward way.
What happened to the first-order approximation
"yield from X" means roughly the same as "for _x in X: yield x"
Everybody's reaction to that when it's been suggested before has been "that's trivial, why bother?" So I've been trying to present it in a way that doesn't make it appear so trivial.
There is one non-trivial extension that I've been chewing over for a while. What if you want to yield not the values from the generator but some function of those values? The present proposal appears to have no way to specify that. What about extending the syntax somewhat to yield expr for x from X The idea is that x should be a bound variable in expr, but the "expr for x" could be optional to yield the existing proposal as a degenerate case.
Also [...]
regards Steve -- Steve Holden +1 571 484 6266 +1 800 494 3119 Holden Web LLC http://www.holdenweb.com/ Want to know? Come to PyCon - soon! http://us.pycon.org/

At 10:22 PM 3/24/2009 -0400, Steve Holden wrote:
There is one non-trivial extension that I've been chewing over for a while. What if you want to yield not the values from the generator but some function of those values? The present proposal appears to have no way to specify that. What about extending the syntax somewhat to
yield expr for x from X
The idea is that x should be a a bound variable in expr, but the "expr for x" could be optional to yield the existing proposal as a degenerate case.
That would be spelled: yield from (expr for x in X) And the compiler could optionally optimize away the genexpr. Assuming, of course, that this is considered valuable enough to implement in the first place, which I don't think it is... especially not with the return bit factored in. Now, if somebody came up with a different way to spell the extra value return, I wouldn't object as much to that part. I can just see people inadvertently writing 'return x' as a shortcut for 'yield x; return', and then having what seem like mysterious off-by-one errors, or being confused by receiving a generator object instead of their desired non-generator return value. It also seems weird that the only syntactically-supported way to get the generator's "return value" is to access it inside *another* generator... which *also* can't return the return value to anyone! But if it were spelled 'raise Return(value)' or 'raise StopIteration(value)' or something similar (or even had its own syntax!), I wouldn't object, as it would then be obvious how to get the value, and there could be no possible confusion with a regular return value. The unusual spelling would also signal that something unusual (i.e., multitasking) is taking place, similar to the way some frameworks use things like 'yield Return(value)' to signal the end of a task and its return value, in place of a value in the stream.
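PJE's suggested spelling can be checked in Python 3, where both pieces of syntax exist:

```python
# Steve's "yield a function of the values" written as PJE suggests:
# delegating to a generator expression.

def source():
    yield 1
    yield 2
    yield 3

def doubled():
    yield from (x * 2 for x in source())

assert list(doubled()) == [2, 4, 6]
```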

On Tue, Mar 24, 2009 at 8:35 PM, P.J. Eby <pje@telecommunity.com> wrote:
Now, if somebody came up with a different way to spell the extra value return, I wouldn't object as much to that part. I can just see people inadvertently writing 'return x' as a shortcut for 'yield x; return', and then having what seem like mysterious off-by-one errors, or being confused by receiving a generator object instead of their desired non-generator return value.
It also seems weird that the only syntactically-supported way to get the generator's "return value" is to access it inside *another* generator... which *also* can't return the return value to anyone!
But if it were spelled 'raise Return(value)' or 'raise StopIteration(value)' or something similar (or even had its own syntax!), I wouldn't object, as it would then be obvious how to get the value, and there could be no possible confusion with a regular return value.
The unusual spelling would also signal that something unusual (i.e., multitasking) is taking place, similar to the way some frameworks use things like 'yield Return(value)' to signal the end of a task and its return value, in place of a value in the stream.
I'm sympathetic to this point of view. -- --Guido van Rossum (home page: http://www.python.org/~guido/)

P.J. Eby wrote:
Now, if somebody came up with a different way to spell the extra value return, I wouldn't object as much to that part. I can just see people inadvertently writing 'return x' as a shortcut for 'yield x; return',
Well, they need to be educated not to do that. I'm not sure they'll need much education about this anyway. They've already been taught not to say 'return' when they mean 'yield', so I don't see why they should suddenly start doing so now. I'd be disappointed to lose that part of the proposal. Part of my philosophy is that suspendable functions should have the same rights and privileges as ordinary ones, and that includes the ability to return values using 'return'.
It also seems weird that the only syntactically-supported way to get the generator's "return value" is to access it inside *another* generator... which *also* can't return the return value to anyone!
Would you be happier if some syntactic way to do that were provided? It could perhaps be done by enhancing the part of the 'for' loop that gets executed upon normal termination of the iterator.

    for x in my_iter:
        do_something_with(x)
    else v:
        handle_return_value(v)
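The `for ... else v:` form is hypothetical syntax, but the operation it describes — exhausting an iterator while capturing its final value — can be written today against the PEP's StopIteration mechanism (Python 3.3+, where `return` in a generator attaches the value to StopIteration; helper names illustrative):

```python
def exhaust(it):
    """Run an iterator to exhaustion; return (yielded values, final value)."""
    it = iter(it)
    values = []
    while True:
        try:
            values.append(next(it))
        except StopIteration as e:
            return values, e.value  # e.value is None for plain iterators

def tally(xs):
    for x in xs:
        yield x
    return sum(xs)

assert exhaust(tally([1, 2, 3])) == ([1, 2, 3], 6)
```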
The unusual spelling would also signal that something unusual (i.e., multitasking) is taking place, similar to the way some frameworks use things like 'yield Return(value)' to signal the end of a task and its return value, in place of a value in the stream.
Difference in philosophy again. To me, the need for such an unusual construct when using these frameworks is a wart, not a feature. -- Greg

Greg Ewing wrote:
Would you be happier if some syntactic way to do that were provided?
It could perhaps be done by enhancing the part of the 'for' loop that gets executed upon normal termination of the iterator.
for x in my_iter:
    do_something_with(x)
else v:
    handle_return_value(v)
I think something like that would actually make the PEP much stronger on this front - it would promote the idea of a "final value" for iterators as a more fundamental concept that can be worked with in a non-generator context.

I'm also reminded of an idea that I believe existed in the early drafts of PEP 342: using "continue value" to invoke an iterator's send() method instead of next() as part of a normal for loop. With those two ideas combined, the PEP's "yield from" expansion could then look like:

    for x in EXPR:
        _v = yield x
        if _v is not None:
            continue _v
    else _r:
        RESULT = _r

(If "continue None" was defined as invoking .next() instead of .send(None), then that loop body could be simplified to just "continue yield x". However, I think it is preferable to keep the bare 'continue' and dropping off the end of the loop as invoking next(), while "continue arg" invokes send(arg), since the latter form clearly *expects* the iterator to have a send() method and it is best to emit the AttributeError immediately if the method isn't there)

Strangely enough, I think proposing a more general change to the iteration model to include sending values into iterators and having an accessible "final value" may actually be easier to sell than trying to sell "yield from" as a pure generator construct with no more general statement level equivalent. Trying to sell the multi-stage function iteration model and the concise expression form for invoking them from another generator all at once is a lot to take in one gulp.

I suspect that angle of attack would also make *testing* this kind of code far simpler as well.
For example:

    for value, send_arg, expected in zip(gf_under_test(), send_args, expected_values):
        assertEqual(value, expected)
        continue send_arg
    else result:
        assertEqual(result, expected_result)

I'm not actually sure how you would go about writing a test driver for that example multi-stage function *without* either making some kind of change to for loops or developing some rather ugly test code. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
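Nick's driver relies on the proposed `continue ... else` syntax, but the same test can be written today with send() and an explicit StopIteration catch — no syntax change required (helper and generator names hypothetical):

```python
def run_script(gen, send_args):
    """Drive a generator under test: prime it, feed send_args one at a
    time, and collect every yielded value plus the final return value."""
    outputs = [next(gen)]  # prime the generator to its first yield
    result = None
    try:
        for arg in send_args:
            outputs.append(gen.send(arg))
    except StopIteration as e:
        result = e.value
    return outputs, result

def gf_under_test():
    a = yield "ready"
    b = yield a * 2
    return a + b

assert run_script(gf_under_test(), [3, 5]) == (["ready", 6], 8)
```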

Nick Coghlan wrote:
With those two ideas combined, the PEP's "yield from" expansion could then look like:
for x in EXPR:
    _v = yield x
    if _v is not None:
        continue _v
else _r:
    RESULT = _r
Oops, got a little carried away there. Obviously, that doesn't handle thrown in exceptions the way "yield from" is intended to. So even with an adjusted for loop the full semantic expansion of 'yield from' would still need to be defined directly in terms of try/except and method calls on the underlying iterator to get the desired exception handling characteristics. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------

On Wed, Mar 25, 2009 at 6:22 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Greg Ewing wrote:
Would you be happier if some syntactic way to do that were provided?
It could perhaps be done by enhancing the part of the 'for' loop that gets executed upon normal termination of the iterator.
for x in my_iter:
    do_something_with(x)
else v:
    handle_return_value(v)
I think something like that would actually make the PEP much stronger on this front - it would promote the idea of a "final value" for iterators as a more fundamental concept that can be worked with in a non-generator context.
Hold it right there. Or maybe I should say "in your dreams." Please don't stretch the scope of the PEP. It's not going to help your cause. -- --Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum wrote:
On Wed, Mar 25, 2009 at 6:22 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
It could perhaps be done by enhancing the part of the 'for' loop that gets executed upon normal termination of the iterator.

    for x in my_iter:
        do_something_with(x)
    else v:
        handle_return_value(v)

I think something like that would actually make the PEP much stronger on this front - it would promote the idea of a "final value" for iterators as a more fundamental concept that can be worked with in a non-generator context.
Hold it right there. Or maybe I should say "in your dreams." Please don't stretch the scope of the PEP. It's not going to help your cause.
Yes, I now agree your suggestion of comparing and contrasting with PJE's simple trampoline example is a much better angle of attack. Although the PEP may still want to mention how one would write *tests* for these things. Will the test drivers themselves need to be generators participating in some kind of trampoline setup? Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------

Nick Coghlan wrote:
Although the PEP may still want to mention how one would write *tests* for these things. Will the test drivers themselves need to be generators participating in some kind of trampoline setup?
I don't see that tests are fundamentally different from any other code that wants to call a value-returning generator and get the value without becoming a generator itself. So if it's to be mentioned in the PEP at all, a general solution might as well be given (whether it's to use a trampoline or just write the necessary next() and except code). -- Greg

Trying to think of a better usage example that combines send() with returning values, I've realized that part of the problem is that I don't actually know of any realistic uses for send() in the first place. Can anyone point me to any? Maybe it will help to inspire a better example. -- Greg
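One realistic use of send() that later became a stock example is a running-average consumer: the caller pushes values in, and the updated aggregate comes back out of the same yield expression (a sketch, not from the thread):

```python
def averager():
    """Receive numbers via send(); each yield hands back the running mean."""
    total = 0.0
    count = 0
    mean = None
    while True:
        value = yield mean
        total += value
        count += 1
        mean = total / count

avg = averager()
next(avg)                    # prime: run to the first yield
assert avg.send(10) == 10.0
assert avg.send(30) == 20.0
```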

Here's a new draft of the PEP. I've added a Motivation section and removed any mention of inlining. There is a new expansion that incorporates recent ideas, including the suggested handling of StopIteration raised by a throw() call (i.e. if it wasn't the one thrown in, treat it as a return value). Explicit finalization is performed if the delegating generator is closed, but not when the subiterator completes itself normally.

------------------------------------------------------------

PEP: XXX
Title: Syntax for Delegating to a Subgenerator
Version: $Revision$
Last-Modified: $Date$
Author: Gregory Ewing <greg.ewing@canterbury.ac.nz>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 13-Feb-2009
Python-Version: 2.7
Post-History:

Abstract
========

A syntax is proposed for a generator to delegate part of its operations to another generator. This allows a section of code containing 'yield' to be factored out and placed in another generator. Additionally, the subgenerator is allowed to return with a value, and the value is made available to the delegating generator.

The new syntax also opens up some opportunities for optimisation when one generator re-yields values produced by another.

Motivation
==========

A Python generator is a form of coroutine, but has the limitation that it can only yield to its immediate caller. This means that a piece of code containing a ``yield`` cannot be factored out and put into a separate function in the same way as other code. Performing such a factoring causes the called function to itself become a generator, and it is necessary to explicitly iterate over this second generator and re-yield any values that it produces.
If yielding of values is the only concern, this can be performed without much difficulty using a loop such as

::

    for v in g:
        yield v

However, if the subgenerator is to interact properly with the caller in the case of calls to ``send()``, ``throw()`` and ``close()``, things become considerably more difficult. As will be seen later, the necessary code is very complicated, and it is tricky to handle all the corner cases correctly.

A new syntax will be proposed to address this issue. In the simplest use cases, it will be equivalent to the above for-loop, but it will also handle the full range of generator behaviour, and allow generator code to be refactored in a simple and straightforward way.

Proposal
========

The following new expression syntax will be allowed in the body of a generator:

::

    yield from <expr>

where <expr> is an expression evaluating to an iterable, from which an iterator is extracted. The iterator is run to exhaustion, during which time it yields and receives values directly to or from the caller of the generator containing the ``yield from`` expression (the "delegating generator").

Furthermore, when the iterator is another generator, the subgenerator is allowed to execute a ``return`` statement with a value, and that value becomes the value of the ``yield from`` expression.

In general, the semantics can be described in terms of the iterator protocol as follows:

* Any values that the iterator yields are passed directly to the caller.

* Any values sent to the delegating generator using ``send()`` are passed directly to the iterator. If the sent value is None, the iterator's ``next()`` method is called. If the sent value is not None, the iterator's ``send()`` method is called. Any exception resulting from attempting to call ``next`` or ``send`` is raised in the delegating generator.

* Exceptions passed to the ``throw()`` method of the delegating generator are forwarded to the ``throw()`` method of the iterator.
  If the iterator does not have a ``throw()`` method, its ``close()`` method is called if it has one, then the thrown-in exception is raised in the delegating generator. Any exception resulting from attempting to call these methods (apart from one case noted below) is raised in the delegating generator.

* The value of the ``yield from`` expression is the first argument to the ``StopIteration`` exception raised by the iterator when it terminates.

* ``return expr`` in a generator causes ``StopIteration(expr)`` to be raised.

Fine Details
------------

The implicit GeneratorExit resulting from closing the delegating generator is treated as though it were passed in using ``throw()``. An iterator having a ``throw()`` method is expected to recognize this as a request to finalize itself.

If a call to the iterator's ``throw()`` method raises a StopIteration exception, and it is *not* the same exception object that was thrown in, its value is returned as the value of the ``yield from`` expression and the delegating generator is resumed.

Enhancements to StopIteration
-----------------------------

For convenience, the ``StopIteration`` exception will be given a ``value`` attribute that holds its first argument, or None if there are no arguments.

Formal Semantics
----------------

1. The statement

::

    RESULT = yield from EXPR

is semantically equivalent to

::

    _i = iter(EXPR)
    try:
        try:
            _y = _i.next()
        except StopIteration, _e:
            _r = _e.value
        else:
            while 1:
                try:
                    _s = yield _y
                except:
                    _m = getattr(_i, 'throw', None)
                    if _m is not None:
                        _x = sys.exc_info()
                        try:
                            _y = _m(*_x)
                        except StopIteration, _e:
                            if _e is _x[1]:
                                raise
                            else:
                                _r = _e.value
                                break
                    else:
                        _m = getattr(_i, 'close', None)
                        if _m is not None:
                            _m()
                        raise
                else:
                    try:
                        if _s is None:
                            _y = _i.next()
                        else:
                            _y = _i.send(_s)
                    except StopIteration, _e:
                        _r = _e.value
                        break
    finally:
        del _i
    RESULT = _r

except that implementations are free to cache bound methods for the 'next', 'send' and 'throw' methods of the iterator upon first use.

2.
In a generator, the statement

::

    return value

is semantically equivalent to

::

    raise StopIteration(value)

except that, as currently, the exception cannot be caught by ``except`` clauses within the returning generator.

3. The StopIteration exception behaves as though defined thusly:

::

    class StopIteration(Exception):

        def __init__(self, *args):
            if len(args) > 0:
                self.value = args[0]
            else:
                self.value = None
            Exception.__init__(self, *args)

Rationale
=========

The Refactoring Principle
-------------------------

The rationale behind most of the semantics presented above stems from the desire to be able to refactor generator code. It should be possible to take a section of code containing one or more ``yield`` expressions, move it into a separate function (using the usual techniques to deal with references to variables in the surrounding scope, etc.), and call the new function using a ``yield from`` expression.

The behaviour of the resulting compound generator should be, as far as possible, exactly the same as the original unfactored generator in all situations, including calls to ``next()``, ``send()``, ``throw()`` and ``close()``.

The semantics in cases of subiterators other than generators has been chosen as a reasonable generalization of the generator case.

Finalization
------------

There was some debate as to whether explicitly finalizing the delegating generator by calling its ``close()`` method while it is suspended at a ``yield from`` should also finalize the subiterator. An argument against doing so is that it would result in premature finalization of the subiterator if references to it exist elsewhere.

Consideration of non-refcounting Python implementations led to the decision that this explicit finalization should be performed, so that explicitly closing a factored generator has the same effect as doing so to an unfactored one in all Python implementations.

The assumption made is that, in the majority of use cases, the subiterator will not be shared.
The rare case of a shared subiterator can be accommodated by means of a wrapper that blocks ``throw()`` and ``send()`` calls, or by using a means other than ``yield from`` to call the subiterator.

Generators as Threads
---------------------

A motivation for generators being able to return values concerns the use of generators to implement lightweight threads. When using generators in that way, it is reasonable to want to spread the computation performed by the lightweight thread over many functions. One would like to be able to call a subgenerator as though it were an ordinary function, passing it parameters and receiving a returned value.

Using the proposed syntax, a statement such as

::

    y = f(x)

where f is an ordinary function, can be transformed into a delegation call

::

    y = yield from g(x)

where g is a generator. One can reason about the behaviour of the resulting code by thinking of g as an ordinary function that can be suspended using a ``yield`` statement.

When using generators as threads in this way, typically one is not interested in the values being passed in or out of the yields. However, there are use cases for this as well, where the thread is seen as a producer or consumer of items. The ``yield from`` expression allows the logic of the thread to be spread over as many functions as desired, with the production or consumption of items occurring in any subfunction, and the items are automatically routed to or from their ultimate source or destination.

Concerning ``throw()`` and ``close()``, it is reasonable to expect that if an exception is thrown into the thread from outside, it should first be raised in the innermost generator where the thread is suspended, and propagate outwards from there; and that if the thread is terminated from outside by calling ``close()``, the chain of active generators should be finalised from the innermost outwards.
Syntax
------

The particular syntax proposed has been chosen as suggestive of its meaning, while not introducing any new keywords and clearly standing out as being different from a plain ``yield``.

Optimisations
-------------

Using a specialised syntax opens up possibilities for optimisation when there is a long chain of generators. Such chains can arise, for instance, when recursively traversing a tree structure. The overhead of passing ``next()`` calls and yielded values down and up the chain can cause what ought to be an O(n) operation to become, in the worst case, O(n\*\*2).

A possible strategy is to add a slot to generator objects to hold a generator being delegated to. When a ``next()`` or ``send()`` call is made on the generator, this slot is checked first, and if it is nonempty, the generator that it references is resumed instead. If it raises StopIteration, the slot is cleared and the main generator is resumed.

This would reduce the delegation overhead to a chain of C function calls involving no Python code execution. A possible enhancement would be to traverse the whole chain of generators in a loop and directly resume the one at the end, although the handling of StopIteration is more complicated then.

Use of StopIteration to return values
-------------------------------------

There are a variety of ways that the return value from the generator could be passed back. Some alternatives include storing it as an attribute of the generator-iterator object, or returning it as the value of the ``close()`` call to the subgenerator. However, the proposed mechanism is attractive for a couple of reasons:

* Using the StopIteration exception makes it easy for other kinds of iterators to participate in the protocol without having to grow an extra attribute or a close() method.

* It simplifies the implementation, because the point at which the return value from the subgenerator becomes available is the same point at which StopIteration is raised.
Delaying until any later time would require storing the return value somewhere.

Criticisms
==========

Under this proposal, the value of a ``yield from`` expression would be derived in a very different way from that of an ordinary ``yield`` expression. This suggests that some other syntax not containing the word ``yield`` might be more appropriate, but no acceptable alternative has so far been proposed.

It has been suggested that some mechanism other than ``return`` in the subgenerator should be used to establish the value returned by the ``yield from`` expression. However, this would interfere with the goal of being able to think of the subgenerator as a suspendable function, since it would not be able to return values in the same way as other functions.

The use of an argument to StopIteration to pass the return value has been criticised as an "abuse of exceptions", without any concrete justification of this claim. In any case, this is only one suggested implementation; another mechanism could be used without losing any essential features of the proposal.

It has been suggested that a different exception, such as GeneratorReturn, should be used instead of StopIteration to return a value. However, no convincing practical reason for this has been put forward, and the addition of a ``value`` attribute to StopIteration mitigates any difficulties in extracting a return value from a StopIteration exception that may or may not have one. Also, using a different exception would mean that, unlike ordinary functions, 'return' without a value in a generator would not be equivalent to 'return None'.

Alternative Proposals
=====================

Proposals along similar lines have been made before, some using the syntax ``yield *`` instead of ``yield from``. While ``yield *`` is more concise, it could be argued that it looks too similar to an ordinary ``yield`` and the difference might be overlooked when reading code.
To the author's knowledge, previous proposals have focused only on yielding values, and thereby suffered from the criticism that the two-line for-loop they replace is not sufficiently tiresome to write to justify a new syntax.

By dealing with the full generator protocol, this proposal provides considerably more benefit.

Additional Material
===================

Some examples of the use of the proposed syntax are available, and also a prototype implementation based on the first optimisation outlined above.

`Examples and Implementation`_

.. _Examples and Implementation: http://www.cosc.canterbury.ac.nz/greg.ewing/python/yield-from/

Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

-- Greg
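The draft above is essentially what shipped in Python 3.3, so its headline behaviours — send() delegation and the return value travelling back through StopIteration — can be exercised there directly. A sketch adapted from the producer/consumer usage the PEP describes (names illustrative):

```python
def accumulate():
    """Subgenerator: receive values via send() until None arrives,
    then return the running total."""
    total = 0
    while True:
        value = yield
        if value is None:
            return total
        total += value

def gather(results):
    """Delegating generator: each completed accumulate() hands its
    return value straight to the 'yield from' expression."""
    while True:
        results.append((yield from accumulate()))

results = []
g = gather(results)
next(g)                      # prime to the innermost yield
for v in [1, 2, 3]:
    g.send(v)                # passed straight through to accumulate()
g.send(None)                 # finishes one accumulate(); its return value lands
assert results == [6]
```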

Greg Ewing wrote:
Here's a new draft of the PEP. I've added a Motivation section and removed any mention of inlining.
I like this version a lot better - reading the two examples on your site helped as well.
There is a new expansion that incorporates recent ideas, including the suggested handling of StopIteration raised by a throw() call (i.e. if it wasn't the one thrown in, treat it as a return value).
The spec for GeneratorExit handling means that the if statement around the raise statement needs an extra condition: a thrown in GeneratorExit should *always* be reraised, even if the subgenerator converts it to StopIteration (which it is allowed to do by PEP 342 and the relevant documentation).
------------------------------------------------------------ In general, the semantics can be described in terms of the iterator protocol as follows:
"iterator protocol and generator API" or "generator protocol" (which is a phrase you already use later in the PEP) would be more accurate, as send() and throw() aren't part of the basic iterator protocol.
Fine Details ------------
The implicit GeneratorExit resulting from closing the delegating generator is treated as though it were passed in using ``throw()``. An iterator having a ``throw()`` method is expected to recognize this as a request to finalize itself.
If a call to the iterator's ``throw()`` method raises a StopIteration exception, and it is *not* the same exception object that was thrown in, its value is returned as the value of the ``yield from`` expression and the delegating generator is resumed.
As mentioned above, I believe this should be overruled in the case of GeneratorExit. Since correctly written generators are permitted to convert GeneratorExit to StopIteration, the 'yield from' expression should detect when that has happened and reraise the original exception.
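The behaviour Nick asks for here is what the final semantics adopted, and what Python 3.3+ implements: even when the subiterator converts GeneratorExit to StopIteration, the delegating generator still sees GeneratorExit. A small check (names illustrative):

```python
log = []

def sub():
    try:
        yield "sub"
    except GeneratorExit:
        return  # converts GeneratorExit to StopIteration, as PEP 342 permits

def delegator():
    try:
        yield from sub()
    except GeneratorExit:
        # The original exception is reraised here despite sub() swallowing it.
        log.append("GeneratorExit reraised in delegator")
        raise

g = delegator()
next(g)          # suspend inside sub()
g.close()        # finalizes sub() first, then reraises GeneratorExit
assert log == ["GeneratorExit reraised in delegator"]
```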
Finalization ------------
There was some debate as to whether explicitly finalizing the delegating generator by calling its ``close()`` method while it is suspended at a ``yield from`` should also finalize the subiterator. An argument against doing so is that it would result in premature finalization of the subiterator if references to it exist elsewhere.
Consideration of non-refcounting Python implementations led to the decision that this explicit finalization should be performed, so that explicitly closing a factored generator has the same effect as doing so to an unfactored one in all Python implementations.
The assumption made is that, in the majority of use cases, the subiterator will not be shared. The rare case of a shared subiterator can be accommodated by means of a wrapper that blocks ``throw()`` and ``send()`` calls, or by using a means other than ``yield from`` to call the subiterator.
With the current semantics (calling close() if throw() isn't available), it is also necessary to block close() in order to share an iterator.

Given the conclusion that shared iterators are actually better handled by looping or explicit next() calls, I'm actually OK with that - really focusing 'yield from' specifically on the ability to factor monolithic generator functions into smaller components is probably a good idea, since mere iteration is already easy. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
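The kind of wrapper being discussed can be sketched in a few lines (class name hypothetical): exposing only plain iteration hides send(), throw() and close() from 'yield from', so finalizing the delegating generator leaves the shared iterator resumable:

```python
class ShieldedIter:
    """Expose plain iteration only; the wrapped iterator's send/throw/close
    are invisible to 'yield from', so it survives the delegator's close()."""
    def __init__(self, iterable):
        self._it = iter(iterable)
    def __iter__(self):
        return self
    def __next__(self):
        return next(self._it)

def numbers():
    yield 1
    yield 2
    yield 3

shared = numbers()

def consumer():
    yield from ShieldedIter(shared)

c = consumer()
assert next(c) == 1
c.close()                    # finalizes consumer, but not shared
assert next(shared) == 2     # shared is still live and resumable
```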

Nick Coghlan wrote:
Since correctly written generators are permitted to convert GeneratorExit to StopIteration, the 'yield from' expression should detect when that has happened and reraise the original exception.
I'll have to think about that a bit, but you're probably right.
it is also necessary to block close() in order to share an iterator.
That's a typo -- I meant to say 'throw' and 'close' there, I think. -- Greg

On Thu, Mar 26, 2009 at 5:21 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Here's a new draft of the PEP. I've added a Motivation section and removed any mention of inlining.
There is a new expansion that incorporates recent ideas, including the suggested handling of StopIteration raised by a throw() call (i.e. if it wasn't the one thrown in, treat it as a return value).
Explicit finalization is performed if the delegating generator is closed, but not when the subiterator completes itself normally.
Submitted to SVN. I'll try to critique later. -- --Guido van Rossum (home page: http://www.python.org/~guido/)

At 08:43 PM 3/26/2009 +1200, Greg Ewing wrote:
Trying to think of a better usage example that combines send() with returning values, I've realized that part of the problem is that I don't actually know of any realistic uses for send() in the first place.
Can anyone point me to any? Maybe it will help to inspire a better example.
Er, well, I don't know what anybody *else* wanted them for, but I wanted them to implement improved trampoline functions, vs. earlier Python versions. ;-) The trampoline example I gave uses send() in order to pass the return values from one generator back into another. Of course, the task object also has a send(), so if you do find another use case for send() in a co-operative context, it should be equally doable with the trampoline.
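PJE's actual trampoline isn't reproduced here, but the pattern he describes — send() carrying a subgenerator's return value back into its caller — can be sketched briefly in Python 3, where 'return value' in a generator raises StopIteration(value). All names are illustrative:

```python
import types

def run(task):
    """Minimal trampoline: a yielded generator is treated as a subroutine
    call, and its return value is sent back into the calling generator."""
    stack = []
    to_send = None
    while True:
        try:
            yielded = task.send(to_send)
        except StopIteration as e:
            if not stack:
                return e.value           # outermost task finished
            task, to_send = stack.pop(), e.value
            continue
        if isinstance(yielded, types.GeneratorType):
            stack.append(task)           # "call" the subgenerator
            task, to_send = yielded, None
        else:
            to_send = yielded            # simplified: echo plain yields back

def add(a, b):
    yield                                # pretend to block once
    return a + b

def main():
    x = yield add(1, 2)
    y = yield add(x, 10)
    return y

assert run(main()) == 13
```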

Greg Ewing wrote:
Nick Coghlan wrote:
Although the PEP may still want to mention how one would write *tests* for these things. Will the test drivers themselves need to be generators participating in some kind of trampoline setup?
I don't see that tests are fundamentally different from any other code that wants to call a value-returning generator and get the value without becoming a generator itself. So if it's to be mentioned in the PEP at all, a general solution might as well be given (whether it's to use a trampoline or just write the necessary next() and except code).
Agreed the problem is more general than just testing - but a test driver is potentially interesting in that you probably want the same test suite to be able to test both normal code and the cooperative multitasking code. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------

Steve Holden wrote:
What about extending the syntax somewhat to
yield expr for x from X
I can't see much advantage that would give you over writing

    for x in X:
        yield expr

There would be little or no speed advantage, since you would no longer be able to shortcut the intermediate generator during next(). -- Greg

At 04:45 PM 3/21/2009 +1000, Nick Coghlan wrote:
I really like the PEP - it's a solid extension of the ideas introduced by PEP 342.
(Replying to you since I haven't seen any other thread on this)

My concern is that allowing 'return value' in generators is going to be confusing, since it effectively causes the return value to "disappear" if you're not using it in this special way with some framework that takes advantage.

However, if you *do* have some framework that takes advantage of generators to do microthreads, then it is most likely already written so as to have things like 'yield Return(value)' to signal a return, and to handle 'yield subgenerator()' without the use of additional syntax. So, I don't really see the point of the PEP.

'yield from' seems marginally useful, but I really dislike making it an expression, rather than a statement. The difference seems just a little too subtle, considering how radically different the behavior is. Overall, it has the feel of jamming a framework into the language, when doing the same thing in a library is pretty trivial.

I'd almost rather see a standard or "reference" trampoline added to the stdlib (preferably with a way to register handling for specialized yielded types for IO/scheduling hooks), than try to cram half a trampoline into the language itself.

P.J. Eby wrote:
My concern is that allowing 'return value' in generators is going to be confusing, since it effectively causes the return value to "disappear" if you're not using it in this special way with some framework that takes advantage.
But part of all this is that you *don't* need a special framework to get the return value -- all you need is a caller that uses a yield-from statement. There are uses for that besides threading systems. -- Greg

2009/3/21 Greg Ewing <greg.ewing@canterbury.ac.nz>:
P.J. Eby wrote:
My concern is that allowing 'return value' in generators is going to be confusing, since it effectively causes the return value to "disappear" if you're not using it in this special way with some framework that takes advantage.
But part of all this is that you *don't* need a special framework to get the return value -- all you need is a caller that uses a yield-from statement. There are uses for that besides threading systems.
Can they be added to the PEP? Personally, I find the proposal appealing, and I don't find the semantics hard to understand (although certainly the expansion given in the "formal semantics" section makes my head hurt ;-)) but I don't see many actual reasons why it's useful. (My own use would most likely to be the trivial "for v in g: yield v" case). More motivating examples would help a lot. Paul.

At 10:21 AM 3/22/2009 +1200, Greg Ewing wrote:
P.J. Eby wrote:
My concern is that allowing 'return value' in generators is going to be confusing, since it effectively causes the return value to "disappear" if you're not using it in this special way with some framework that takes advantage.
But part of all this is that you *don't* need a special framework to get the return value -- all you need is a caller that uses a yield-from statement. There are uses for that besides threading systems.
Such as? I've been wracking my brain trying to come up with any *other* occasion where I'd need -- or even find it useful -- to have one generator yield the contents of another generator to its caller, and then use a separate return value in itself. (I'm thus finding it hard to believe there's a non-contrived example that's not doing I/O, scheduling, or some other form of co-operative multitasking.)

In any case, you didn't address the confusion issue: the inability of generators to return a value is there for a good reason, and adding a return value that doesn't actually return anywhere unless you use it in a yield-from expression -- an expression that both looks like a statement and has control-flow side-effects -- seems both over-complex and an invitation to confusion.

This is different from plain yield expressions, in that plain yield expressions are *symmetric*: the value returned from the yield expression comes from the place where control flow is passed by the yield. That is, 'x = yield y' takes value y, passes control flow to the caller, and then returns a result from the caller. It's like an inverse function call. 'x = yield from y', on the other hand, first passes control to y, then the caller, then y, then the caller, an arbitrary number of times, and then finally returns a value from y, not the caller. This is an awful lot of difference in control flow for only a slight change in syntax -- much more of a difference than the difference between yield statements and yield expressions.

So at present (for whatever those opinions are worth), I'd say -0 on a yield-from *statement* (somewhat useful but maybe not worth bothering with), +0 on a reference trampoline in the stdlib (slightly better than doing nothing at all, but not by much), and -1 on yield-from expressions and return values (confusing complication with very narrowly focused benefit, reasonably doable with library code).
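The control-flow contrast drawn above can be made concrete (this sketch requires Python 3.3+, where the construct eventually landed): in 'x = yield y' the value of x comes from the caller, while in 'x = yield from y' the sub-iterator's yields pass straight through to the caller, and its *return value* becomes the value of the expression.

```python
def sub():
    received = yield "from sub"   # yields through to the outer caller
    return received * 2           # becomes the yield-from result


def delegating():
    result = yield from sub()     # sub's yields bypass this frame
    yield ("final", result)


g = delegating()
print(next(g))       # 'from sub' -- yielded straight through from sub()
print(g.send(21))    # ('final', 42) -- sub() returned 21 * 2
```

Note how the value sent from outside (21) goes to `sub`, not to `delegating`, and the value that finally lands in `result` comes from `sub`'s return statement: this is the asymmetry the message above is describing.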

P.J. Eby wrote:
(I'm thus finding it hard to believe there's a non-contrived example that's not doing I/O, scheduling, or some other form of co-operative multitasking.)
Have you seen my xml parser example?

http://www.cosc.canterbury.ac.nz/greg.ewing/python/yield-from/

Whether you'll consider it contrived or not I don't know (contrivedness being such a subjective property) but it illustrates the style of programming I'm trying to support with the return-value feature.
In any case, you didn't address the confusion issue: the inability of generators to return a value is there for a good reason,
It's there because formerly there was nowhere for the return value to go. If there is somewhere for it to go, the restriction will no longer be needed.

Things like this have happened before. It used to be forbidden to put a yield in a try-finally block, because there was no way to ensure that the finally would be executed. Once a way was found to do that, the restriction was lifted.

As for confusion, we ignore the return values of function calls all the time, without worrying that someone might be confused by the fact that their return value doesn't go anywhere. And that's the right way to think of a yield-from expression -- as a kind of function call, not a kind of yield.

If there's anything confusing, it's the presence of the word 'yield'. Its only virtue is that it gives a clue that the construct has something to do with generators, but you'll have to RTM to find out exactly what. Nobody has thus far suggested any better name, however.

-- Greg
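The "somewhere for it to go" argument can be illustrated with a small sketch (runnable on Python 3.3+, after the PEP was accepted): a generator's return value travels out via StopIteration and is picked up by a yield-from in the caller, exactly like an ordinary function result.

```python
def summer():
    # Accumulate values sent in; sending None ends the computation.
    total = 0
    while True:
        v = yield
        if v is None:
            return total          # the "somewhere for it to go"
        total += v


def caller():
    result = yield from summer()  # picks up summer()'s return value
    yield result                  # expose it for this demonstration


g = caller()
next(g)          # prime: control is now inside summer()
g.send(1)
g.send(2)
print(g.send(None))  # 3 -- summer() returned, caller() received it
```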

At 08:11 PM 3/22/2009 +1200, Greg Ewing wrote:
P.J. Eby wrote:
(I'm thus finding it hard to believe there's a non-contrived example that's not doing I/O, scheduling, or some other form of co-operative multitasking.)
Have you seen my xml parser example?
http://www.cosc.canterbury.ac.nz/greg.ewing/python/yield-from/
Whether you'll consider it contrived or not I don't know (contrivedness being such a subjective property) but it illustrates the style of programming I'm trying to support with the return-value feature.
Actually, I find it much easier to follow what's going on in the parser *without* yield-from, and I don't see what benefit was obtained by the additional complication of using send().
In any case, you didn't address the confusion issue: the inability of generators to return a value is there for a good reason,
It's there because formerly there was nowhere for the return value to go. If there is somewhere for it to go, the restriction will no longer be needed.
But that's begging the question (in the original meaning of the phrase) of why we *want* to have two ways to return data from a generator.
As for confusion, we ignore the return values of function calls all the time, without worrying that someone might be confused by the fact that their return value doesn't go anywhere. And that's the right way to think of a yield-from expression -- as a kind of function call, not a kind of yield.
But it's not a function call -- it's multiple *inverted* function calls, followed by special handling of the last iteration of the iterator it takes. The control flow is also hard to explain, as is the implementation.
If there's anything confusing, it's the presence of the word 'yield'. Its only virtue is that it gives a clue that the construct has something to do with generators, but you'll have to RTM to find out exactly what. Nobody has thus far suggested any better name, however.
Perhaps this is because it's not that interesting of a feature. As I said, I wouldn't fight a yield-from statement without all this return-value stuff, although it still seems like too much trouble to me.
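The "disappearing" return value at issue in this exchange is easy to demonstrate in Python 3.3+, where the feature was ultimately adopted: ordinary iteration silently discards a generator's return value, and it only surfaces through yield from or by inspecting the StopIteration instance directly.

```python
def gen():
    yield 1
    yield 2
    return "done"            # invisible to ordinary iteration


print(list(gen()))           # [1, 2] -- the return value vanishes


def wrapper():
    result = yield from gen()  # the one place it reappears
    yield result


print(list(wrapper()))       # [1, 2, 'done']

# The mechanism underneath: the value rides on StopIteration.
g = gen()
next(g)
next(g)
try:
    next(g)
except StopIteration as e:
    print(e.value)           # 'done'
```

This is the asymmetry behind the objection: whether 'return value' is a feature or a trap depends on whether the consumer happens to use yield from.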

Greg Ewing wrote:
As for confusion, we ignore the return values of function calls all the time, without worrying that someone might be confused by the fact that their return value doesn't go anywhere. And that's the right way to think of a yield-from expression -- as a kind of function call, not a kind of yield.
If there's anything confusing, it's the presence of the word 'yield'. Its only virtue is that it gives a clue that the construct has something to do with generators, but you'll have to RTM to find out exactly what. Nobody has thus far suggested any better name, however.
If the yield in 'yield from' does not make the function a generator, then perhaps 'return from' would be clearer.
participants (11)

- Antoine Pitrou
- Greg Ewing
- Guido van Rossum
- Jesse Noller
- Michele Simionato
- Nick Coghlan
- P.J. Eby
- Paul Moore
- Stefan Rank
- Steve Holden
- Terry Reedy