Revised**7 PEP on Yield-From
I've made another couple of tweaks to the formal semantics (so as not to over-specify when the iterator methods are looked up).
Latest version of the PEP, together with the prototype implementation and other related material, is available here:
http://www.cosc.canterbury.ac.nz/greg.ewing/python/yield-from/
-- Greg
Hi Greg,
Greg Ewing wrote:
I've made another couple of tweaks to the formal semantics (so as not to over-specify when the iterator methods are looked up).
Latest version of the PEP, together with the prototype implementation and other related material, is available here:
http://www.cosc.canterbury.ac.nz/greg.ewing/python/yield-from/
I am working on my own patch, based on rev2 of yours from the above link and the algorithm I have been going on about. It is currently working, and is even slightly faster than yours in every test I have (much faster in some, that was the whole point). I still need to do a bit of cleanup before I throw it to the wolves though...
Anyway, I have a few questions/comments to your patch.
1. There is a small refcounting bug in your gen_iternext function. On success, it returns without decref'ing "yf".
2. In the comment for "gen_undelegate" you mention "certain recursive situations" where a generator may lose its frame before we get a chance to clear f_yieldfrom. Can you elaborate? I can't think of any, and haven't been able to catch any with asserts in a debug-build using my own patch. However, if they exist I will need to handle it somehow and knowing what they are would certainly help.
3. It looks like you are not calling "close" properly from "next", "send" and "throw". This makes no difference when delegating to a generator (the missing close would be a no-op), but would be an issue when delegating to a non-generator.
4. It looks like your "gen_close" does not try to throw a GeneratorExit before calling close when delegating to a non-generator. I think it should to match the description of "close" in PEP 342 and the expansion in your PEP.
Other than that, great work. It would have taken me ages to figure out all the necessary changes to the grammar, parser, ... and so on by myself. In fact I probably wouldn't even have tried.
I hope this helps, and promise to publish my own version of the patch once I think it is fit for public consumption.
Best regards
- Jacob
Jacob Holm wrote:
1. There is a small refcounting bug in your gen_iternext function. On success, it returns without decref'ing "yf".
Thanks, I'll fix that.
2. In the comment for "gen_undelegate" you mention "certain recursive situations" where a generator may lose its frame before we get a chance to clear f_yieldfrom. Can you elaborate?
I can't remember the details, but I definitely ran into one during development, which is why I added that function. Have you tried running all of my tests?
3. It looks like you are not calling "close" properly from "next", "send" and "throw".
I'm not sure what you mean by that. Can you provide an example that doesn't behave as expected?
4. It looks like your "gen_close" does not try to throw a GeneratorExit before calling close when delegating to a non-generator.
I'm not sure what you mean here either. Regardless of the type of sub-iterator, it should end up getting to the part which does
    if (!PyErr_Occurred())
        PyErr_SetNone(PyExc_GeneratorExit);
Again, an example that doesn't behave properly would help.
-- Greg
Greg Ewing wrote:
Jacob Holm wrote:
2. In the comment for "gen_undelegate" you mention "certain recursive situations" where a generator may lose its frame before we get a chance to clear f_yieldfrom. Can you elaborate?
I can't remember the details, but I definitely ran into one during development, which is why I added that function. Have you tried running all of my tests?
Yup. All tests pass, except for your test19 where my traceback is different.
--- expected/test19.py.out  2009-02-22 09:51:26.000000000 +0100
+++ actual/test19.py.out    2009-03-20 01:50:28.000000000 +0100
@@ -7,8 +7,8 @@
 Traceback (most recent call last):
   File "test19.py", line 20, in <module>
     for y in gi:
-  File "test19.py", line 16, in g2
-    yield from gi
   File "test19.py", line 9, in g1
     yield from g2()
+  File "test19.py", line 16, in g2
+    yield from gi
 ValueError: generator already executing
I am not quite sure why that is, but I actually think mine is better.
3. It looks like you are not calling "close" properly from "next", "send" and "throw".
I'm not sure what you mean by that. Can you provide an example that doesn't behave as expected?
Sure, see below.
4. It looks like your "gen_close" does not try to throw a GeneratorExit before calling close when delegating to a non-generator.
I'm not sure what you mean here either. Regardless of the type of sub-iterator, it should end up getting to the part which does
if (!PyErr_Occurred()) PyErr_SetNone(PyExc_GeneratorExit);
Again, an example that doesn't behave properly would help.
Of course. Here is a demonstration/test...

    class iterator(object):
        """Simple iterator that counts to n while writing what is done to it"""

        def __init__(self, n):
            self.ctr = iter(xrange(n))

        def __iter__(self):
            return self

        def close(self):
            print "Close"

        def next(self):
            print "Next"
            return self.ctr.next()

        def send(self, val):
            print "Send", val
            return self.ctr.next()

        def throw(self, *args):
            print "Throw:", args
            return self.ctr.next()

    def generator(n):
        yield from iterator(n)

    g = generator(1)
    g.next()
    try:
        g.next()
    except Exception, e:
        print type(e)
    else:
        print 'No exception'
    del g
    print '--'

    g = generator(1)
    g.next()
    try:
        g.send(1)
    except Exception, e:
        print type(e)
    else:
        print 'No exception'
    del g
    print '--'

    g = generator(1)
    g.next()
    try:
        g.throw(ValueError)
    except Exception, e:
        print type(e)
    else:
        print 'No exception'
    del g
    print '--'

    g = generator(2)
    g.next()
    try:
        g.next()
    except Exception, e:
        print type(e)
    else:
        print 'No exception'
    del g
    print '--'

    g = generator(2)
    g.next()
    try:
        g.send(1)
    except Exception, e:
        print type(e)
    else:
        print 'No exception'
    del g
    print '--'

    g = generator(2)
    g.next()
    try:
        g.throw(ValueError)
    except Exception, e:
        print type(e)
    else:
        print 'No exception'
    del g
    print '--'

And here is the output I would expect based on the relevant PEPs.

    Next
    Next
    Close
    <type 'exceptions.StopIteration'>
    --
    Next
    Send 1
    Close
    <type 'exceptions.StopIteration'>
    --
    Next
    Throw: (<type 'exceptions.ValueError'>,)
    Close
    <type 'exceptions.StopIteration'>
    --
    Next
    Next
    No exception
    Throw: (<type 'exceptions.GeneratorExit'>,)
    Close
    --
    Next
    Send 1
    No exception
    Throw: (<type 'exceptions.GeneratorExit'>,)
    Close
    --
    Next
    Throw: (<type 'exceptions.ValueError'>,)
    No exception
    Throw: (<type 'exceptions.GeneratorExit'>,)
    Close
    --

However, when I run this using your patch, the first 3 "Close" messages, and the 3 "GeneratorExit" messages are missing.
Did that help?
- Jacob
Sorry about the garbled diff... Here is the real diff between expected and actual output when I run my patch on test19.
- Jacob
--- expected/test19.py.out  2009-02-22 09:51:26.000000000 +0100
+++ actual/test19.py.out    2009-03-20 02:06:52.000000000 +0100
@@ -7,8 +7,8 @@
 Traceback (most recent call last):
   File "test19.py", line 20, in <module>
     for y in gi:
-  File "test19.py", line 16, in g2
-    yield from gi
   File "test19.py", line 9, in g1
     yield from g2()
+  File "test19.py", line 16, in g2
+    yield from gi
 ValueError: generator already executing
Jacob Holm wrote:
Of course. Here is a demonstration/test...
However, when I run this using your patch, the first 3 "Close" messages, and the 3 "GeneratorExit" messages are missing.
I don't understand why you expect to get the output you present. Can you explain your reasoning with reference to the relevant sections of the relevant PEPs that you mention? -- Greg
Greg Ewing wrote:
Jacob Holm wrote:
Of course. Here is a demonstration/test...
However, when I run this using your patch, the first 3 "Close" messages, and the 3 "GeneratorExit" messages are missing.
I don't understand why you expect to get the output you present. Can you explain your reasoning with reference to the relevant sections of the relevant PEPs that you mention?
Starting with "Close". The only reason I expect *any* "Close" message is that the expansion in your PEP explicitly calls close in the finally clause. It makes no distinction between different ways of exiting the block, so I'd expect one call for each time it is exited.
The "GeneratorExit", I expect due to the description of close in PEP 342:
    def close(self):
        try:
            self.throw(GeneratorExit)
        except (GeneratorExit, StopIteration):
            pass
        else:
            raise RuntimeError("generator ignored GeneratorExit")
When the generator is closed (due to the del g lines in the example), this says to throw a GeneratorExit and handle the result. If we do this manually, the throw will be delegated to the iterator, which will print the "Throw: (<type 'exceptions.GeneratorExit'>,)" message.
Do I make sense yet?
- Jacob
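As an illustration of the PEP 342 description quoted above, here is a minimal sketch in Python 3 syntax (the thread's own code is Python 2). The helper name `close_like_pep342` and the two sample generators are hypothetical, introduced only for this example; the helper mirrors the quoted pseudo-code rather than any real implementation.

```python
def close_like_pep342(gen):
    """Mimic the PEP 342 description of close() using only throw()."""
    try:
        gen.throw(GeneratorExit)
    except (GeneratorExit, StopIteration):
        pass  # the generator exited, which is what close() wants
    else:
        raise RuntimeError("generator ignored GeneratorExit")


events = []


def well_behaved():
    try:
        yield 1
    finally:
        events.append("cleanup")


def stubborn():
    ignored = False
    while True:
        try:
            yield 1
        except GeneratorExit:
            if ignored:
                raise
            ignored = True  # swallow the first exit request and keep yielding


g = well_behaved()
next(g)
close_like_pep342(g)  # GeneratorExit leaves throw() and is swallowed by the helper

s = stubborn()
next(s)
try:
    close_like_pep342(s)
    outcome = "closed"
except RuntimeError as exc:
    outcome = str(exc)
```

The second generator yields again after being asked to exit, so the helper takes the `else` branch and reports the error, just as the quoted description says.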
Jacob Holm wrote:
The "GeneratorExit", I expect due to the description of close in PEP 342:
    def close(self):
        try:
            self.throw(GeneratorExit)
        except (GeneratorExit, StopIteration):
            pass
        else:
            raise RuntimeError("generator ignored GeneratorExit")
When the generator is closed (due to the del g lines in the example), this says to throw a GeneratorExit and handle the result. If we do this manually, the throw will be delegated to the iterator, which will print the "Throw: (<type 'exceptions.GeneratorExit'>,)" message.
It turns out I was wrong about the GeneratorExit. What I missed is that starting from 2.6, GeneratorExit no longer subclasses Exception, and so it wouldn't be thrown at the iterator. So move along, nothing to see here ... :) - Jacob
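The point about the exception hierarchy can be checked directly. The sketch below uses Python 3 syntax (an assumption; the thread targets Python 2.6), and the helper name `caught_by_except_exception` is made up for the example.

```python
exit_requests = (GeneratorExit, KeyboardInterrupt, SystemExit)

# None of these derive from Exception, but all derive from BaseException.
outside_exception = [not issubclass(e, Exception) for e in exit_requests]
inside_base = [issubclass(e, BaseException) for e in exit_requests]


def caught_by_except_exception(exc_type):
    """Report whether an `except Exception` clause would intercept exc_type."""
    try:
        raise exc_type
    except Exception:
        return True
    except BaseException:
        return False
```

So an `except Exception` clause in an expansion never sees a GeneratorExit, which is why it would not be forwarded to the subiterator by that clause.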
Jacob Holm wrote:
The "GeneratorExit", I expect due to the description of close in PEP 342:
    def close(self):
        try:
            self.throw(GeneratorExit)
        except (GeneratorExit, StopIteration):
            pass
        else:
            raise RuntimeError("generator ignored GeneratorExit")
Hmmm... well, my PEP kind of supersedes that when a yield-from is in effect, because it specifies that the subiterator is finalized first by attempting to call its 'close' method, not by throwing GeneratorExit into it. After that, GeneratorExit is used to finalize the delegating generator. The reasoning is that GeneratorExit is an implementation detail of generators, not something iterators in general should be expected to deal with. -- Greg
Greg Ewing wrote:
Jacob Holm wrote:
The "GeneratorExit", I expect due to the description of close in PEP 342:
    def close(self):
        try:
            self.throw(GeneratorExit)
        except (GeneratorExit, StopIteration):
            pass
        else:
            raise RuntimeError("generator ignored GeneratorExit")
Hmmm... well, my PEP kind of supersedes that when a yield-from is in effect, because it specifies that the subiterator is finalized first by attempting to call its 'close' method, not by throwing GeneratorExit into it. After that, GeneratorExit is used to finalize the delegating generator.
The reasoning is that GeneratorExit is an implementation detail of generators, not something iterators in general should be expected to deal with.
As already mentioned in another mail to this list (maybe you missed it?), the expansion in your PEP actually has the behaviour you expect for the GeneratorExit example because GeneratorExit doesn't inherit from Exception. No need to redefine anything here. Your patch is right, I was wrong, end of story... The other mismatch, concerning the missing "close" calls to the iterator, I still believe to be an issue. It is debatable whether the issue is mostly with the PEP or the implementation, but they don't match up as it is... - Jacob
Jacob Holm wrote:
the expansion in your PEP actually has the behaviour you expect for the GeneratorExit example because GeneratorExit doesn't inherit from Exception.
That's an accident, though, and it's possible I should have specified BaseException there. I still consider the explanation I gave to be the true one.
The other mismatch, concerning the missing "close" calls to the iterator, I still believe to be an issue.
Can you elaborate on that? I thought at first you were expecting the implicit close of the generator that happens before it's deallocated to be passed on to the subiterator, but some of your examples seem to have the close happening *before* the del gen, so I'm confused.
-- Greg
Greg Ewing wrote:
Jacob Holm wrote:
the expansion in your PEP actually has the behaviour you expect for the GeneratorExit example because GeneratorExit doesn't inherit from Exception.
That's an accident, though, and it's possible I should have specified BaseException there. I still consider the explanation I gave to be the true one.
In that case, I think a clarification in the PEP would be in order. I like the fact that the PEP-342 description of close does the right thing though. If you want BaseException instead of Exception in the PEP, maybe you could replace the:
    except Exception, _e:
line with:
    except GeneratorExit:
        raise
    except BaseException, _e:
This would make it clearer that the behavior of close is intentional, and would still allow delegating the throw of any exception not inheriting from GeneratorExit to the subiterator.
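To make the effect of that handler ordering concrete, here is a small sketch in Python 3 syntax (an assumption; the thread uses Python 2). The function `route` and its return strings are hypothetical labels for the two branches, not part of the PEP.

```python
def route(exc):
    """Return which branch of the suggested handler pair sees exc."""
    try:
        try:
            raise exc
        except GeneratorExit:
            raise  # first clause: re-raised, never offered to the subiterator
        except BaseException:
            return "forwarded to subiterator"
    except GeneratorExit:
        return "re-raised in the delegating generator"
```

Because the GeneratorExit clause comes first, only that exception bypasses the forwarding branch; KeyboardInterrupt, SystemExit and ordinary exceptions all reach the BaseException clause.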
The other mismatch, concerning the missing "close" calls to the iterator, I still believe to be an issue.
Can you elaborate on that? I thought at first you were expecting the implicit close of the generator that happens before it's deallocated to be passed on to the subiterator, but some of your examples seem to have the close happening *before* the del gen, so I'm confused.
Yes, I can see that the use of implicit close in that example was a mistake, and that I should have added a few more output lines to clarify the intent. The close is definitely intended to happen before the del in the examples. I have a better example here, with inline comments explaining what I think should happen at critical points (and why):

    class iterator(object):
        """Simple iterator that counts to n while writing what is done to it"""

        def __init__(self, n):
            self.ctr = iter(xrange(n))

        def __iter__(self):
            return self

        def close(self):
            print "Close"

        def next(self):
            print "Next"
            return self.ctr.next()

        # no send method!
        # no throw method!

    def generator(n):
        try:
            print "Getting first value from iterator"
            result = yield from iterator(n)
            print "Iterator returned", result
        finally:
            print "Generator closing"

    g = generator(1)
    g.next()
    try:
        print "Calling g.next()"
        # This causes a StopIteration in iterator.next().  After grabbing
        # the value in the "except StopIteration" clause of the PEP
        # expansion, the "finally" clause calls iterator.close().  Any
        # other exception raised by next (or by send or throw if the
        # iterator had those) would also be handled by the finally
        # clause.  For well-behaved iterators, these calls to close would
        # be no-ops, but they are still required by the PEP as written.
        g.next()
    except Exception, e:
        print type(e)
    else:
        print 'No exception'
    # This close should be a no-op.  The exception we just caught should
    # have already closed the generator.
    g.close()
    print '--'

    g = generator(1)
    g.next()
    try:
        print "Calling g.send(42)"
        # This causes an AttributeError when looking up the "send" method.
        # The finally clause from the PEP expansion makes sure
        # iterator.close() is called.  This call is *not* expected to be a
        # no-op.
        g.send(42)
    except Exception, e:
        print type(e)
    else:
        print 'No exception'
    # This close should be a no-op.  The exception we just caught should
    # have already closed the generator.
    g.close()
    print '--'

    g = generator(1)
    g.next()
    try:
        print "Calling g.throw(ValueError)"
        # Since iterator does not have a "throw" method, the ValueError is
        # raised directly in the yield-from expansion in the generator.
        # The finally clause ensures that iterator.close() is called.
        # This call is *not* expected to be a no-op.
        g.throw(ValueError)
    except Exception, e:
        print type(e)
    else:
        print 'No exception'
    # This close should be a no-op.  The exception we just caught should
    # have already closed the generator.
    g.close()
    print '--'

    g = generator(1)
    g.next()
    try:
        print "Calling g.throw(StopIteration(42))"
        # The iterator still does not have a "throw" method, so the
        # StopIteration is raised directly in the yield-from expansion.
        # Then the exception is caught and converted to a value for the
        # yield-from expression.  Before the generator sees the value, the
        # finally clause makes sure that iterator.close() is called.  This
        # call is *not* expected to be a no-op.
        g.throw(StopIteration(42))
    except Exception, e:
        print type(e)
    else:
        print 'No exception'
    # This close should be a no-op.  The exception we just caught should
    # have already closed the generator.
    g.close()
    print '--'

There are really four examples here. The first one is essentially the same as last time, I just expanded the output a bit. The next two examples are corner cases where the missing close makes a real difference, even for well-behaved iterators (this is not the case in the first example). The fourth example catches a bug in the current version of my patch, and shows a potentially interesting use of an iterator without a send method in a yield-from expression.
The issue I have with your patch is that iterator.close() is not called in any of the four examples, even though my reading of the PEP suggests it should be. (I have confirmed that my reading matches the PEP by manually replacing the yield-from in the generator with the expansion from the PEP, just to be sure...)
The expected output is:

    Getting first value from iterator
    Next
    Calling g.next()
    Next
    Close
    Iterator returned None
    Generator closing
    <type 'exceptions.StopIteration'>
    --
    Getting first value from iterator
    Next
    Calling g.send(42)
    Close
    Generator closing
    <type 'exceptions.AttributeError'>
    --
    Getting first value from iterator
    Next
    Calling g.throw(ValueError)
    Close
    Generator closing
    <type 'exceptions.ValueError'>
    --
    Getting first value from iterator
    Next
    Calling g.throw(StopIteration(42))
    Close
    Iterator returned 42
    Generator closing
    <type 'exceptions.StopIteration'>
    --
Jacob Holm wrote:
# This causes a StopIteration in iterator.next(). After grabbing # the value in the "except StopIteration" clause of the PEP # expansion, the "finally" clause calls iterator.close().
Okay, I see what you mean now. That's a bug in the expansion. Once an iterator has raised StopIteration, it has presumably already finalized itself, so calling its close() method shouldn't be necessary, and I hadn't intended that it should be called in that case. I'll update the PEP accordingly. -- Greg
I'm thinking about replacing the expansion with the following, which hopefully fixes a couple of concerns that were raised recently without breaking anything else.
Can anyone see any remaining ways in which it doesn't match the textual description in the Proposal section?
(It still isn't *quite* right, because it doesn't distinguish between a GeneratorExit explicitly thrown in and one resulting from calling close() on the delegating generator. I may need to revise the text and/or my implementation on that point, because I want the inline-expansion interpretation to hold.)

    _i = iter(EXPR)
    try:
        _u = _i.next()
    except StopIteration, _e:
        _r = _e.value
    else:
        while 1:
            try:
                _v = yield _u
            except GeneratorExit:
                _m = getattr(_i, 'close', None)
                if _m is not None:
                    _m()
                raise
            except BaseException, _e:
                _m = getattr(_i, 'throw', None)
                if _m is not None:
                    _u = _m(_e)
                else:
                    raise
            else:
                try:
                    if _v is None:
                        _u = _i.next()
                    else:
                        _u = _i.send(_v)
                except StopIteration, _e:
                    _r = _e.value
                    break
    RESULT = _r

-- Greg
Greg Ewing wrote:
I'm thinking about replacing the expansion with the following, which hopefully fixes a couple of concerns that were raised recently without breaking anything else.
Can anyone see any remaining ways in which it doesn't match the textual description in the Proposal section?
(It still isn't *quite* right, because it doesn't distinguish between a GeneratorExit explicitly thrown in and one resulting from calling close() on the delegating generator. I may need to revise the text and/or my implementation on that point, because I want the inline-expansion interpretation to hold.)
    _i = iter(EXPR)
    try:
        _u = _i.next()
    except StopIteration, _e:
        _r = _e.value
    else:
        while 1:
            try:
                _v = yield _u
            except GeneratorExit:
                _m = getattr(_i, 'close', None)
                if _m is not None:
                    _m()
                raise
            except BaseException, _e:
                _m = getattr(_i, 'throw', None)
                if _m is not None:
                    _u = _m(_e)
                else:
                    raise
            else:
                try:
                    if _v is None:
                        _u = _i.next()
                    else:
                        _u = _i.send(_v)
                except StopIteration, _e:
                    _r = _e.value
                    break
    RESULT = _r

I'd adjust the inner exception handlers to exploit the fact that SystemExit and GeneratorExit don't inherit from Exception:

    _i = iter(EXPR)
    try:
        _u = _i.next()
    except StopIteration, _e:
        _r = _e.value
    else:
        while 1:
            try:
                _v = yield _u
            except Exception, _e:
                _m = getattr(_i, 'throw', None)
                if _m is not None:
                    _u = _m(_e)
                else:
                    raise
            except:
                # Covers SystemExit, GeneratorExit and
                # anything else that doesn't inherit
                # from Exception
                _m = getattr(_i, 'close', None)
                if _m is not None:
                    _m()
                raise
            else:
                try:
                    if _v is None:
                        _u = _i.next()
                    else:
                        _u = _i.send(_v)
                except StopIteration, _e:
                    _r = _e.value
                    break
    RESULT = _r

I think Antoine and PJE are right that the PEP needs some more actual use cases though.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
Nick Coghlan wrote:
I'd adjust the inner exception handlers to exploit the fact that SystemExit and GeneratorExit don't inherit from Exception:
But then anything thrown in that didn't inherit from Exception would bypass giving the subiterator a chance to handle it, which doesn't seem right.
The more I think about this, the more I'm wondering whether I shouldn't ever try to call close() on the subiterator at all, and just rely on it to finalize itself when it's deallocated. That would solve all problems concerning when and if close() calls should be made (the answer would be "never"). It would also avoid the problem of a partially exhausted iterator that's still in use by something else getting prematurely finalized, which is another thing that's been bothering me.
Here's another expansion based on that idea. When we've finished with the subiterator for whatever reason -- it raised StopIteration, something got thrown in, we got closed ourselves, etc. -- we simply drop our reference to it. If that causes it to be deallocated, it's responsible for cleaning itself up however it sees fit.

    _i = iter(EXPR)
    try:
        try:
            _u = _i.next()
        except StopIteration, _e:
            _r = _e.value
        else:
            while 1:
                try:
                    _v = yield _u
                except BaseException, _e:
                    _m = getattr(_i, 'throw', None)
                    if _m is not None:
                        _u = _m(_e)
                    else:
                        raise
                else:
                    try:
                        if _v is None:
                            _u = _i.next()
                        else:
                            _u = _i.send(_v)
                    except StopIteration, _e:
                        _r = _e.value
                        break
    finally:
        del _i
    RESULT = _r
I think Antoine and PJE are right that the PEP needs some more actual use cases though.
The examples I have are a bit big to put in the PEP itself, but I can include some links. -- Greg
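The "drop the reference and let it finalize itself" idea can be seen in a small sketch, written in Python 3 syntax (an assumption; the thread targets Python 2). The names `sub` and `log` are invented for the example. On CPython the cleanup runs as soon as the last reference goes away; the `gc.collect()` call is only there as a nudge for implementations without reference counting, where the timing is not guaranteed.

```python
import gc

log = []


def sub():
    try:
        yield 1
        yield 2
    finally:
        log.append("sub finalized")


it = sub()
next(it)       # suspend the generator inside its try block
del it         # drop the only reference to the suspended generator
gc.collect()   # has no extra effect on CPython, where del already finalized it
```

The generator is closed as part of being deallocated, which raises GeneratorExit inside it and runs its `finally` clause, without the holder of the reference ever calling close() explicitly.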
Nick Coghlan wrote:
I'd adjust the inner exception handlers to exploit the fact that SystemExit and GeneratorExit don't inherit from Exception:
    [...]
    except:
        # Covers SystemExit, GeneratorExit and
        # anything else that doesn't inherit
        # from Exception
        _m = getattr(_i, 'close', None)
        if _m is not None:
            _m()
        raise
This feels better to me too. Though it seems that _i.throw would be more appropriate than _i.close (except call _i.close if there is no _i.throw -- is it possible to have a close and not a throw?).
I like the idea that "finally" (in try/finally) means finally and not "maybe finally" (which boils down to finally in CPython due to the reference counting collector, but maybe finally in Jython, IronPython or PyPy).
-bruce frederiksen
Greg Ewing wrote:
Jacob Holm wrote:
# This causes a StopIteration in iterator.next(). After grabbing # the value in the "except StopIteration" clause of the PEP # expansion, the "finally" clause calls iterator.close().
Okay, I see what you mean now. That's a bug in the expansion. Once an iterator has raised StopIteration, it has presumably already finalized itself, so calling its close() method shouldn't be necessary, and I hadn't intended that it should be called in that case.
close() *should* still be called in that case - the current expansion in the PEP is correct. It is the *iterator's* job to make sure that multiple calls to close() (or calling close() on a finished iterator) don't cause problems. The syntax shouldn't be trying to second guess whether or not calling close() is necessary or not - it should just be calling it, period.
>>> def gen():
...     yield 1
...
>>> g = gen()
>>> g.next()
1
>>> g.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>> g.close()
>>> g.close()
>>> g2 = gen()
>>> g2.close()
>>> g2.close()
>>> g3 = gen()
>>> g3.next()
1
>>> g3.close()
>>> g3.close()
Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
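The same observation, that repeated close() calls are harmless for generators in any state, can be written as a script. This is a sketch in Python 3 syntax (an assumption; the session above is Python 2), and the helper `is_exhausted` is hypothetical.

```python
def gen():
    yield 1


finished = gen()
next(finished)
try:
    next(finished)       # runs the generator to completion
except StopIteration:
    pass
finished.close()
finished.close()         # closing a finished generator twice is fine

unstarted = gen()
unstarted.close()
unstarted.close()        # so is closing one that never ran

suspended = gen()
next(suspended)
suspended.close()
suspended.close()        # and one that was suspended at a yield


def is_exhausted(g):
    """True if the generator yields nothing more."""
    return next(g, "exhausted") == "exhausted"
```

None of the six close() calls raises, and all three generators are exhausted afterwards.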
Nick Coghlan wrote:
The syntax shouldn't be trying to second guess whether or not calling close() is necessary or not - it should just be calling it, period.
But *why* should it be called? Just as calling close() after the iterator has finished shouldn't do any harm, *not* doing so shouldn't do any harm either, and some implementation strategies (my current one included) would have to go out of their way to call close() in that case. -- Greg
I'm having trouble making up my mind how GeneratorExit should be handled. My feeling is that GeneratorExit is a peculiarity of generators that other kinds of iterators shouldn't have to know about. So, if you close() a generator, that shouldn't imply throwing GeneratorExit into the subiterator -- rather, the subiterator should simply be dropped and then the delegating generator finalized as usual. If the subiterator happens to be another generator, dropping the last reference to it will cause it to be closed, in which case it will raise its own GeneratorExit. Other kinds of iterators can finalize themselves however they see fit, and don't need to pretend they're generators and understand GeneratorExit. For consistency, this implies that a GeneratorExit explicitly thrown in using throw() shouldn't be forwarded to the subiterator either, even if it has a throw() method. To do otherwise would require making a distinction that can't be expressed in the Python expansion. Also, it seems elegant to preserve the property that if g is a generator then g.close() and g.throw(GeneratorExit) are exactly equivalent. What do people think about this? -- Greg
Greg Ewing wrote:
To do otherwise would require making a distinction that can't be expressed in the Python expansion. Also, it seems elegant to preserve the property that if g is a generator then g.close() and g.throw(GeneratorExit) are exactly equivalent.
What do people think about this?
That whole question is why I suggested rephrasing the question of which exceptions are passed to the subiterator in Exception vs BaseException terms. Apart from Exception itself, the only acknowledged direct subclasses of BaseException are KeyboardInterrupt, SystemExit and GeneratorExit. The purpose of those exceptions is to say "drop what you're doing and bail out any which way you can". Terminating the outermost generator in those cases and letting the subiterators clean up as best they can sounds like a perfectly reasonable option to me. The alternative is to catch BaseException and throw absolutely everything (including GeneratorExit) into the subiterator. The in-between options that you're describing would appear to just complicate the semantics to no great purpose.
Note that you may also be pursuing a false consistency here, since g.close() has never been equivalent to g.throw(GeneratorExit), as the latter propagates the exception back into the current scope while the former suppresses it (example was run using 2.5.2):
>>> def gen(): yield
...
>>> g = gen()
>>> g.next()
>>> g.close()
>>> g2 = gen()
>>> g2.next()
>>> g2.throw(GeneratorExit)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in gen
GeneratorExit
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
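The difference between the two calls can also be captured without a traceback. A minimal sketch in Python 3 syntax (an assumption; the session above was run on 2.5.2), with the flag names chosen for the example:

```python
def gen():
    yield


g = gen()
next(g)
try:
    g.close()                 # GeneratorExit is raised inside and suppressed
    close_raised = False
except GeneratorExit:
    close_raised = True

g2 = gen()
next(g2)
try:
    g2.throw(GeneratorExit)   # same exception, but it propagates to the caller
    throw_raised = False
except GeneratorExit:
    throw_raised = True
```

close() returns normally while throw(GeneratorExit) raises in the calling scope, which is the asymmetry being pointed out.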
Nick Coghlan wrote:
Greg Ewing wrote:
To do otherwise would require making a distinction that can't be expressed in the Python expansion. Also, it seems elegant to preserve the property that if g is a generator then g.close() and g.throw(GeneratorExit) are exactly equivalent.
What do people think about this?
That whole question is why I suggested rephrasing the question of which exceptions are passed to the subiterator in Exception vs BaseException terms. Apart from Exception itself, the only acknowledged direct subclasses of BaseException are KeyboardInterrupt, SystemExit and GeneratorExit. The purpose of those exceptions is to say "drop what you're doing and bail out any which way you can". Terminating the outermost generator in those cases and letting the subiterators clean up as best they can sounds like a perfectly reasonable option to me. The alternative is to catch BaseException and throw absolutely everything (including GeneratorExit) into the subiterator. The in-between options that you're describing would appear to just complicate the semantics to no great purpose.
Well, since GeneratorExit is specifically about generators, I don't see a problem in special-casing that one and just let everything else be thrown at the subgenerator. I would also be Ok with just throwing everything (including GeneratorExit) there, as that makes the implementation of throw a bit simpler.
Note that you may also be pursuing a false consistency here, since g.close() has never been equivalent to g.throw(GeneratorExit), as the latter propagates the exception back into the current scope while the former suppresses it (example was run using 2.5.2):
I believe that the "exact equivalence" Greg was talking about is the description of close from PEP 342. It is nice that the semantics of close can be described so easily in terms of throw.
I like the idea of not having an explicit close in the expansion at all. In most cases the refcounting will take care of it anyway (at least in CPython), and when there are multiple references you might actually want to not close. Code that needs it can add the explicit close itself by putting the yield-from in a try...finally or a with... block.
- Jacob
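A sketch of the "add the explicit close yourself" option, written in Python 3 syntax where `yield from` is available (an assumption; the thread discusses a Python 2 prototype). `LoggingIterator` and `delegating` are hypothetical names; `contextlib.closing` is the standard-library helper that calls `close()` on exit from the `with` block.

```python
from contextlib import closing


class LoggingIterator:
    """Hypothetical non-generator iterator that records close() calls."""

    def __init__(self, n, log):
        self._it = iter(range(n))
        self._log = log

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)

    def close(self):
        self._log.append("close")


def delegating(n, log):
    # The with block guarantees close() when the block is left,
    # independently of what the yield-from machinery itself does.
    with closing(LoggingIterator(n, log)) as sub:
        yield from sub
    log.append("done")


log = []
values = list(delegating(2, log))
```

Only the normal-exhaustion path is exercised here; the design point is that the caller, not the language construct, decides that this particular subiterator must be closed.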
Greg Ewing wrote:
I'm having trouble making up my mind how GeneratorExit should be handled.
My feeling is that GeneratorExit is a peculiarity of generators that other kinds of iterators shouldn't have to know about.
They don't, see below.
So, if you close() a generator, that shouldn't imply throwing GeneratorExit into the subiterator -- rather, the subiterator should simply be dropped and then the delegating generator finalized as usual.
If the subiterator happens to be another generator, dropping the last reference to it will cause it to be closed, in which case it will raise its own GeneratorExit.
This is only true in CPython, but that shouldn't be a problem. If you really need the subiterator to be closed at that point, wrapping the yield-from in the appropriate try...finally... or with... block will do the trick.
Other kinds of iterators can finalize themselves however they see fit, and don't need to pretend they're generators and understand GeneratorExit.
They don't have to understand GeneratorExit at all. As long as they know how to clean up after themselves when thrown an exception they cannot handle, things will just work. GeneratorExit is no different from SystemExit or KeyboardInterrupt in that regard.
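As a sketch of what "clean up when thrown an exception they cannot handle" could look like for a non-generator iterator, here is a hypothetical class in Python 3 syntax (an assumption; the thread uses Python 2). The class name, the `released` flag and the single-argument `throw` signature are all invented for the illustration.

```python
class ResourceIterator:
    """Hypothetical iterator that owns a resource and releases it on failure."""

    def __init__(self, items):
        self._items = iter(items)
        self.released = False

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._items)

    def throw(self, exc):
        # No special knowledge of GeneratorExit: release the resource,
        # then let the exception continue, whatever its type.
        self.released = True
        raise exc


outcomes = []
for exc in (GeneratorExit(), KeyboardInterrupt(), ValueError("boom")):
    it = ResourceIterator([1, 2, 3])
    next(it)
    try:
        it.throw(exc)
    except BaseException as caught:
        outcomes.append((type(caught).__name__, it.released))
```

The iterator treats all three exceptions identically, which is the sense in which GeneratorExit needs no special handling here.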
For consistency, this implies that a GeneratorExit explicitly thrown in using throw() shouldn't be forwarded to the subiterator either, even if it has a throw() method.
I agree that if close() doesn't throw the GeneratorExit to the subiterator, then throw() shouldn't either.
To do otherwise would require making a distinction that can't be expressed in the Python expansion. Also, it seems elegant to preserve the property that if g is a generator then g.close() and g.throw(GeneratorExit) are exactly equivalent.
Not exactly equivalent, but related in the simple way described in PEP 342.
What do people think about this?
If I understand you correctly, what you want can be described by the following expansion:

    _i = iter(EXPR)
    try:
        _u = _i.next()
        while 1:
            try:
                _v = yield _u
            except GeneratorExit:
                raise
            except BaseException, _e:
                _m = getattr(_i, 'throw', None)
                if _m is not None:
                    _u = _m(_e)
                else:
                    raise
            else:
                if _v is None:
                    _u = _i.next()
                else:
                    _u = _i.send(_v)
    except StopIteration, _e:
        RESULT = _e.value
    finally:
        _i = _u = _v = _e = _m = None
        del _i, _u, _v, _e, _m

(except for minor details like the possible method caching).

I like this version because it makes it easier to share subiterators if you need to. The explicit close in the earlier proposals meant that as soon as one generator delegating to the shared iterator was closed, the shared one would be as well. No, I don't have a concrete use case for this, but I think it is the least surprising behavior we could choose for closing shared subiterators. As mentioned above, you can still explicitly request that the subiterator be closed with the delegating generator by wrapping the yield-from in a try...finally... or with... block.

If I understand Nick correctly, he would like to drop the "except GeneratorExit: raise" part, and possibly change BaseException to Exception. I don't like the idea of just dropping the "except GeneratorExit: raise", as that brings us back in the situation where shared subiterators are less useful. If we also change BaseException to Exception, the only difference is that it will no longer be possible to throw exceptions like SystemExit and KeyboardInterrupt that don't inherit from Exception to a subiterator. Again, I don't have a concrete use case, but I think putting an arbitrary restriction like that in a language construct is a bad idea. One example where this would cause surprises is if you split part of a generator function (one that for one reason or another needs to handle these exceptions) into a separate generator and call it using yield from.
Throwing an exception to the refactored generator could then have different meaning than before the refactoring, and there would be no easy way to fix this. Just my 2 cents... - Jacob
Jacob Holm wrote:
If I understand Nick correctly, he would like to drop the "except GeneratorExit: raise" part, and possibly change BaseException to Exception. I don't like the idea of just dropping the "except GeneratorExit: raise", as that brings us back in the situation where shared subiterators are less useful. If we also change BaseException to Exception, the only difference is that it will no longer be possible to throw exceptions like SystemExit and KeyboardInterrupt that don't inherit from Exception to a subiterator.
Note that as of 2.6, GeneratorExit doesn't inherit from Exception either - it now inherits directly from BaseException, just like the other two terminal exceptions: Python 2.6+ (trunk:66863M, Oct 9 2008, 21:32:59)
BaseException.__subclasses__() [<type 'exceptions.Exception'>, <type 'exceptions.GeneratorExit'>, <type 'exceptions.SystemExit'>, <type 'exceptions.KeyboardInterrupt'>]
All I'm saying is that if GeneratorExit doesn't get passed down then neither should SystemExit nor KeyboardInterrupt, while if the latter two *do* get passed down, then so should GeneratorExit. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
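The hierarchy Nick quotes from 2.6 can still be checked directly in current Python 3. The following is a minimal sketch in Python 3 syntax (which postdates this thread); the names `terminal` and `direct_bases` are made up for illustration.

```python
# The three "terminal" exceptions derive directly from BaseException,
# so a plain `except Exception:` clause catches none of them.
terminal = (GeneratorExit, SystemExit, KeyboardInterrupt)

for exc_type in terminal:
    assert issubclass(exc_type, BaseException)
    assert not issubclass(exc_type, Exception)

# Map each exception name to the name of its single direct base class.
direct_bases = {t.__name__: t.__bases__[0].__name__ for t in terminal}
print(direct_bases)
```

This only shows that the three exceptions sit side by side in the hierarchy; whether yield-from should treat them alike is the question being argued here.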
Hi Nick Nick Coghlan wrote:
Jacob Holm wrote:
If I understand Nick correctly, he would like to drop the "except GeneratorExit: raise" part, and possibly change BaseException to Exception. I don't like the idea of just dropping the "except GeneratorExit: raise", as that brings us back in the situation where shared subiterators are less useful. If we also change BaseException to Exception, the only difference is that it will no longer be possible to throw exceptions like SystemExit and KeyboardInterrupt that don't inherit from Exception to a subiterator.
Note that as of 2.6, GeneratorExit doesn't inherit from Exception either - it now inherits directly from BaseException, just like the other two terminal exceptions:
I know this.
All I'm saying is that if GeneratorExit doesn't get passed down then neither should SystemExit nor KeyboardInterrupt, while if the latter two *do* get passed down, then so should GeneratorExit.
I also know this, and I disagree. You are saying that because they have the thing in common that they do *not* inherit from Exception we should treat them the same. This is like saying that anything that is not a shade of green should be treated as red, completely ignoring the possibility of other colors.

I like to see GeneratorExit handled as a special case by yield-from, because:

1. It already has a special meaning in generators as the exception raised in the generator when close is called.
2. It *enables* certain uses of yield-from that would require much more work to handle otherwise. I am thinking of the ability to have multiple generators yield from the same iterator. Being able to close one generator without closing the shared iterator seems like a good thing.
3. While the GeneratorExit is not propagated directly, its expected effect of finalizing the subiterator *is*. At least in CPython, and assuming the subiterator does its finalization in a __del__ method, and that the generator holds the only reference. If the subiterator is actually a generator, it will even look like the GeneratorExit was propagated, due to the PEP 342 definition of close.

I don't like the idea of only throwing exceptions that inherit from Exception to the subiterator, because it makes the following two generators behave differently when thrown a non-Exception exception.

    def generatorA():
        try:
            x = yield
        except BaseException, e:
            print type(e)
            raise

    def generatorB():
        return (yield from generatorA())

The PEP is clearly intended to make them act identically. Quoting from the PEP: "When the iterator is another generator, the effect is the same as if the body of the subgenerator were inlined at the point of the yield from expression". Treating only GeneratorExit special allows them to behave exactly the same (in CPython).
If you only propagate exceptions that inherit from Exception, you would have to write something like:

    def generatorC():
        g = generatorA()
        while 1:
            try:
                return (yield from g)
            except Exception:
                # This exception comes from g, so just reraise
                raise
            except BaseException, e:
                # this exception was not propagated by yield-from,
                # do it manually to get the same effect.
                yield g.throw(e)

I don't mind that the expansion as written in the PEP becomes very slightly more complicated, as long as it makes the code using it simpler to reason about. - Jacob
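For comparison, here is how the generatorA/generatorB pair behaves in current Python 3, where a non-Exception exception thrown into the delegating generator does reach the subgenerator. This is a sketch in Python 3 syntax, not the prototype under discussion; the snake_case names and the `log` list (used instead of printing) are assumptions made for illustration.

```python
def generator_a(log):
    try:
        yield
    except BaseException as e:
        log.append(type(e).__name__)   # record what the subgenerator saw
        raise

def generator_b(log):
    return (yield from generator_a(log))

log = []
g = generator_b(log)
next(g)                      # run to the bare yield inside generator_a
try:
    g.throw(KeyboardInterrupt())
except KeyboardInterrupt:
    pass                     # re-raised by generator_a, propagates out through generator_b
print(log)
```

The subgenerator sees the KeyboardInterrupt exactly as it would if its body were inlined, which is the property Jacob is arguing for.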
Nick Coghlan wrote:
All I'm saying is that if GeneratorExit doesn't get passed down then neither should SystemExit nor KeyboardInterrupt
That would violate the inlining principle, though. An inlined generator is going to get all exceptions regardless of what they inherit from.
, while if the latter two *do* get passed down, then so should GeneratorExit.
Whereas that would mean a shared subiterator would get prematurely finalized when closing the delegating generator. So there seems to be no choice about this -- we must pass on all exceptions except GeneratorExit, and we must *not* pass on GeneratorExit itself. -- Greg
Greg Ewing wrote:
My feeling is that GeneratorExit is a peculiarity of generators that other kinds of iterators shouldn't have to know about. So, if you close() a generator, that shouldn't imply throwing GeneratorExit into the subiterator -- [...]
It can only be "thrown into the subiterator" if the subiterator is a generator (i.e., has a throw method) -- in which case, it knows about GeneratorExit. So the hasattr(_i, 'throw') test already covers this case.
If the subiterator happens to be another generator, dropping the last reference to it will cause it to be closed, [...]
NO, NO, NO. Unless you are prepared to say that programs written to this spec are *not* expected to run on any other version of Python other than CPython. CPython is the *only* version with a reference counting collector. And encouraging Python programmers to rely on this invites trouble when they try to port to any other version of Python. I know. I've been there, and have the T-shirt. And it's not pretty. The errors that you get when your finally clauses and context managers aren't run can be quite mysterious. And God help that person if they haven't slept with PEP 342 under their pillow!
Other kinds of iterators can finalize themselves however they see fit, and don't need to pretend they're generators and understand GeneratorExit.
Your PEP currently does not demand that other iterators "pretend they're generators and understand GeneratorExit". Non-generator iterators don't have throw or close methods and will remain blissfully ignorant of these finer points as the PEP stands now. So this is not a problem.
For consistency, this implies that a GeneratorExit explicitly thrown in using throw() shouldn't be forwarded to the subiterator either, even if it has a throw() method.
To do otherwise would require making a distinction that can't be expressed in the Python expansion. Also, it seems elegant to preserve the property that if g is a generator then g.close() and g.throw(GeneratorExit) are exactly equivalent.
Yes, g.close and g.throw(GeneratorExit) are equivalent. So you should be able to translate a close into a throwing GeneratorExit or vice versa. But if the subiterator doesn't have the first method that you look for (let's say you pick throw), then you should call the other method (if it has that one instead).
Finally, on your previous post, you say:
It would also avoid the problem of a partially exhausted iterator that's still in use by something else getting prematurely finalized, which is another thing that's been bothering me.
This is a valid point. But consider:
1. The delegating generator has no way to stop the subgenerator prematurely when it uses the yield from. So the yield from can only be stopped prematurely by the delegating generator's caller. And then the subgenerator would have to be communicated between the caller to the delegating generator somehow (e.g, passed in as a parameter) so that the caller could continue to use it. (And the subgenerator has to be a generator, not a plain iterator). Though possible, this kind of a use case would be used very rarely compared to the use case of the yield from being the final place the subgenerator is used.

2. If finalization of the subgenerator needs to be prevented, it can be wrapped in a plain iterator wrapper that doesn't define throw or close.

    class no_finalize:
        def __init__(self, gen):
            self.gen = gen
        def __iter__(self):
            return self
        def __next__(self):
            return next(self.gen)
        def send(self, x):
            return self.gen.send(x)

    g = subgen(...)
    yield from no_finalize(g)
    ... use g

As I see it, you are faced with two options:

1. Define "yield from" in a way that it will work the same in all implementations of Python and will work for the 98% use case without any extra boilerplate code, and only require extra boilerplate (as above) for the 2% use case.

or

2. Define "yield from" in a way that will have quite different behavior (for reasons very obscure to most programmers) on the different implementations of Python (due to the different implementation of garbage collectors), require boilerplate code to be portable for the 98% use case (e.g., adding a "with closing(subgen())" around the yield from); but not require any boilerplate code for portability in the 2% use case.

The only argument I can think in favor of option 2, is that's what the "for" statement ended up with. But that was only because changing the "for" statement to option 1 would break the legacy 2% use cases...

IMHO option 1 is the better choice. -bruce frederiksen
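The wrapper idea can be tried out in current Python 3, where closing a delegating generator calls close() on the subiterator only if it has one. What follows is a sketch under that assumption; `NoFinalize`, `subgen` and `delegator` are illustrative names, with `NoFinalize` being a Python 3 spelling of the no_finalize class above.

```python
class NoFinalize:
    """Iterator wrapper exposing only __next__ and send (no throw, no close)."""
    def __init__(self, gen):
        self.gen = gen
    def __iter__(self):
        return self
    def __next__(self):
        return next(self.gen)
    def send(self, x):
        return self.gen.send(x)

def subgen():
    yield 1
    yield 2
    yield 3

def delegator(it):
    yield from it

g = subgen()
d = delegator(NoFinalize(g))
first = next(d)      # value produced by g, passed through the wrapper
d.close()            # the wrapper has no close(), so g is left untouched
rest = list(g)       # the caller can keep using g afterwards
print(first, rest)
```

The cost Jacob raises later in the thread still applies: every next() and send() goes through an extra Python-level call.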
Bruce Frederiksen wrote:
Greg Ewing wrote: [...]
If the subiterator happens to be another generator, dropping the last reference to it will cause it to be closed, [...]
NO, NO, NO. Unless you are prepared to say that programs written to this spec are *not* expected to run on any other version of Python other than CPython. CPython is the *only* version with a reference counting collector. And encouraging Python programmers to rely on this invites trouble when they try to port to any other version of Python. I know. I've been there, and have the T-shirt. And it's not pretty. The errors that you get when your finally clauses and context managers aren't run can be quite mysterious. And God help that person if they haven't slept with PEP 342 under their pillow!
Ok, got it. Relying on refcounting is bad.
[...]
It would also avoid the problem of a partially exhausted iterator that's still in use by something else getting prematurely finalized, which is another thing that's been bothering me.
This is a valid point. But consider:
1. The delegating generator has no way to stop the subgenerator prematurely when it uses the yield from. So the yield from can only be stopped prematurely by the delegating generator's caller. And then the subgenerator would have to be communicated between the caller to the delegating generator somehow (e.g, passed in as a parameter) so that the caller could continue to use it. (And the subgenerator has to be a generator, not a plain iterator).
"...subgenerator has to be a generator" is not entirely true. For example, if the subiterator doesn't have send, you can send a non-None value to the generator and that will raise an AttributeError at the yield from. If it doesn't have throw, you can even throw a StopIteration with a value to get that value as the result of the yield-from expression, which might be useful in a twisted sort of way. In both cases, the subiterator will only be closed if the yield-from expression actually closes it. So it is definitely possible to get a non-generator prematurely finalized.
Though possible, this kind of a use case would be used very rarely compared to the use case of the yield from being the final place the subgenerator is used.
That I agree with.
2. If finalization of the subgenerator needs to be prevented, it can be wrapped in a plain iterator wrapper that doesn't define throw or close.
    class no_finalize:
        def __init__(self, gen):
            self.gen = gen
        def __iter__(self):
            return self
        def __next__(self):
            return next(self.gen)
        def send(self, x):
            return self.gen.send(x)

    g = subgen(...)
    yield from no_finalize(g)
    ... use g
Well, if the subiterator is a generator that itself uses yield-from, the need to wrap it would destroy all possible speed benefits of using yield-from. So if there *is* a valid use case for yielding from a shared generator, this is not really a solution unless you don't care about speed.
As I see it, you are faced with two options:
1. Define "yield from" in a way that it will work the same in all implementations of Python and will work for the 98% use case without any extra boilerplate code, and only require extra boilerplate (as above) for the 2% use case. or
I can live with that. This essentially means using the expansion in the PEP (with "except Exception, _e" replaced by "except BaseException, _e", to get the inlining property we all want). The decision to use explicit close will make what could have been a 2% use case much less attractive. Note that with explicit close, my argument for special-casing GeneratorExit by adding "except GeneratorExit: raise" weakens. The GeneratorExit will be delegated to the deepest generator/iterator with a throw method. As long as the iterators don't swallow the exception, they will be closed from the finally clause in the expansion. If one of them *does* swallow the exception, the outermost generator will raise a RuntimeError. The only difference that special-casing GeneratorExit would make is that 1) if the final iterator is not a generator, it won't see a GeneratorExit, and 2) if one of the iterators swallow the exception, the rest would still be closed and you might get a better traceback for the RuntimeError.
2. Define "yield from" in a way that will have quite different behavior (for reasons very obscure to most programmers) on the different implementations of Python (due to the different implementation of garbage collectors), require boilerplate code to be portable for the 98% use case (e.g., adding a "with closing(subgen())" around the yield from); but not require any boilerplate code for portability in the 2% use case.
The only argument I can think in favor of option 2, is that's what the "for" statement ended up with. But that was only because changing the "for" statement to option 1 would break the legacy 2% use cases...
There is also the question of speed as mentioned above, but that argument is not all that strong...
IMHO option 1 is the better choice.
If relying on refcounting is as bad as you say, then I agree. - Jacob
Jacob Holm wrote:
Bruce Frederiksen wrote:
This is a valid point. But consider:
1. The delegating generator has no way to stop the subgenerator prematurely when it uses the yield from. So the yield from can only be stopped prematurely by the delegating generator's caller. And then the subgenerator would have to be communicated between the caller to the delegating generator somehow (e.g, passed in as a parameter) so that the caller could continue to use it. (And the subgenerator has to be a generator, not a plain iterator).
"...subgenerator has to be a generator" is not entirely true. For example, if the subiterator doesn't have send, you can send a non-None value to the generator and that will raise an AttributeError at the yield from. If it doesn't have throw, you can even throw a StopIteration with a value to get that value as the result of the yield-from expression, which might be useful in a twisted sort of way. In both cases, the subiterator will only be closed if the yield-from expression actually closes it. So it is definitely possible to get a non-generator prematurely finalized.
But non-generators don't have a close (or throw) method. They lack the concept of "finalization". Only generators have these extra methods. So using a subiterator in yield from isn't an issue here. (Or am I missing something)?
Well, if the subiterator is a generator that itself uses yield-from, the need to wrap it would destroy all possible speed benefits of using yield-from. So if there *is* a valid use case for yielding from a shared generator, this is not really a solution unless you don't care about speed.
Yes, there is a performance penalty in this case. If the wrapper were written in C, then I would think that the penalty would be negligible. Perhaps offer a C wrapper in a standard library??
Note that with explicit close, my argument for special-casing GeneratorExit by adding "except GeneratorExit: raise" weakens. The GeneratorExit will be delegated to the deepest generator/iterator with a throw method. As long as the iterators don't swallow the exception, they will be closed from the finally clause in the expansion. If one of them *does* swallow the exception, the outermost generator will raise a RuntimeError.
Another case where close differs from throw(GeneratorExit). Close is defined in PEP 342 to raise RuntimeError if GeneratorExit is swallowed. Should the delegating generator, then, be calling close rather than throw for GeneratorExit so that the RuntimeError is raised closer to the cause of the exception? Or does this violate the "inlining" goal of the current PEP?
-bruce frederiksen
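The difference between close() and throw(GeneratorExit) that Bruce points to can be seen on a single generator, with no delegation involved. This is a minimal sketch in Python 3 syntax; `stubborn` is a made-up generator that deliberately swallows the exit request.

```python
def stubborn():
    try:
        yield 1
    except GeneratorExit:
        pass          # swallow the exit request...
    yield 2           # ...and keep yielding

# throw(GeneratorExit) is an ordinary throw: it returns the next yielded value.
g = stubborn()
next(g)
thrown_result = g.throw(GeneratorExit())

# close() insists that the generator actually finishes.
g2 = stubborn()
next(g2)
try:
    g2.close()
    close_error = None
except RuntimeError as e:
    close_error = e

print(thrown_result, type(close_error).__name__)
```

So the two calls are related as PEP 342 describes, but they stop being interchangeable as soon as a generator ignores GeneratorExit.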
Bruce Frederiksen wrote:
But non-generators don't have a close (or throw) method. They lack the concept of "finalization".
Any object could require explicit finalization in the absence of refcounting, so "close" isn't peculiar to generators.
Should the delegating generator, then, be calling close rather than throw for GeneratorExit so that the RuntimeError is raised closer to cause of the exception? Or does this violate the "inlining" goal of the current PEP?
Yes, it would violate the inlining principle. -- Greg
We have a decision to make. It appears we can have *one* of the following, but not both: (1) In non-refcounting implementations, subiterators are finalized promptly when the delegating generator is explicitly closed. (2) Subiterators are not prematurely finalized when other references to them exist. Since in the majority of intended use cases the subiterator won't be shared, (1) seems like the more important guarantee to uphold. Does anyone disagree with that? Guido, what do you think? -- Greg
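To make the trade-off concrete, here is what guarantee (1) looks like in current Python 3, where closing a delegating generator also closes a generator it is delegating to. It is a sketch only; `shared` and `delegate` are illustrative names, and the scenario of two delegators sharing one subgenerator is exactly the uncommon case under debate.

```python
def shared():
    for i in range(5):
        yield i

def delegate(sub):
    yield from sub

s = shared()
a = delegate(s)
b = delegate(s)

got_a = next(a)     # both delegators pull from the same underlying generator
got_b = next(b)
a.close()           # closing one delegator also closes the shared generator
leftover = list(b)  # so the other delegator finds it already exhausted
print(got_a, got_b, leftover)
```

The remaining values 2, 3 and 4 are never delivered to `b`: prompt finalization for `a` comes at the price of premature finalization for every other user of `s`.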
Greg Ewing wrote:
We have a decision to make. It appears we can have *one* of the following, but not both:
(1) In non-refcounting implementations, subiterators are finalized promptly when the delegating generator is explicitly closed.
(2) Subiterators are not prematurely finalized when other references to them exist.
Since in the majority of intended use cases the subiterator won't be shared, (1) seems like the more important guarantee to uphold. Does anyone disagree with that?
If you choose (2), then (1) is trivial to implement in code that uses the new expression in combination with existing support for deterministic finalisation. For example:

    with contextlib.closing(make_subiter()) as subiter:
        yield from subiter

On the other hand, if you choose (1), then it is impossible to use that construct in combination with any other existing constructs to avoid finalisation - you have to write out the equivalent code from the PEP by hand, leaving out the finalisation parts.

So I think dropping the implicit finalisation is the better option - it simplifies the new construct, and plays well with explicit finalisation when that is what people want.

However, I would also recommend *not* special casing GeneratorExit in that case: just pass it down using throw. Note that non-generator iterators that want "throw" to mean the same thing as "close" can do that easily enough:

    def throw(self, *args):
        self.close()
        reraise(*args)

(reraise itself would just do the dance to check how many arguments there were and use the appropriate form of "raise" to reraise the exception)

Hmm, that does suggest another issue with the PEP however: it only calls the subiterator's throw with the value of the thrown in exception. It should be using the 3 argument form to avoid losing any passed in traceback information.

Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
Nick Coghlan wrote:
Greg Ewing wrote:
(1) In non-refcounting implementations, subiterators are finalized promptly when the delegating generator is explicitly closed.
(2) Subiterators are not prematurely finalized when other references to them exist.
If you choose (2), then (1) is trivial to implement
    with contextlib.closing(make_subiter()) as subiter:
        yield from subiter
That's a fairly horrendous thing to expect people to write around all their yield-froms, though. It also means we would have to say that the inlining principle only holds for refcounting implementations. Maybe we should just give up trying to accommodate shared subiterators. Is it worth complicating everything for the sake of something that's not really part of the intended set of use cases?
Hmm, that does suggest another issue with the PEP however: it only calls the subiterator's throw with the value of the thrown in exception. It should be using the 3 argument form to avoid losing any passed in traceback information.
Good point, I'll update the expansion accordingly. -- Greg
Greg Ewing wrote:
Nick Coghlan wrote:
Greg Ewing wrote:
(1) In non-refcounting implementations, subiterators are finalized promptly when the delegating generator is explicitly closed.
(2) Subiterators are not prematurely finalized when other references to them exist.
If you choose (2), then (1) is trivial to implement
    with contextlib.closing(make_subiter()) as subiter:
        yield from subiter
That's a fairly horrendous thing to expect people to write around all their yield-froms, though. It also means we would have to say that the inlining principle only holds for refcounting implementations.
Maybe we should just give up trying to accommodate shared subiterators. Is it worth complicating everything for the sake of something that's not really part of the intended set of use cases?
Consider what happens if you replace the 'yield from' with the basic form of iterator delegation that exists now:

    for x in make_subiter():
        yield x

Is such code wrong in any way? No it isn't. Failing to finalise the object of iteration is the *normal* case. If for some reason it is important in a given application to finalise it properly (e.g. the subiter opens a database connection or file and we want to ensure they are closed promptly no matter what else happens), only *then* does deterministic finalisation come into play:

    with closing(make_subiter()) as subiter:
        for x in subiter:
            yield x

That is, I now believe the 'normal' case for 'yield from' should be modelled on basic iteration, which means no implicit finalisation.

Now, keep in mind that in parallel with this I am now saying that *all* exceptions, *including GeneratorExit* should be passed down to the subiterator if it has a throw() method. So even without implicit finalisation you can use "yield from" to nest generators to your heart's content and an explicit close on the outermost generator will be passed down to the innermost generator and unwind the generator stack from there.
Using your "no finally clause" version from earlier in this thread as the base for the exact semantic description:

    _i = iter(EXPR)
    try:
        _u = _i.next()
    except StopIteration, _e:
        _r = _e.value
    else:
        while 1:
            try:
                _v = yield _u
            except BaseException, _e:
                _m = getattr(_i, 'throw', None)
                if _m is not None:
                    _u = _m(_e)
                else:
                    raise
            else:
                try:
                    if _v is None:
                        _u = _i.next()
                    else:
                        _u = _i.send(_v)
                except StopIteration, _e:
                    _r = _e.value
                    break
    RESULT = _r

With an expansion of that form, you can easily make arbitrary iterators (including generators) shareable by wrapping them in an iterator with no throw or send methods:

    class ShareableIterator(object):
        def __init__(self, itr):
            self.itr = itr
        def __iter__(self):
            return self
        def __next__(self):
            return self.itr.next()
        next = __next__ # Be 2.x friendly
        def close(self):
            # Still support explicit finalisation of the
            # shared iterator, just not throw() or send()
            try:
                close_itr = self.itr.close
            except AttributeError:
                pass
            else:
                close_itr()

    # Decorator to use the above on a generator function
    def shareable(g):
        @functools.wraps(g)
        def wrapper(*args, **kwds):
            return ShareableIterator(g(*args, **kwds))
        return wrapper

Iterators that need finalisation can either make themselves implicitly closable in yield from expressions by defining a throw() method that delegates to close() and then reraises the exception appropriately, or else they can recommend explicit closure regardless of the means of iteration (be it a for loop, a generator expression or container comprehension, manual iteration or the new yield from expression).

Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
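Nick's comparison between plain for-loop delegation and explicit finalisation can be run as-is in Python 3. The sketch below uses a hypothetical `Resource` iterator that only records whether close() was called; `plain` and `guarded` are likewise illustrative names.

```python
import contextlib

class Resource:
    """Endless counter that remembers whether close() was called."""
    def __init__(self):
        self.closed = False
        self.n = 0
    def __iter__(self):
        return self
    def __next__(self):
        self.n += 1
        return self.n
    def close(self):
        self.closed = True

def plain(res):
    # Basic delegation: nothing here ever calls res.close().
    for x in res:
        yield x

def guarded(res):
    # Deterministic finalisation: close() runs when the with block exits.
    with contextlib.closing(res) as r:
        for x in r:
            yield x

r1, r2 = Resource(), Resource()
g1, g2 = plain(r1), guarded(r2)
next(g1)
next(g2)
g1.close()
g2.close()
print(r1.closed, r2.closed)
```

Closing `g2` raises GeneratorExit inside the with block, so `contextlib.closing` finalises `r2`, while `r1` stays open because nothing in `plain` asks for it to be closed.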
Nick Coghlan wrote:
[snip arguments for modelling on basic iteration] That is, I now believe the 'normal' case for 'yield from' should be modelled on basic iteration, which means no implicit finalisation.
Now, keep in mind that in parallel with this I am now saying that *all* exceptions, *including GeneratorExit* should be passed down to the subiterator if it has a throw() method.
I still think that is less useful than catching it and just dropping the reference, see below.
So even without implicit finalisation you can use "yield from" to nest generators to your heart's content and an explicit close on the outermost generator will be passed down to the innermost generator and unwind the generator stack from there.
The same would happen with the *implicit* close caused by the last reference to the outermost generator going away. Delegating the GeneratorExit is a sure way to premature finalization when using shared generators, but only in a refcounting implementation like CPython. That makes this the only feature I know of that would be *more* useful in a non-refcounting implementation.
Using your "no finally clause" version from earlier in this thread as the base for the exact semantic description:
    _i = iter(EXPR)
    try:
        _u = _i.next()
    except StopIteration, _e:
        _r = _e.value
    else:
        while 1:
            try:
                _v = yield _u
            except BaseException, _e:
                _m = getattr(_i, 'throw', None)
                if _m is not None:
                    _u = _m(_e)
                else:
                    raise
            else:
                try:
                    if _v is None:
                        _u = _i.next()
                    else:
                        _u = _i.send(_v)
                except StopIteration, _e:
                    _r = _e.value
                    break
    RESULT = _r
I know I didn't comment on that expansion earlier, but should have. It fails to handle the case where the throw raises a StopIteration (or there is no throw method and the thrown exception is a StopIteration). You need something like:

    _i = iter(EXPR)
    try:
        _u = _i.next()
        while 1:
            try:
                _v = yield _u
            # except GeneratorExit:
            #     raise
            except BaseException:
                _m = getattr(_i, 'throw', None)
                if _m is not None:
                    _u = _m(*sys.exc_info())
                else:
                    raise
            else:
                if _v is None:
                    _u = _i.next()
                else:
                    _u = _i.send(_v)
    except StopIteration, _e:
        RESULT = _e.value
    finally:
        _i = _u = _v = _e = _m = None
        del _i, _u, _v, _e, _m

This is independent of the GeneratorExit issue, but I put it in there as a comment just to make it clear what *I* think it should be if we are not putting a close in the finally clause. If we *do* put a call to close in the finally clause, the premature finalization of shared generators is guaranteed anyway, so there is not much point in special-casing GeneratorExit.
With an expansion of that form, you can easily make arbitrary iterators (including generators) shareable by wrapping them in an iterator with no throw or send methods:
    class ShareableIterator(object):
        def __init__(self, itr):
            self.itr = itr
        def __iter__(self):
            return self
        def __next__(self):
            return self.itr.next()
        next = __next__ # Be 2.x friendly
        def close(self):
            # Still support explicit finalisation of the
            # shared iterator, just not throw() or send()
            try:
                close_itr = self.itr.close
            except AttributeError:
                pass
            else:
                close_itr()

    # Decorator to use the above on a generator function
    def shareable(g):
        @functools.wraps(g)
        def wrapper(*args, **kwds):
            return ShareableIterator(g(*args, **kwds))
        return wrapper
With this wrapper, you will not be able to throw *any* exceptions to the shared iterator. Even if you fix the wrapper to pass through all other exceptions than GeneratorExit, you will still completely lose the speed benefits of yield-from when doing so. (For next, send, and throw it is possible to completely bypass all the intervening generators, so the call overhead becomes independent of the number of generators in the yield-from chain. I have a patch that does exactly this, working except for details related to this discussion). It is not possible to write such a wrapper efficiently without making it a builtin and special-casing it in the yield-from implementation, and I don't think that is a good idea.
Iterators that need finalisation can either make themselves implicitly closable in yield from expressions by defining a throw() method that delegates to close() and then reraises the exception appropriately, or else they can recommend explicit closure regardless of the means of iteration (be it a for loop, a generator expression or container comprehension, manual iteration or the new yield from expression).
A generator or iterator that needs closing should recommend explicit closing *anyway* to work correctly in other contexts on platforms other than CPython. Not delegating GeneratorExit just happens to make it much simpler and faster to use shared generators/iterators that *don't* need immediate finalization. In CPython you even get the finalization for free due to the refcounting, but of course relying on that is generally considered a bad idea. - Jacob
Jacob Holm wrote:
It fails to handle the case where the throw raises a StopIteration (or there is no throw method and the thrown exception is a StopIteration).
No, I think it does the right thing in that case. By the inlining principle, the StopIteration should be thrown in like anything else, and if it propagates back out, it should stop the delegating generator, *not* the subiterator. -- Greg
Greg Ewing wrote:
Jacob Holm wrote:
It fails to handle the case where the throw raises a StopIteration (or there is no throw method and the thrown exception is a StopIteration).
No, I think it does the right thing in that case. By the inlining principle, the StopIteration should be thrown in like anything else, and if it propagates back out, it should stop the delegating generator, *not* the subiterator.
But if you throw another exception and it is converted to a StopIteration by the subiterator, this should definitely stop the subiterator and get a return value. Or? - Jacob
Jacob Holm wrote:
But if you throw another exception and it is converted to a StopIteration by the subiterator, this should definitely stop the subiterator and get a return value.
Not if it simply raises a StopIteration from the throw call. It would have to mark itself as completed, return normally from the throw and then raise StopIteration on the next call to next() or send(). -- Greg
Greg Ewing wrote:
Jacob Holm wrote:
But if you throw another exception and it is converted to a StopIteration by the subiterator, this should definitely stop the subiterator and get a return value.
Not if it simply raises a StopIteration from the throw call. It would have to mark itself as completed, return normally from the throw and then raise StopIteration on the next call to next() or send().
One of us must be missing something... If the subiterator is exhausted before the throw, there won't *be* a value to return from the call so the only options for the throw method are to raise StopIteration, or to raise some other exception. Example:

    def inner():
        try:
            yield 1
        except ValueError:
            pass
        return 2

    def outer():
        v = yield from inner()
        yield v

    g = outer()
    print g.next()             # prints 1
    print g.throw(ValueError)  # prints 2

In your expansion, the StopIteration raised by inner escapes the outer generator as well, so we get a StopIteration instead of the second print that I would expect. Can you explain in a little more detail how the inlining argument makes you want to not catch a StopIteration escaping from throw? - Jacob
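For readers who want to try this example today, the following is the same code in Python 3 syntax. It is only a sketch of how the example behaves under the semantics that were eventually adopted in PEP 380, where a StopIteration raised by the subiterator's throw() supplies the value of the yield from expression; the variable names first and second are added here for illustration.

```python
def inner():
    try:
        yield 1
    except ValueError:
        pass  # swallow the thrown-in exception and fall through to return
    return 2


def outer():
    v = yield from inner()
    yield v


g = outer()
first = next(g)               # value yielded by inner via outer
second = g.throw(ValueError)  # inner returns 2, outer then yields it
print(first, second)
```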
Jacob Holm wrote:
Greg Ewing wrote:
Jacob Holm wrote:
But if you throw another exception and it is converted to a StopIteration by the subiterator, this should definitely stop the subiterator and get a return value.
Not if it simply raises a StopIteration from the throw call. It would have to mark itself as completed, return normally from the throw and then raise StopIteration on the next call to next() or send().
One of us must be missing something... If the subiterator is exhausted before the throw, there won't *be* a value to return from the call so the only options for the throw method are to raise StopIteration, or to raise some other exception.
I agree with Jacob here - contextlib.contextmanager contains a similar check in its __exit__ method. The thing to check for is the throw method call raising StopIteration and that StopIteration instance being a *different* exception from the one that was thrown in. (This matters more in the contextmanager case, since it is quite legitimate for a generator to finish and raise StopIteration from inside a with statement, so the contextmanager needs to avoid accidentally suppressing that exception).

Avoiding the problem of suppressing thrown in StopIteration instances means we still need multiple inner try/except blocks rather than a large outer one.

There is also another special case to consider: since a permitted response to "throw(GeneratorExit)" is for the iterator to just terminate instead of reraising GeneratorExit, the thrown in exception should be reraised unconditionally in that situation.

So the semantics would then become:

    _i = iter(EXPR)
    try:
        _u = _i.next()
    except StopIteration as _e:
        _r = _e.value
    else:
        while 1:
            try:
                _v = yield _u
            except:
                _m = getattr(_i, 'throw', None)
                if _m is not None:
                    _et, _ev, _tb = sys.exc_info()
                    try:
                        _u = _m(_et, _ev, _tb)
                    except StopIteration as _e:
                        if _e is _ev or _et is GeneratorExit:
                            # Don't suppress a thrown in
                            # StopIteration and handle the
                            # case where a subiterator
                            # handles GeneratorExit by
                            # terminating rather than
                            # reraising the exception
                            raise
                        # The thrown in exception
                        # terminated the iterator
                        # gracefully
                        _r = _e.value
                else:
                    raise
            else:
                try:
                    if _v is None:
                        _u = _i.next()
                    else:
                        _u = _i.send(_v)
                except StopIteration as _e:
                    _r = _e.value
                    break
    RESULT = _r

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
Jacob Holm wrote:
Can you explain in a little more detail how the inlining argument makes you want to not catch a StopIteration escaping from throw?
It's easier to see if we use an example that doesn't involve a return value, since it's clearer what "inlining" means in that case.

    def inner():
        try:
            yield 1
        except ValueError:
            pass

    def outer():
        print "About to yield from inner"
        yield from inner()
        print "Finished yielding from inner"

Now if we inline that, we get:

    def outer_and_inner():
        print "About to yield from inner"
        try:
            yield 1
        except ValueError:
            pass
        print "Finished yielding from inner"

What would you expect that to do if you throw StopIteration into it while it's suspended at the yield?

However, thinking about the return value case has made me realize that it's not so obvious what "inlining" means then. To get the return value in your example, one way would be to perform the inlining like this:

    def outer():
        try:
            try:
                yield 1
            except ValueError:
                pass
            raise StopIteration(2)
        except StopIteration, e:
            v = e.value
        yield v

which results in the behaviour you are expecting. However, if you were inlining an ordinary function, that's not how you would handle a return value -- rather, you'd just replace the return by a statement that assigns the return value to wherever it needs to go. Using that strategy, we get

    def outer():
        try:
            yield 1
        except ValueError:
            pass
        v = 2
        yield v

That's closer to what I have in mind when I talk about "inlining" in the PEP. I realize that this is probably not exactly what the current expansion specifies. I'm working on a new one to fix issues like this. -- Greg
Greg Ewing wrote:
Jacob Holm wrote:
Can you explain in a little more detail how the inlining argument makes you want to not catch a StopIteration escaping from throw?
[snip explanation]
Thank you very much for the clear explanation. It seems each of us was missing something. AFAICT your latest expansion (reproduced below) fixes this.
I have a few (final, I hope) nits to pick about the finally clause. To start with there is no need for a separate "try". Just adding the finally clause to the next try..except..else has the exact same semantics.

Then there are the contents of the finally clause. It is either too much or too little, depending on what it is you are trying to specify. If the intent is to show that the last reference from the expansion to _i disappears here, it fails because _m is likely to hold a reference as well. In any case I don't see a reason to single out _i for deletion. I suggest just dropping the finally clause altogether to make it clear that we are not promising any finalization beyond what is explicit in the rest of the code. - Jacob

------------------------------------------------------------------------

    _i = iter(EXPR)
    try:
        try:
            _y = _i.next()
        except StopIteration, _e:
            _r = _e.value
        else:
            while 1:
                try:
                    _s = yield _y
                except:
                    _m = getattr(_i, 'throw', None)
                    if _m is not None:
                        _x = sys.exc_info()
                        try:
                            _y = _m(*_x)
                        except StopIteration, _e:
                            if _e is _x[1]:
                                raise
                            else:
                                _r = _e.value
                                break
                    else:
                        _m = getattr(_i, 'close', None)
                        if _m is not None:
                            _m()
                        raise
                else:
                    try:
                        if _s is None:
                            _y = _i.next()
                        else:
                            _y = _i.send(_s)
                    except StopIteration, _e:
                        _r = _e.value
                        break
    finally:
        del _i
    RESULT = _r
Jacob Holm wrote:
Just adding the finally clause to the next try..except..else has the exact same semantics.
True -- I haven't quite got used to the idea that you can do that yet!
In any case I don't see a reason to single out _i for deletion.
That part seems to be a hangover from an earlier version. You're probably right that it can go. -- Greg
Hi Greg

There seems to be another issue with GeneratorExit in the latest expansion (reproduced below). Based on the inlining/refactoring principle, I would expect the following code:

    def inner():
        try:
            yield 1
            yield 2
            yield 3
        except GeneratorExit:
            val = 'closed'
        else:
            val = 'exhausted'
        return val.upper()

    def outer():
        val = yield from inner()
        print val

To be equivalent to this:

    def outer():
        try:
            yield 1
            yield 2
            yield 3
        except GeneratorExit:
            val = 'closed'
        else:
            val = 'exhausted'
        val = val.upper()
        print val

However, with the current expansion they are different. Only the version not using "yield from" will print "CLOSED" in this case:

    g = outer()
    g.next()   # prints 1
    g.close()  # should print "CLOSED", but doesn't because the GeneratorExit is reraised by yield-from

I currently don't think that a special case for GeneratorExit is needed. Can you give me an example showing that it is? - Jacob

------------------------------------------------------------------------

    _i = iter(EXPR)
    try:
        _y = _i.next()
    except StopIteration, _e:
        _r = _e.value
    else:
        while 1:
            try:
                _s = yield _y
            except:
                _m = getattr(_i, 'throw', None)
                if _m is not None:
                    _x = sys.exc_info()
                    try:
                        _y = _m(*_x)
                    except StopIteration, _e:
                        if _e is _x[1] or isinstance(_x[1], GeneratorExit):
                            raise
                        else:
                            _r = _e.value
                            break
                else:
                    _m = getattr(_i, 'close', None)
                    if _m is not None:
                        _m()
                    raise
            else:
                try:
                    if _s is None:
                        _y = _i.next()
                    else:
                        _y = _i.send(_s)
                except StopIteration, _e:
                    _r = _e.value
                    break
    RESULT = _r
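The difference described in this message can be observed directly with a small Python 3 sketch. It runs against the yield from semantics that eventually shipped in PEP 380, which also re-raise GeneratorExit in the delegating generator, so treat it as an illustration of the discrepancy and not as a test of the expansion quoted above. The log list and the names outer_delegating and outer_inlined are hypothetical; appending to the log stands in for the print statement.

```python
log = []


def inner():
    try:
        yield 1
        yield 2
        yield 3
    except GeneratorExit:
        val = 'closed'
    else:
        val = 'exhausted'
    return val.upper()


def outer_delegating():
    val = yield from inner()
    log.append(('delegating', val))  # skipped when close() is called


def outer_inlined():
    try:
        yield 1
        yield 2
        yield 3
    except GeneratorExit:
        val = 'closed'
    else:
        val = 'exhausted'
    val = val.upper()
    log.append(('inlined', val))


for make in (outer_delegating, outer_inlined):
    g = make()
    next(g)
    g.close()

print(log)
```

Only the hand-inlined version records 'CLOSED'; in the delegating version the GeneratorExit is re-raised at the yield from expression and the code after it never runs.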
However, with the current expansion they are different. Only the version not using "yield from" will print "CLOSED" in this case:
    g = outer()
    g.next()   # prints 1
    g.close()  # should print "CLOSED", but doesn't because the GeneratorExit is reraised by yield-from
I currently don't think that a special case for GeneratorExit is needed. Can you give me an example showing that it is?
Take your example, replace the "print val" with a "yield val" and you get a broken generator that will yield again when close() is called. Generators that catch and do anything with GeneratorExit other than turn it into StopIteration are almost always going to be broken - the new expression needs to avoid making it easy to do that accidentally. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
Nick Coghlan wrote:
Generators that catch and do anything with GeneratorExit other than turn it into StopIteration are almost always going to be broken - the new expression needs to avoid making it easy to do that accidentally.
However, as this example shows, the suggested solution of reraising GeneratorExit is not viable because it violates the inlining principle. The basic problem is that there's no way of telling the difference between a StopIteration that means "it's okay, I've finalized myself" and "I really mean to return normally here". -- Greg
While attempting to update the PEP to incorporate a GeneratorReturn exception, I've thought of a potential difficulty in making the exception type depend on whether the return statement had a value. Currently the StopIteration exception is created after the return statement has unwound the stack frame, by which time we've lost track of whether it had an expression. -- Greg
Greg Ewing wrote:
While attempting to update the PEP to incorporate a GeneratorReturn exception, I've thought of a potential difficulty in making the exception type depend on whether the return statement had a value.
Currently the StopIteration exception is created after the return statement has unwound the stack frame, by which time we've lost track of whether it had an expression.
Does it become easier if "return None" raises StopIteration instead of raising GeneratorReturn(None)? I think I'd prefer that to having to perform major surgery on the eval loop to make it do something else... (Guido may have other ideas, obviously). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
On Sat, Mar 28, 2009 at 8:00 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Greg Ewing wrote:
While attempting to update the PEP to incorporate a GeneratorReturn exception, I've thought of a potential difficulty in making the exception type depend on whether the return statement had a value.
Currently the StopIteration exception is created after the return statement has unwound the stack frame, by which time we've lost track of whether it had an expression.
Does it become easier if "return None" raises StopIteration instead of raising GeneratorReturn(None)?
I think I'd prefer that to having to perform major surgery on the eval loop to make it do something else... (Guido may have other ideas, obviously).
I think my first response on this (yesterday?) already mentioned that I didn't mind so much whether "return None" was treated more like "return" or more like "return <value>". So please do whatever can be implemented easily. -- --Guido van Rossum (home page: http://www.python.org/~guido/)
Greg Ewing wrote:
Nick Coghlan wrote:
Generators that catch and do anything with GeneratorExit other than turn it into StopIteration are almost always going to be broken - the new expression needs to avoid making it easy to do that accidentally.
However, as this example shows, the suggested solution of reraising GeneratorExit is not viable because it violates the inlining principle.
The basic problem is that there's no way of telling the difference between a StopIteration that means "it's okay, I've finalized myself" and "I really mean to return normally here".
Well, there is a way to tell the difference - if we just threw GeneratorExit in, then it finalised itself, otherwise it is finishing normally. The only question is what to do in the outer scope in the first case.

1. Accept the StopIteration as a normal termination of the subiterator and continue execution of the delegating generator instead of finalising it. This is very bad as it will lead to any generator that yields again after a yield from expression almost certainly being broken [1].
2. Reraise the original GeneratorExit.
3. Reraise the subiterator's StopIteration exception.
4. Return immediately from the delegating generator.

I actually quite like option 4, as I believe it best reflects what the subiterator has done by trapping GeneratorExit and turning it into "normal" termination of the subiterator, without creating a situation where generators that use yield from are likely to accidentally ignore GeneratorExit.

Cheers, Nick.

[1] By "broken" in this context, I mean "close() will raise RuntimeError", as would occur if Jacob's example used "yield val" instead of "print val", or as occurs in the following normal generator:
    >>> def gen():
    ...     try:
    ...         yield
    ...     except GeneratorExit:
    ...         pass
    ...     yield
    ...
    >>> g = gen()
    >>> g.next()
    >>> g.close()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    RuntimeError: generator ignored GeneratorExit
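The same session can be reproduced as a script in Python 3 syntax. This is just a sketch of the interactive example above, with the error message captured in a hypothetical variable named message instead of being shown as a traceback.

```python
def gen():
    try:
        yield
    except GeneratorExit:
        pass
    # Yielding again after GeneratorExit is what makes this generator "broken"
    yield


g = gen()
next(g)
try:
    g.close()
except RuntimeError as exc:
    message = str(exc)
else:
    message = None

print(message)
```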
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
Greg Ewing wrote:
Nick Coghlan wrote:
Generators that catch and do anything with GeneratorExit other than turn it into StopIteration are almost always going to be broken - the new expression needs to avoid making it easy to do that accidentally.
However, as this example shows, the suggested solution of reraising GeneratorExit is not viable because it violates the inlining principle.
The basic problem is that there's no way of telling the difference between a StopIteration that means "it's okay, I've finalized myself" and "I really mean to return normally here".
Would it be possible to attach the current exception (if any) to the StopIteration/GeneratorReturn raised by a return statement in a finally clause? (Using the __traceback__ and __cause__ attributes from PEP-3134) Then the PEP expansion could check for and reraise the attached exception. Now that I think about it, this is almost required by the inlining/refactoring principle. Consider this example:

    def inner():
        try:
            yield 1
            yield 2
            yield 3
        finally:
            return 'VALUE'

    def outer():
        val = yield from inner()
        print val

Which I think should be equivalent to:

    def outer():
        try:
            yield 1
            yield 2
            yield 3
        finally:
            val = 'VALUE'
        print val

The problem is that any exception thrown into inner is converted to a GeneratorReturn, which is then swallowed by the yield-from instead of being reraised. - Jacob
Jacob Holm wrote:
Would it be possible to attach the current exception (if any) to the StopIteration/GeneratorReturn raised by a return statement in a finally clause? (Using the __traceback__ and __cause__ attributes from PEP-3134) Then the PEP expansion could check for and reraise the attached exception.
Based on that idea, here is the 3.0-based expansion I propose:

    _i = iter(EXPR)
    try:
        _t = None
        _y = next(_i)
        while 1:
            try:
                _s = yield _y
            except BaseException as _e:
                _t = _e
                _m = getattr(_i, 'throw', None)
                if _m is None:
                    raise
                _y = _m(_t)
            else:
                _t = None
                if _s is None:
                    _y = next(_i)
                else:
                    _y = _i.send(_s)
    except StopIteration as _e:
        if _e is _t:
            # If _e is the exception that we have just thrown to the subiterator, reraise it.
            if _m is None:
                # If there was no "throw" method, explicitly close the iterator before reraising.
                _m = getattr(_i, 'close', None)
                if _m is not None:
                    _m()
            raise
        if _e.__cause__ is not None:
            # If the return was from inside a finally clause with an active exception, reraise that exception.
            raise _e.__cause__
        # Normal return
        RESULT = _e.value

I have moved the code around a bit to use fewer try blocks while preserving semantics, then removed the check for GeneratorExit and added a different check for __cause__. Even if the __cause__ idea is shot down, I think I prefer the way this expansion reads. It makes it easier to see at a glance what is part of the loop and what is part of the cleanup. What do you think? - Jacob
Jacob Holm wrote:
The problem is that any exception thrown into inner is converted to a GeneratorReturn, which is then swallowed by the yield-from instead of being reraised.
That actually only happens if inner *catches and suppresses* the thrown in exception. Otherwise throw() will reraise the original exception automatically:
    >>> def gen():
    ...     try:
    ...         yield
    ...     except:
    ...         print "Suppressed"
    ...
    >>> g = gen()
    >>> g.next()
    >>> g.throw(AssertionError)
    Suppressed
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    StopIteration
    >>> def gen():
    ...     try:
    ...         yield
    ...     finally:
    ...         print "Not suppressed"
    ...
    >>> g = gen()
    >>> g.next()
    >>> g.throw(AssertionError)
    Not suppressed
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 3, in gen
    AssertionError
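A script version of the two sessions above, in Python 3 syntax, may make the contrast easier to check. The helper outcome and the events list are hypothetical additions for illustration; the two generators mirror the ones shown.

```python
events = []


def suppressing():
    try:
        yield
    except:  # bare except, as in the session above: swallows the thrown-in error
        events.append("Suppressed")


def passing_through():
    try:
        yield
    finally:  # runs cleanup but lets the exception continue
        events.append("Not suppressed")


def outcome(gen_func):
    """Return the name of the exception that throw() raises to the caller."""
    g = gen_func()
    next(g)
    try:
        g.throw(AssertionError)
    except BaseException as exc:
        return type(exc).__name__
    return None


print(outcome(suppressing), outcome(passing_through))
```

The first generator finishes normally after swallowing the error, so the caller sees StopIteration; the second lets the original AssertionError through.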
Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
Nick Coghlan wrote:
Jacob Holm wrote:
The problem is that any exception thrown into inner is converted to a GeneratorReturn, which is then swallowed by the yield-from instead of being reraised.
That actually only happens if inner *catches and suppresses* the thrown in exception. Having a return in the finally clause like in my example is sufficient to suppress the exception.
Otherwise throw() will reraise the original exception automatically:
I am not sure what your point is. Yes, this is a corner case. I am trying to make sure we have the corner cases working as well. In the example I gave I think it was pretty clear what should happen according to the inlining principle. The suppression of the initial exception is an accidental side effect of the refactoring. It looks to me like using the __cause__ attribute on the GeneratorReturn will allow us to reraise the exception. This seems like exactly the kind of thing that the __cause__ and __context__ attributes from PEP 3134 were designed for. - Jacob
Jacob Holm wrote:
Nick Coghlan wrote:
Jacob Holm wrote:
The problem is that any exception thrown into inner is converted to a GeneratorReturn, which is then swallowed by the yield-from instead of being reraised.
That actually only happens if inner *catches and suppresses* the thrown in exception. Having a return in the finally clause like in my example is sufficient to suppress the exception.
Ah, I did miss that - I think it just means the code has been refactored incorrectly though.
Otherwise throw() will reraise the original exception automatically:
I am not sure what your point is. Yes, this is a corner case. I am trying to make sure we have the corner cases working as well.
I think the refactoring is buggy, because it has changed the code from leaving exceptions alone to suppressing them. Consider what it would mean to do the same refactoring with normal functions:

    def inner():
        try:
            perform_operation()
        finally:
            return 'VALUE'

    def outer():
        val = inner()
        print val

That code does NOT do the same thing as:

    def outer():
        try:
            perform_operation()
        finally:
            val = 'VALUE'
        print val

A better refactoring would keep the return outside the finally clause in the inner generator:

Either:

    def inner():
        try:
            yield 1
            yield 2
            yield 3
        finally:
            val = 'VALUE'
        return val

Or else:

    def inner():
        try:
            yield 1
            yield 2
            yield 3
        finally:
            pass
        return 'VALUE'

Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
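The point about ordinary functions can be checked with a short Python 3 sketch. Here perform_operation is a hypothetical stand-in that always fails, and the two outer variants are renamed outer_calling and outer_inlined so both can live in one script. Note that recent Python versions may emit a SyntaxWarning for a return inside a finally clause, which is the very pattern under discussion.

```python
def perform_operation():
    # Hypothetical operation that always fails, to expose the difference
    raise ValueError("operation failed")


def inner():
    try:
        perform_operation()
    finally:
        return 'VALUE'  # the return discards the in-flight ValueError


def outer_calling():
    return inner()


def outer_inlined():
    try:
        perform_operation()
    finally:
        val = 'VALUE'  # plain assignment: the ValueError keeps propagating
    return val


print(outer_calling())
try:
    outer_inlined()
except ValueError as exc:
    print("inlined version raised:", exc)
```

The calling version silently returns 'VALUE', while the inlined version lets the error escape, which is why the two are not equivalent.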
Nick Coghlan wrote:
Jacob Holm wrote:
Having a return in the finally clause like in my example is sufficient to suppress the exception.
Ah, I did miss that - I think it just means the code has been refactored incorrectly though.
Ok
I think the refactoring is buggy, because it has changed the code from leaving exceptions alone to suppressing them. Consider what it would mean to do the same refactoring with normal functions:
    def inner():
        try:
            perform_operation()
        finally:
            return 'VALUE'

    def outer():
        val = inner()
        print val

That code does NOT do the same thing as:

    def outer():
        try:
            perform_operation()
        finally:
            val = 'VALUE'
        print val
Good point. Based on this observation, I withdraw the proposal about storing the active exception on the GeneratorReturn and reraising it in yield-from. I still think we should get rid of the check for GeneratorExit, because of the other example I gave. - Jacob
The problem of how to handle GeneratorExit doesn't seem to have any entirely satisfactory solution. On the one hand, the inlining principle requires that we never re-raise it if the subgenerator turns it into a StopIteration (or GeneratorReturn). On the other hand, not re-raising it means that a broken generator can easily result from innocuously combining two things that are individually legitimate. I think we just have to accept this, and state that refactoring only preserves semantics as long as the code block being factored out does not catch GeneratorExit without re-raising it. Then we're free to always re-raise GeneratorExit and prevent broken generators from occurring. I'm inclined to think this situation is a symptom that the idea of being able to catch GeneratorExit at all is flawed. If generator finalization were implemented by means of a forced return, or something equally uncatchable, instead of an exception, we wouldn't have so much of a problem. Earlier I said that I thought GeneratorExit was best regarded as an implementation detail of generators. I'd like to strengthen that statement and say that it should be considered a detail of the *present* implementation of generators, subject to change in future or alternate Pythons. Related to that, I'm starting to come back to my original instinct that GeneratorExit should not be thrown into the subiterator at all. Rather, it should be taken as an indication that the delegating generator is being finalized, and the subiterator's close() method called if it has one. Then there's never any question about whether to re-raise it -- we should always do so. -- Greg
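The finalization rule proposed in the last paragraph can be sketched as an ordinary Python 3 generator that delegates by hand. This is only an illustration under stated assumptions, not the PEP's expansion: the function name delegate and the test generators are hypothetical, and the handling of a thrown-in StopIteration is deliberately left simple.

```python
def delegate(subiterable):
    """Hand-written delegation: GeneratorExit closes the subiterator
    and is always re-raised; other exceptions are forwarded to throw()."""
    it = iter(subiterable)
    try:
        y = next(it)
    except StopIteration as e:
        return e.value
    while True:
        try:
            s = yield y
        except GeneratorExit:
            # Never thrown into the subiterator: just close it, then re-raise
            close = getattr(it, "close", None)
            if close is not None:
                close()
            raise
        except BaseException as exc:
            throw = getattr(it, "throw", None)
            if throw is None:
                raise
            try:
                y = throw(exc)
            except StopIteration as e:
                return e.value
        else:
            try:
                y = next(it) if s is None else it.send(s)
            except StopIteration as e:
                return e.value


finalized = []


def inner():
    try:
        yield 1
        yield 2
    finally:
        finalized.append("inner finalized")


g = delegate(inner())
next(g)
g.close()
print(finalized)
```

Because the GeneratorExit branch always re-raises, the question of whether the subiterator swallowed the exception never reaches the delegating generator.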
Greg Ewing wrote:
I'm inclined to think this situation is a symptom that the idea of being able to catch GeneratorExit at all is flawed. If generator finalization were implemented by means of a forced return, or something equally uncatchable, instead of an exception, we wouldn't have so much of a problem.
Well, in theory people are meant to be writing "except Exception:" rather than using a bare except or catching BaseException - that's a big part of the reason SystemExit, KeyboardInterrupt and GeneratorExit *aren't* Exception subclasses.
Related to that, I'm starting to come back to my original instinct that GeneratorExit should not be thrown into the subiterator at all. Rather, it should be taken as an indication that the delegating generator is being finalized, and the subiterator's close() method called if it has one. Then there's never any question about whether to re-raise it -- we should always do so.
I think that's a simpler finalisation rule to remember, so I'd be fine with that approach. I don't think we're going to be able to completely eliminate the tricky subtleties from this expression, but we can at least try to keep them as simple as possible. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
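The remark about "except Exception:" can be illustrated with a minimal Python 3 sketch. The generator careful and the handled list are hypothetical; the point is only that a handler for Exception leaves GeneratorExit alone, so close() still works.

```python
handled = []


def careful():
    try:
        yield 1
    except Exception:
        # Catches ordinary errors only; GeneratorExit passes straight through
        handled.append("ordinary error")
    yield 2


# GeneratorExit sits outside the Exception hierarchy
print(issubclass(GeneratorExit, Exception), issubclass(GeneratorExit, BaseException))

g = careful()
next(g)
g.close()  # no RuntimeError: the generator never sees a chance to yield again

h = careful()
next(h)
value_after_error = h.throw(ValueError)  # caught by the handler, generator continues
print(handled, value_after_error)
```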
Nick Coghlan wrote:
Well, in theory people are meant to be writing "except Exception:" rather than using a bare except or catching BaseException - that's a big part of the reason SystemExit, KeyboardInterrupt and GeneratorExit *aren't* Exception subclasses.
Yes, it probably isn't something people will do very often. But as long as GeneratorExit is documented as an official part of the language, we need to explain how we're dealing with it. BTW, how official *is* it meant to be? There seems to be very little said about it in either the Language or Library Reference. The Library Ref says it's the "exception raised when a generator's close() method is called". The Language Ref says that the close() method "allows finally clauses to run", but doesn't say how that is accomplished. And I can't find throw() mentioned anywhere! -- Greg
On Mon, Mar 30, 2009 at 5:12 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Nick Coghlan wrote:
Well, in theory people are meant to be writing "except Exception:" rather than using a bare except or catching BaseException - that's a big part of the reason SystemExit, KeyboardInterrupt and GeneratorExit *aren't* Exception subclasses.
Yes, it probably isn't something people will do very often. But as long as GeneratorExit is documented as an official part of the language, we need to explain how we're dealing with it.
BTW, how official *is* it meant to be? There seems to be very little said about it in either the Language or Library Reference.
That's one of our many doc bugs. (Maybe someone at the PyCon sprints can fix these?) PEP 342 defines GeneratorExit, inheriting from Exception. However a later change to the code base made it inherit from BaseException.
The Library Ref says it's the "exception raised when a generator's close() method is called". The Language Ref says that the close() method "allows finally clauses to run", but doesn't say how that is accomplished.
And I can't find throw() mentioned anywhere!
Also defined in PEP 342. -- --Guido van Rossum (home page: http://www.python.org/~guido/)
Greg Ewing wrote:
Nick Coghlan wrote:
Well, in theory people are meant to be writing "except Exception:" rather than using a bare except or catching BaseException - that's a big part of the reason SystemExit, KeyboardInterrupt and GeneratorExit *aren't* Exception subclasses.
Yes, it probably isn't something people will do very often. But as long as GeneratorExit is documented as an official part of the language, we need to explain how we're dealing with it.
As my last (flawed) example shows, it is easy to accidentally convert the GeneratorExit (along with any other uncaught exception) to a StopIteration if you are using a finally clause. You don't need to explicitly catch anything. Code that does this should be considered broken. Not so much because it is swallowing GeneratorExit, but because it swallows *any* exception. I don't think we should add special cases to the yield-from semantics to cater for broken code.
I even think it might have been a mistake in PEP 342 to let close swallow StopIteration. It might have been better if a throw to an already-closed generator just raised the thrown exception, and close only swallowed GeneratorExit. That way, you would quickly discover that the generator was swallowing exceptions because a call to close would cause a StopIteration. With that definition, we would consider any generator that did not (under normal conditions) raise GeneratorExit when thrown a GeneratorExit to be broken. Had that been the definition, I think we would long ago have agreed to let yield-from treat GeneratorExit like any other exception. Unfortunately that is not how things work, and I am afraid that changing it would "break" too much code. I put "break" in quotes, because I think most such code is already broken in the sense that it can swallow exceptions that it shouldn't, such as KeyboardInterrupt and SystemExit. Even without changing throw and close, I still think we should forward GeneratorExit like any other exception, and not do anything special to reraise it or call close on the subiterator. To me that sounds like the cleaner solution, and it is what the inlining principle suggests. It is unfortunate that you have to be a bit more careful about not swallowing GeneratorExit, but I think that care is needed anyway to avoid swallowing other exceptions as well.
BTW, how official *is* it meant to be? There seems to be very little said about it in either the Language or Library Reference.
The Library Ref says it's the "exception raised when a generator's close() method is called". The Language Ref says that the close() method "allows finally clauses to run", but doesn't say how that is accomplished.
And I can't find throw() mentioned anywhere!
All the generator methods are described here: http://docs.python.org/reference/expressions.html#yield-expressions - Jacob
Jacob Holm wrote:
Even without changing throw and close, I still think we should forward GeneratorExit like any other exception, and not do anything special to reraise it or call close on the subiterator.
But that allows you to inadvertently create a broken generator by calling another generator that, according to the rules you've just acknowledged we can't change, is behaving correctly. Asking users not to call such generators would require them to have knowledge about the implementation of every generator they call, which I don't think is acceptable. -- Greg
Greg Ewing wrote:
Jacob Holm wrote:
Even without changing throw and close, I still think we should forward GeneratorExit like any other exception, and not do anything special to reraise it or call close on the subiterator.
But that allows you to inadvertently create a broken generator by calling another generator that, according to the rules you've just acknowledged we can't change, is behaving correctly.
According to the rules for generator finalization it might behave correctly. However, in most cases this will be code that is breaking the rule about not catching KeyboardInterrupt and SystemExit. This is broken code IMNSHO, and I don't think we should complicate the yield-from expression to cater for it. Yes there might be existing code that is not broken even by that standard and that still converts GeneratorExit to StopIteration. I don't think that is common enough that we have to care. If you use such a generator in a yield-from expression, you will get a RuntimeError('generator ignored GeneratorExit') on close, telling you that something is wrong.
Asking users not to call such generators would require them to have knowledge about the implementation of every generator they call, which I don't think is acceptable.
I think that getting a RuntimeError on close is sufficient indication that such a generator should not be used in yield-from. That said, I don't really care much either way. Both versions are acceptable to me, and it is your PEP. - Jacob
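The failure mode being argued about can be reproduced in Python 3 without the new syntax. The sketch below assumes a hand-written stand-in for "yield from" that forwards every thrown exception, GeneratorExit included, and never reraises it; `outer_forwarding` is a hypothetical name and this is not the PEP's expansion.

```python
def inner():
    try:
        yield 1
        yield 2
    except GeneratorExit:
        return  # converts GeneratorExit into StopIteration

def outer_forwarding():
    # Stand-in for "yield from inner()" that forwards GeneratorExit
    # like any other exception and does not reraise it.
    sub = inner()
    try:
        value = next(sub)
        while True:
            try:
                yield value
            except BaseException as exc:
                value = sub.throw(exc)
            else:
                value = next(sub)
    except StopIteration:
        pass
    yield "after"  # the outer generator yields again after delegating

g = outer_forwarding()
first = next(g)
try:
    g.close()
    message = None
except RuntimeError as err:
    message = str(err)
```

Here inner() is well behaved by the PEP 342 rules, yet close() on the outer generator fails, and the error names neither inner() nor the delegation as the cause.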
Jacob Holm wrote:
in most cases this will be code that is breaking the rule about not catching KeyboardInterrupt and SystemExit.
Not necessarily, it could be doing
    except GeneratorExit: return
If you use such a generator in a yield-from expression, you will get a RuntimeError('generator ignored GeneratorExit') on close, telling you that something is wrong.
But it won't be at all clear *what* is wrong or what to do about it. The caller is making a perfectly ordinary yield-from call, and he's calling what looks to all the world like a perfectly well-behaved iterator. Where's the mistake? Remember that the generator being called may have been written by someone else. The caller may not know anything about its internals or be in a position to fix them if he did.
I think that getting a RuntimeError on close is sufficient indication that such a generator should not be used in yield-from.
But it's a perfectly valid generator by current standards. I don't want to declare some existing class of generators as being second-class citizens with respect to yield-from, especially based on some internal implementation detail unknowable to its caller. -- Greg
Greg Ewing wrote:
Jacob Holm wrote:
in most cases this will be code that is breaking the rule about not catching KeyboardInterrupt and SystemExit.
Not necessarily, it could be doing
except GeneratorExit: return
I said *most* cases, not all. I don't have any proof of this, just a gut feeling that the majority of generators that convert GeneratorExit to StopIteration do so because they are using a return in a finally clause.
If you use such a generator in a yield-from expression, you will get a RuntimeError('generator ignored GeneratorExit') on close, telling you that something is wrong.
But it won't be at all clear *what* is wrong or what to do about it. The caller is making a perfectly ordinary yield-from call, and he's calling what looks to all the world like a perfectly well-behaved iterator. Where's the mistake?
If this was documented in the PEP, I would say the mistake was in using such a generator in yield-from that wasn't the final yield. Note that it is perfectly ok to use such a generator in a yield-from as long as no outer generator yields afterwards.
Remember that the generator being called may have been written by someone else. The caller may not know anything about its internals or be in a position to fix them if he did.
Right, that makes it harder to fix the source of the problem.
I think that getting a RuntimeError on close is sufficient indication that such a generator should not be used in yield-from.
But it's a perfectly valid generator by current standards. I don't want to declare some existing class of generators as being second-class citizens with respect to yield-from, especially based on some internal implementation detail unknowable to its caller.
I get that. As I see it we have the following options, listed in my order of preference:

1. Don't throw GeneratorExit to the subiterator but raise it in the outer generator, and don't explicitly call close. This is the only version where sharing a subgenerator does not require special care. It has the problem that it behaves differently in refcounting and non-refcounting implementations due to the implicit close that would happen after the yield-from in refcounting implementations. It also breaks the inlining principle in the case of throw(GeneratorExit).

2. Do throw GeneratorExit and don't try to reraise it. This is the version that most closely follows the inlining principle. It has the problem that generators that convert GeneratorExit to StopIteration can only be used in a yield-from if none of the outer generators do a yield afterwards. Breaking this rule gives a RuntimeError('generator ignored GeneratorExit') on close.

3. Do throw GeneratorExit to the subiterator, and explicitly reraise it if it was converted to a StopIteration. It has the problem that it breaks the inlining principle for generators that convert GeneratorExit to StopIteration.

4. Don't throw GeneratorExit to the subiterator, instead explicitly call close before raising it in the outer generator. This is the behavior that #1 would have for non-shared generators in a refcounting implementation. Same problem as #3 and hides the GeneratorExit from non-generators.

My guess is that your preference is more like 4, 3, 2, 1. #3 is closest to what is in the current PEP, and is probably what it meant to say. (The PEP checks if the thrown exception was GeneratorExit, then does a bare raise instead of raising the thrown exception).

- Jacob
Jacob Holm wrote:
My guess is that your preference is more like 4, 3, 2, 1. #3 is closest to what is in the current PEP, and is probably what it meant to say. (The PEP checks if the thrown exception was GeneratorExit, then does a bare raise instead of raising the thrown exception).
4, 3, 2, 1 is the position I've come around to. Since using send(), throw() and close() on a shared subiterator doesn't make any sense, and the whole advantage of the new expression over a for loop is to make it easy to delegate send() throw() and close() correctly, I now believe that shared subiterators are best handled by actually *iterating* over them in a for loop rather than by delegating to them with "yield from".

So the fact that a definition of yield from that provides prompt finalisation guarantees isn't friendly to using it with shared subiterators is actually now a *bonus* in my book - it should hopefully serve as a hint to developers that they're misusing the tool.

By adopting position 4, I believe the guarantees for the exception handling in the new expression become as simple as possible:

- if the subiterator does not provide a throw() method, or the exception thrown in is GeneratorExit, then the subiterator's close() method (if any) is called and the thrown in exception raised in the current frame
- otherwise, the exception (including traceback) is passed down to the subiterator's throw() method

With these semantics, subiterators will be finalised promptly when the outermost generator is finalised without any special effort on the developer's part and it won't be trivially easy to accidentally suppress GeneratorExit.

To my mind, the practical benefits of such an approach are enough to justify the deviation from the general 'inline behaviour' guideline.

Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
---------------------------------------------------------------
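The two guarantees listed for position 4 can be approximated in present-day Python 3 with a hand-written delegating generator. This is only a sketch under stated assumptions: `delegate`, `Counter` and `tolerant` are hypothetical names, and corner cases such as a thrown-in StopIteration or a subiterator without send() are not handled.

```python
def delegate(iterable):
    sub = iter(iterable)
    try:
        value = next(sub)
        while True:
            try:
                sent = yield value
            except BaseException as exc:
                if isinstance(exc, GeneratorExit):
                    throw = None  # never forward GeneratorExit
                else:
                    throw = getattr(sub, "throw", None)
                if throw is None:
                    # Close the subiterator (if possible), then reraise
                    # the thrown-in exception in the current frame.
                    close = getattr(sub, "close", None)
                    if close is not None:
                        close()
                    raise
                value = throw(exc)  # pass the exception down
            else:
                value = next(sub) if sent is None else sub.send(sent)
    except StopIteration as stop:
        return stop.value

class Counter:
    """Plain iterator with close() but no throw() or send()."""
    def __init__(self):
        self.n = 0
        self.closed = False
    def __iter__(self):
        return self
    def __next__(self):
        self.n += 1
        return self.n
    def close(self):
        self.closed = True

def tolerant():
    while True:
        try:
            yield "waiting"
        except KeyError:
            yield "handled"

counter = Counter()
g = delegate(counter)
next(g)
try:
    g.throw(ValueError("Counter has no throw()"))
except ValueError:
    pass  # reraised in the delegating frame after counter.close()

h = delegate(tolerant())
next(h)
forwarded = h.throw(KeyError("k"))  # passed down to tolerant().throw()
h.close()                           # closes tolerant(), then finalises h
```

In the first case the subiterator is closed and the ValueError surfaces in the caller; in the second the KeyError is handled by the subgenerator and its next yielded value is returned from throw().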
Nick Coghlan wrote:
4, 3, 2, 1 is the position I've come around to. [...]
What he said.
I think that 4 also has the advantage of raising RuntimeError in the inner generator's close method (using the definition of close provided in PEP 342) when the inner generator doesn't obey the rules for GeneratorExit laid out in PEP 342. Throwing GeneratorExit to the inner generator causes the outer generator's close to report the RuntimeError, which pins the blame on the wrong generator (in the stack traceback, which won't even show the inner generator). -bruce frederiksen
Nick Coghlan wrote:
4, 3, 2, 1 is the position I've come around to.
[...snip...]
By adopting position 4, I believe the guarantees for the exception handling in the new expression become as simple as possible:
- if the subiterator does not provide a throw() method, or the exception thrown in is GeneratorExit, then the subiterator's close() method (if any) is called and the thrown in exception raised in the current frame
- otherwise, the exception (including traceback) is passed down to the subiterator's throw() method
Below I have attached a heavily annotated version of the expansion that I expect for #4. This version fixes an issue I have forgotten to mention where the subiterator is not closed due to an AttributeError caused by a missing send method.
With these semantics, subiterators will be finalised promptly when the outermost generator is finalised without any special effort on the developer's part and it won't be trivially easy to accidentally suppress GeneratorExit.
The way I see it, it will actually be hard to do even on purpose, unless you are willing to take a significant performance hit by using a non-generator wrapper for every generator.
To my mind, the practical benefits of such an approach are enough to justify the deviation from the general 'inline behaviour' guideline.
I disagree, but it seems like I am the only one here that does. It will eliminate a potential pitfall, but will also remove some behavior that could have been useful, such as the ability to suppress the GeneratorExit if you know what you are doing.

- Jacob

------------------------------------------------------------------------

    _i = iter(EXPR)  # Raises TypeError if not an iterable.
    try:
        _x = None  # No current exception.
        _y = _i.__next__()  # Guaranteed to be there by iter().
        while 1:
            try:
                _s = yield _y
            except BaseException as _e:
                # An exception was thrown in, either by a call to throw() on
                # the generator or implicitly by a call to close().
                _x = _e  # Save the thrown-in exception as current.
                if isinstance(_x, GeneratorExit):
                    _m = None  # Don't forward GeneratorExit.
                else:
                    # Forward any other exception if there is a throw() method.
                    _m = getattr(_i, 'throw', None)
                if _m is None:
                    # Not forwarding. Exit loop and go to finally clause
                    # (possibly via "except StopIteration"), which will close
                    # _i before reraising _x.
                    raise
                _y = _m(_x)
            else:
                if _s is None:
                    # Either a send(None) or a __next__(), forward as __next__().
                    _x = None  # No current exception.
                    _y = _i.__next__()  # Guaranteed to be there by iter().
                else:
                    # A send(non-None). We need to handle the case where the
                    # subiterator has no send() method.
                    try:
                        _m = _i.send
                    except AttributeError as _e:
                        # No send method. Ensure that the subiterator is
                        # closed, then reraise the AttributeError.
                        _x = _e  # Save the AttributeError as the current exception.
                        _m = None  # Clear _m so we know _x has not been forwarded.
                        # Exit loop and go to finally clause, which will close
                        # _i before reraising _x.
                        raise
                    else:
                        _x = None  # No current exception.
                        _y = _m(_s)
    except StopIteration as _e:
        if _e is _x:
            # If _e was just thrown in, reraise it. If the exception has been
            # forwarded to the subiterator, the subiterator is assumed closed.
            # In that case _m will be non-None, so the subiterator will not be
            # closed again by the finally clause. Conversely, if the exception
            # was not forwarded _m will be None and the finally clause takes
            # care of closing it before reraising the exception.
            raise
        # Normal return. If we get here, the StopIteration was raised by a
        # __next__(), send() or throw() on the subiterator which will
        # therefore already be closed. In this case either _x is None or _m
        # is not None, so the subiterator will not be closed again by the
        # finally clause.
        RESULT = _e.value
    finally:
        if _x is not None and _m is None:
            # An exception is active and was not raised by the subiterator.
            # Explicitly call close before the exception is automatically
            # reraised by the finally clause. If close raises an exception,
            # that will take over.
            _m = getattr(_i, 'close', None)
            if _m is not None:
                _m()
Jacob Holm wrote:
will also remove some behavior that could have been useful, such as the ability to suppress the GeneratorExit if you know what you are doing.
I'm not convinced there are any use cases for suppressing GeneratorExit in the first place. Can you provide an example that couldn't be easily done some other way? -- Greg
Greg Ewing wrote:
Jacob Holm wrote:
will also remove some behavior that could have been useful, such as the ability to suppress the GeneratorExit if you know what you are doing.
I'm not convinced there are any use cases for suppressing GeneratorExit in the first place. Can you provide an example that couldn't be easily done some other way?
I don't have any real use cases, just a few examples of things you can do in #2 that become a bit uglier in #3 or #4. This:

    def inner():
        try:
            for i in xrange(10):
                yield i
        except GeneratorExit:
            return i
        return "all"

    def outer():
        val = yield from inner()
        print val
        return val

Does not behave like you would expect because the "return i" is swallowed by the call to inner.close() (or is it?) and the "print val" and "return val" statements are skipped due to the reraised GeneratorExit. To get a value out of the generator being closed you need to raise and catch your own exception:

    class Return(Exception):
        pass

    def inner():
        try:
            for i in xrange(10):
                yield i
        except GeneratorExit:
            raise Return(i)
        return "all"

    def outer():
        try:
            val = yield from inner()
        except Return as r:
            val = r.args[0]
        print val
        return val

This is certainly doable, but ugly compared to the version using return. Here is an idea that would help this a little bit. We could change close to return the value (if any) returned by the generator, and then attach that value to the reraised GeneratorExit in the yield-from expansion. That would allow the example to be rewritten as:

    def inner():
        try:
            for i in xrange(10):
                yield i
        except GeneratorExit:
            return i
        return "all"

    def outer():
        try:
            val = yield from inner()
        except GeneratorExit as e:
            val = e.value
        print val
        return val

Which I think is much nicer.

- Jacob
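For comparison, the first example can be run on Python 3.3 or later, where "yield from" shipped with semantics close to the reraise variants discussed here: closing the outer generator closes the subgenerator and then raises GeneratorExit in the outer frame. The sketch below is a Python 3 port; the `events` list is an addition made for the illustration.

```python
events = []

def inner():
    try:
        for i in range(10):
            yield i
    except GeneratorExit:
        events.append("inner saw GeneratorExit")
        return i
    return "all"

def outer():
    val = yield from inner()
    events.append("outer got %r" % (val,))  # skipped when outer is closed
    return val

g = outer()
next(g)
next(g)
g.close()  # inner's "return i" never reaches outer

events_after_close = list(events)

values = list(outer())  # an uninterrupted run does deliver "all"
```

After close() only the inner generator has recorded anything; the assignment to val and the statements after it run only when the delegation finishes normally.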
Jacob Holm wrote:
Greg Ewing wrote:
Jacob Holm wrote:
will also remove some behavior that could have been useful, such as the ability to suppress the GeneratorExit if you know what you are doing.
I'm not convinced there are any use cases for suppressing GeneratorExit in the first place. Can you provide an example that couldn't be easily done some other way?
I don't have any real use cases, just a few examples of things you can do in #2 that become a bit uglier in #3 or #4.
You appear to be thinking of GeneratorExit as a way to ask a generator to finish normally such that it still makes sense to try to return a value after a GeneratorExit has been thrown in to the current frame, but that really isn't its role. Instead, it's more of an "Abandon Ship! Abandon Ship! All hands to the lifeboats!" indication that gives the generator a chance to release any resources it might be holding and bail out. The reason that close() accepts a StopIteration as well as a GeneratorExit is that the former still indicates that the generator has finalised itself, so the objective of calling close() has been achieved and there is no need to report an error. Any code that catches GeneratorExit without reraising it is highly suspect, just like code that suppresses SystemExit and KeyboardInterrupt. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
Nick Coghlan wrote:
Jacob Holm wrote:
Greg Ewing wrote:
Jacob Holm wrote:
will also remove some behavior that could have been useful, such as the ability to suppress the GeneratorExit if you know what you are doing.
I'm not convinced there are any use cases for suppressing GeneratorExit in the first place. Can you provide an example that couldn't be easily done some other way?
I don't have any real use cases, just a few examples of things you can do in #2 that become a bit uglier in #3 or #4.
You appear to be thinking of GeneratorExit as a way to ask a generator to finish normally such that it still makes sense to try to return a value after a GeneratorExit has been thrown in to the current frame,
Yes. I am thinking that when using this for refactoring, there are likely to be cases where the closing generator needs to provide some final piece of information to its caller so that the caller can do *its* finalization. Using return for that purpose has a number of control flow advantages. If you insist we shouldn't use return for this, we should make close raise a RuntimeError like this:

    def close(self):
        try:
            self.throw(GeneratorExit)
        except StopIteration, e:
            if e.value is not None:
                raise RuntimeError('generator responded to GeneratorExit by returning with a value')
        except GeneratorExit:
            pass
        else:
            raise RuntimeError('generator ignored GeneratorExit')

Of course I would prefer to use "return e.value" instead of the first RuntimeError, because that seems like the obvious thing to expect when you close a generator containing "try..except GeneratorExit: return value". And once we have close returning a value, it would be nice to have access to that value in the context of the yield-from expression. Attaching it to the GeneratorExit (re)raised by yield-from seems like the only logical choice. As my third code fragment showed, you could then explicitly recatch the GeneratorExit and get the value there.
but that really isn't its role.
Instead, it's more of an "Abandon Ship! Abandon Ship! All hands to the lifeboats!" indication that gives the generator a chance to release any resources it might be holding and bail out.
That might be the prevailing wisdom concerning GeneratorExit, at least partly based on the fact that the only way to communicate anything useful out of a closing generator is to raise another exception. Thinking a bit about coroutines, it would be nice to use "send" for the normal communication and "close" to shut it down and get a final result. Example:

    def averager():
        count = 0
        sum = 0
        while 1:
            try:
                val = (yield)
            except GeneratorExit:
                return sum/count
            else:
                sum += val
                count += 1

    avg = averager()
    avg.next()  # start coroutine
    avg.send(1.0)
    avg.send(2.0)
    print avg.close()  # prints 1.5

To do something similar today requires either a custom exception, or the use of special values to tell the generator to yield the result. I find this version a lot cleaner.
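The proposed "close returns the value" behaviour can be tried out in Python 3 with a small helper function. This is a sketch, assuming a helper named `close_with_result` (hypothetical, not an existing API) that mirrors the PEP 342 definition of close() but returns the value carried by StopIteration; `total` replaces `sum` only to avoid shadowing the builtin.

```python
def averager():
    count = 0
    total = 0.0
    while True:
        try:
            val = yield
        except GeneratorExit:
            return total / count
        else:
            total += val
            count += 1

def close_with_result(gen):
    """Hypothetical helper: like close(), but hand back the return value."""
    try:
        gen.throw(GeneratorExit)
    except StopIteration as stop:
        return stop.value  # the generator returned in response
    except GeneratorExit:
        return None        # the generator let GeneratorExit propagate
    raise RuntimeError("generator ignored GeneratorExit")

avg = averager()
next(avg)       # start coroutine
avg.send(1.0)
avg.send(2.0)
result = close_with_result(avg)
```

With the two values sent above, `result` is the average, 1.5, and no custom exception or sentinel value is needed.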
The reason that close() accepts a StopIteration as well as a GeneratorExit is that the former still indicates that the generator has finalised itself, so the objective of calling close() has been achieved and there is no need to report an error.
I have argued before that accepting StopIteration in close is likely to hide bugs in the closed generator, because the StopIteration may come from a return in a finally clause. However, since we *are* accepting StopIteration we might as well make it useful.
Any code that catches GeneratorExit without reraising it is highly suspect, just like code that suppresses SystemExit and KeyboardInterrupt.
Explicitly catching GeneratorExit and then returning is a valid use today that I wouldn't consider suspect. Catching GeneratorExit and then exiting the except block by other means than a raise or return is suspect, but has valid uses. Best regards, - Jacob
Jacob Holm wrote:
Explicitly catching GeneratorExit and then returning is a valid use today that I wouldn't consider suspect. Catching GeneratorExit and then exiting the except block by other means than a raise or return is suspect, but has valid uses.
What are these valid uses? The PEP 342 definition made some sense originally when GeneratorExit was a subclass of Exception and hence easy to suppress accidentally, but I have serious doubts about the validity of trapping it and turning it into StopIteration now that it has been moved out to inherit directly from BaseException. Regardless, unless Greg goes out of his way to change the meaning of close() in the PEP, GeneratorReturn will escape from close() (since that only traps StopIteration). That means you'll be able to catch that exception directly if you really want to, and if you don't it will bubble up out of the original close() call that was made on the outermost generator. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
Nick Coghlan wrote:
Jacob Holm wrote:
Explicitly catching GeneratorExit and then returning is a valid use today that I wouldn't consider suspect. Catching GeneratorExit and then exiting the except block by other means than a raise or return is suspect, but has valid uses.
What are these valid uses? The PEP 342 definition made some sense originally when GeneratorExit was a subclass of Exception and hence easy to suppress accidentally, but I have serious doubts about the validity of trapping it and turning it into StopIteration now that it has been moved out to inherit directly from BaseException.
When catching and returning, the control flow is different than if you were catching and raising. Also using "return" more clearly signals the intent to leave the generator than "raise". Even with GeneratorExit inheriting directly from BaseException, it is still easier to intercept an exception than a return. The use for catching and exiting the block normally is to share some code between the cases. Of course you need to be careful when you do that, but it can save some duplication. Both these uses are valid in the sense that the generators work as advertised and follow the rules of finalization as defined by PEP 342. I don't think a proposal for changing close to not accept StopIteration is going to fly.
Regardless, unless Greg goes out of his way to change the meaning of close() in the PEP, GeneratorReturn will escape from close() (since that only traps StopIteration). That means you'll be able to catch that exception directly if you really want to, and if you don't it will bubble up out of the original close() call that was made on the outermost generator.
Hmm. I had almost forgotten about the separate GeneratorReturn exception. It would be good to see how that changes things. So far I consider it a needless complication, but I would like to read a version of the PEP that includes it to see how bad it is.

As for GeneratorReturn not being caught by close(), I find it really strange if returning a non-None value as a response to GeneratorExit makes close() raise a GeneratorReturn, whereas returning None makes close finish without an exception. If you think returning a non-None value is an error, we should make it a (subclass of) RuntimeError rather than a GeneratorReturn to clearly indicate this.

I am strongly in favor of changing close to return the value rather than letting the GeneratorReturn pass through or raising a RuntimeError. I think the "averager" example I just gave is a good (but simplistic) example of the kind of code I would consider using coroutines for. The need to catch an exception would make that code a lot less readable, not to mention slower.

I am not about to write a separate PEP for this, but I would consider "return from generator" plus "close returns value returned from generator" to be a worthwhile addition in itself.

- Jacob
Jacob Holm wrote:
    def averager():
        count = 0
        sum = 0
        while 1:
            try:
                val = (yield)
            except GeneratorExit:
                return sum/count
            else:
                sum += val
                count += 1

    avg = averager()
    avg.next()  # start coroutine
    avg.send(1.0)
    avg.send(2.0)
    print avg.close()  # prints 1.5
But that's not how it works, unless you're asking Greg to change the PEP to allow that. And while it looks cute a single layer deep like that, it goes wrong as soon as you consider the fact that if you get a GeneratorReturn exception on close(), you *don't know* if that result came from the outer iterator. A better way to write that averager would be:

    def averager():
        # Works for Python 2.5+
        count = 0
        sum = 0
        while 1:
            val = (yield)
            if val is None:
                yield sum/count
                break
            sum += val
            count += 1

    avg = averager()
    avg.next()  # start coroutine
    avg.send(1.0)
    avg.send(2.0)
    print avg.send(None)

    1.5
Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
Nick Coghlan wrote:
Jacob Holm wrote:
    def averager():
        count = 0
        sum = 0
        while 1:
            try:
                val = (yield)
            except GeneratorExit:
                return sum/count
            else:
                sum += val
                count += 1

    avg = averager()
    avg.next()  # start coroutine
    avg.send(1.0)
    avg.send(2.0)
    print avg.close()  # prints 1.5
But that's not how it works, unless you're asking Greg to change the PEP to allow that.
I am most definitely asking Greg to change the PEP to allow that. Specifically I am asking for a clarification in the PEP of how GeneratorReturn/StopIteration is handled in close(), and requesting that we define close() to return the value rather than letting the GeneratorReturn be raised.
And while it looks cute a single layer deep like that, it goes wrong as soon as you consider the fact that if you get a GeneratorReturn exception on close(), you *don't know* if that result came from the outer iterator.
Using option #4 from the list I made of possible finalization strategies which is what most of you seemed to prefer, and assuming that close catches GeneratorReturn/StopIteration you *can* be sure. There is no way it could come from anywhere else in the yield-from stack. Of course you can raise the exception manually or call a function that does, but that is crazy code...
A better way to write that averager would be:
    def averager():
        # Works for Python 2.5+
        count = 0
        sum = 0
        while 1:
            val = (yield)
            if val is None:
                yield sum/count
                break
            sum += val
            count += 1

    avg = averager()
    avg.next()  # start coroutine
    avg.send(1.0)
    avg.send(2.0)
    print avg.send(None)

    1.5
Yes, I am aware that you can pass special values to send. I find this version less appealing than mine for at least the following reasons:

1. You need to use a magic "stop" value (in this case None).
2. You are using the same "send" method for two radically different purposes on the same object.
3. You need a separate "close" step to clean up afterwards (which you forgot).
4. You use "yield" for different purposes at different times (mostly input, then a single output).
5. I find the control flow in mine simpler to understand. It explicitly mentions GeneratorExit, and immediately returns. Yours must check for the magic "stop" value, yield a result, then break/return.

#1, 2 and 3 make the API of the averager object more complex than it needs to be. #4 is generally considered ugly, but is sometimes necessary. #5 is just a personal preference.

Best regards - Jacob
I've had another idea about this. Suppose the close() method of a generator didn't complain about reaching a yield after GeneratorExit is raised, but simply raised it again, and continued doing so until either a return occurred or an exception propagated out. Seems to me this couldn't do any harm to a well-behaved generator, since it has to be prepared to deal with a GeneratorExit arising from any of its yield points. Yield-from would then no longer have the potential to create broken generators, we wouldn't have to treat GeneratorExit differently from any other exception, and Jacob could have his subgenerators that return values when you close them. -- Greg
Greg Ewing wrote:
I've had another idea about this. Suppose the close() method of a generator didn't complain about reaching a yield after GeneratorExit is raised, but simply raised it again, and continued doing so until either a return occurred or an exception propagated out.
Seems to me this couldn't do any harm to a well-behaved generator, since it has to be prepared to deal with a GeneratorExit arising from any of its yield points.
Yield-from would then no longer have the potential to create broken generators, we wouldn't have to treat GeneratorExit differently from any other exception, and Jacob could have his subgenerators that return values when you close them.
I think I'd prefer to see some arbitrary limit (500 seems like a nice round number) on the number of times that GeneratorExit would be thrown before giving up and raising RuntimeError, just so truly broken generators that suppressed GeneratorExit in an infinite loop would eventually trigger an exception rather than just appearing to hang. The basic idea seems sound though (Jacob's averager example really was nicer than mine). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
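The repeated-throw idea, together with the suggested cap, can be sketched as a plain function in Python 3. `persistent_close` and `stubborn` are hypothetical names, the default of 500 is simply the number floated above, and returning the generator's value is the "close returns a value" variant discussed earlier rather than anything the PEP specifies.

```python
def persistent_close(gen, limit=500):
    """Keep throwing GeneratorExit until gen returns or raises, up to limit times."""
    for _ in range(limit):
        try:
            gen.throw(GeneratorExit)
        except StopIteration as stop:
            return stop.value  # the generator finally returned
        except GeneratorExit:
            return None        # the generator let GeneratorExit propagate
        # Otherwise the generator yielded again: throw once more.
    raise RuntimeError("generator ignored GeneratorExit %d times" % limit)

def stubborn():
    # Suppresses GeneratorExit twice before returning a value.
    try:
        yield 1
    except GeneratorExit:
        pass
    try:
        yield 2
    except GeneratorExit:
        pass
    return "done"

g = stubborn()
next(g)
outcome = persistent_close(g)  # the second throw makes the generator return

h = stubborn()
next(h)
try:
    persistent_close(h, limit=1)  # one throw is not enough here
    gave_up = False
except RuntimeError:
    gave_up = True
```

A generator that suppresses GeneratorExit in an endless loop would hit the limit and raise RuntimeError instead of hanging.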
Nick Coghlan wrote:
Greg Ewing wrote:
I've had another idea about this. Suppose the close() method of a generator didn't complain about reaching a yield after GeneratorExit is raised, but simply raised it again, and continued doing so until either a return occurred or an exception propagated out.
Seems to me this couldn't do any harm to a well-behaved generator, since it has to be prepared to deal with a GeneratorExit arising from any of its yield points.
It solves the return value thing, but introduces a change for existing generators. Well-behaved generators would not be affected, but there might be generators in real use that relied on the ability to ignore close or code using such generators that relied on getting the RuntimeError.

<sidetrack>
If there is a use case for ignoring close, that would be better served by another new idea I just had, the "yield raise" expression. The purpose of this would be to raise an exception in the caller of "next", "send", "throw" or "close" *without* finalizing the generator. Extending my "averager" example a bit:

    def averager(start=0):
        count = 0
        exc = None
        sum = start
        while 1:
            try:
                val = (yield) if exc is None else (yield raise exc)
            except GeneratorExit:
                return sum/count
            try:
                sum += val
            except BaseException as e:
                exc = e  # will be reraised by "yield raise" above
            else:
                exc = None
                count += 1

    avg = averager()
    avg.next()  # start coroutine
    avg.send(1.0)
    try:
        avg.send('')  # this raises a TypeError at the sum += val line, which is rerouted here by the yield raise
    except TypeError:
        pass
    avg.send(2.0)
    print avg.close()  # still prints 1.5

The above code would be the main use for the feature. However, a side benefit would be that a generator that wanted to raise an exception instead of closing could use a "yield raise OtherException" as response to GeneratorExit. I am not saying we should add the "yield raise" feature to the PEP, just that I think this would be a better way to handle the "don't close me" cases. (I am not sure how it would fit into the PEP anyway)
</sidetrack>
Greg Ewing wrote:
Yield-from would then no longer have the potential to create broken generators, we wouldn't have to treat GeneratorExit differently from any other exception, and Jacob could have his subgenerators that return values when you close them.
Only true because you have redefined it so that no generators are broken. If I understand you correctly, you are arguing that this change lets us throw GeneratorExit to the subiterator without trying to reraise it (my #2 from several mails back). That is clearly a plus in my book because it adheres to the inlining principle, but I don't think you need the loop in close for it to be better.
Nick Coghlan wrote:
I think I'd prefer to see some arbitrary limit (500 seems like a nice round number) on the number of times that GeneratorExit would be thrown before giving up and raising RuntimeError, just so truly broken generators that suppressed GeneratorExit in an infinite loop would eventually trigger an exception rather than just appearing to hang.
Right. The possibility of turning a call that used to raise a RuntimeError into an infinite loop bothers me a bit. I also don't really see the use for it. GeneratorExit is an unambiguous signal to close, so I would expect the generator to handle it by closing (possibly with a final return value), or by raising an exception. Not doing so *should* be an error. There have been requests for a function that loops over the generator and returns the final result, but this version of close doesn't fit that use case because it uses throw(GeneratorExit) instead of next().
The basic idea seems sound though (Jacob's averager example really was nicer than mine).
Thank you Nick. I am glad you think so. To summarize, I am only +0.75 on this proposal. I think it would be better not to loop, still return the final value from close, and still just throw GeneratorExit to subiterators without trying to reraise. Cheers - Jacob
Jacob Holm wrote:
I think it would be better not to loop, still return the final value from close, and still just throw GeneratorExit to subiterators without trying to reraise.
This sounds better to me too, except for the last part -- not reraising GeneratorExit.
If you re-define close to return the value attached to StopIteration, then I think that it makes sense to define it to continue to return this value on subsequent calls to close. This provides a way to still retrieve the returned value after the generator has been finalized in some other way. And then, wouldn't this allow you to discard the StopIteration in yield from and reraise GeneratorExit to finalize the outer generator; but leaving it the option to call close itself on the inner generator to retrieve the return value, if it still wants it? -bruce frederiksen
Bruce Frederiksen wrote:
Jacob Holm wrote:
I think it would be better not to loop, still return the final value from close, and still just throw GeneratorExit to subiterators without trying to reraise.
This sounds better to me too, except for the last part -- not reraising GeneratorExit.
If you re-define close to return the value attached to StopIteration, then I think that it makes sense to define it to continue to return this value on subsequent calls to close. This provides a way to still retrieve the returned value after the generator has been finalized in some other way.
If we want close to return the value multiple times we need to store it somewhere. If we are storing it, we might as well do it as a direct result of the return statement instead of pulling it out of the StopIteration. That way we could drop the idea of a GeneratorReturn exception and just always call close on the subiterator to get the value. If we are calling close anyway, we don't need to pass the GeneratorExit to the subiterator ourselves, as close will do it for us. But if we don't pass it to the subiterator, we need to (re)raise it in the yield from. That makes this a slight variation of my #4, which IIRC was a common preference between you and Nick (and probably Greg as well).
And then, wouldn't this allow you to discard the StopIteration in yield from and reraise GeneratorExit to finalize the outer generator; but leaving it the option to call close itself on the inner generator to retrieve the return value, if it still wants it?
Yes it would. This addresses the issue I had about not being able to retrieve the return value after yield-from throws GeneratorExit. At the moment, I can't find any other issues with this version. The only slight drawback is that it has to save the returned value, potentially keeping it alive longer than necessary. This is more than compensated for by making things simpler and allowing more uses, such as using the generator in a for-loop and accessing the return value afterwards. So +1 to this. - Jacob
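The for-loop use case mentioned above can be sketched in present-day Python 3, where a generator's return value travels on StopIteration and a `yield from` expression evaluates to it. This is only an illustration under that assumption; the names `ReturnCapture` and `countdown` are hypothetical and do not come from any proposal in this thread.

```python
class ReturnCapture:
    """Iterate over a generator and remember its return value."""

    def __init__(self, gen):
        self.gen = gen
        self.value = None

    def __iter__(self):
        # `yield from` passes every yielded item through and evaluates
        # to the generator's return value once it finishes.
        self.value = yield from self.gen


def countdown(n):
    while n > 0:
        yield n
        n -= 1
    return "liftoff"


capture = ReturnCapture(countdown(3))
seen = [item for item in capture]  # ordinary for-style iteration
```

After the loop, `capture.value` holds whatever the generator returned, which is the "store it somewhere" trade-off discussed above: the value stays alive as long as the wrapper does.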
Bruce Frederiksen wrote:
If you re-define close to return the value attached to StopIteration,
There may be a misconception here. I haven't been intending for close() to return the return value. That's not necessary to support Jacob's desire for the subgenerator to be able to return a value while the outer generator is being closed. That's because the subgenerator would *not* have its close() method called -- rather, GeneratorExit would be thrown into it. If it returned, this would manifest as a GeneratorReturn which could be caught and treated accordingly. I wouldn't necessarily be against having close() return the value from GeneratorReturn, but that's a separate issue to be decided independently. -- Greg
Jacob Holm wrote:
Well-behaved generators would not be affected, but there might be generators in real use that relied on the ability to ignore close or code using such generators that relied on getting the RuntimeError.
I think that's stretching things a bit. To my mind, any code that *relies* on getting a RuntimeError is just perverse -- sort of like saying that the code is only correct if it has a bug in it.
the "yield raise" expression. The purpose of this would be to raise an exception in the caller of "next", "send", "throw" or "close" *without* finalizing the generator. Extending my "averager" example a bit:
Sorry, but your example is now so convoluted that I can't follow it. I would never recommend that anyone write code like that.
Only true because you have redefined it so that no generators are broken.
Not quite -- a generator that gets into an infinite loop catching GeneratorExits would still be broken, you just wouldn't be told about it with an exception. But it's true that the class of non-broken generators would be considerably expanded. I would argue that the generators being included were unfairly classified as broken before, because they simply hadn't been given enough opportunity to finalize themselves.
The possibility of turning a call that used to raise a RuntimeError into an infinite loop bothers me a bit.
I still have trouble believing that this will be a serious problem in practice. I suspect it will occur quite rarely, and if it does occur, you debug it using the usual techniques for diagnosing an infinite loop. Hit Ctrl-C, examine the traceback, and do some poking around.
To summarize, I am only +0.75 on this proposal. I think it would be better not to loop, still return the final value from close, and still just throw GeneratorExit to subiterators without trying to reraise.
But we've established that this combination makes it very easy to create broken generators through no fault of your own. That's not acceptable to my mind. -- Greg
Jacob Holm wrote:
There has been requests for a function that loops over the generator and returns the final result, but this version of close doesn't fit that use case
That has nothing to do with closing behaviour. Such a function wouldn't close() the subgenerator, it would just keep calling next() until it finishes naturally and catch the GeneratorReturn. -- Greg
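A function of the kind described here might look like the sketch below. It assumes the semantics that Python 3 eventually adopted, where the return value arrives on StopIteration rather than on a separate GeneratorReturn exception; `run_to_completion` and `summer` are hypothetical names.

```python
def run_to_completion(gen):
    """Drive gen with next() until it finishes; return its return value."""
    while True:
        try:
            next(gen)
        except StopIteration as stop:
            return stop.value


def summer(items):
    total = 0
    for item in items:
        total += item
        yield total  # intermediate values are ignored by the driver
    return total


result = run_to_completion(summer([1, 2, 3]))
```

Note that nothing here calls close() or throws GeneratorExit: the generator simply finishes naturally.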
Nick Coghlan wrote:
I think I'd prefer to see some arbitrary limit (500 seems like a nice round number) on the number of times that GeneratorExit would be thrown before giving up
Is it really worth singling out this particular way of writing an infinite loop? If you're catching GeneratorExit then you presumably have the need to clean up and exit on your mind, so I don't think this is a likely mistake to make. -- Greg
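To make the two positions concrete, here is a rough sketch of the looping behaviour written as a helper function rather than as a change to close() itself, with the arbitrary limit suggested earlier. It is written for present-day Python 3; `persistent_close`, `slow_finisher` and `stubborn` are hypothetical names used only for illustration.

```python
def persistent_close(gen, limit=500):
    """Keep throwing GeneratorExit until gen finishes, up to limit times."""
    for _ in range(limit):
        try:
            gen.throw(GeneratorExit)
        except (GeneratorExit, StopIteration):
            return  # the generator finalized itself
    raise RuntimeError("generator kept yielding after GeneratorExit")


def slow_finisher(log):
    # Needs two GeneratorExits before it actually finishes.
    try:
        yield "working"
    except GeneratorExit:
        pass
    try:
        yield "still tidying up"
    except GeneratorExit:
        log.append("finished on second request")


def stubborn():
    # Never finishes, however often it is asked.
    while True:
        try:
            yield 42
        except GeneratorExit:
            pass


log = []
g = slow_finisher(log)
next(g)
persistent_close(g)

s = stubborn()
next(s)
try:
    persistent_close(s, limit=5)
except RuntimeError as exc:
    outcome = str(exc)
```

Without the limit, the second call would simply never return, which is the hang being debated.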
On Fri, 3 Apr 2009 11:18:25 am Greg Ewing wrote:
Nick Coghlan wrote:
I think I'd prefer to see some arbitrary limit (500 seems like a nice round number) on the number of times that GeneratorExit would be thrown before giving up
Is it really worth singling out this particular way of writing an infinite loop?
Perhaps I've missed something, but it seems to me that the right limit to use would be the recursion limit, and the right exception to raise would be RecursionError rather than RuntimeError. -- Steven D'Aprano
Steven D'Aprano wrote:
Perhaps I've missed something, but it seems to me that the right limit to use would be the recursion limit, and the right exception to raise would be RecursionError rather than RuntimeError.
I'm not sure about that. The kind of code needed to cause a problem would be something like

    def i_refuse_to_die():
        while 1:
            try:
                yield 42
            except GeneratorExit:
                pass

which looks more like a plain infinite loop than anything involving recursion, so I think getting a RecursionError would be more confusing than helpful. -- Greg
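For comparison, this is how the existing (non-looping) close() treats such a generator in present-day Python 3: it throws GeneratorExit once and raises RuntimeError if the generator yields again. A minimal sketch, with the `1` spelled `True` and the identifier written with underscores:

```python
def i_refuse_to_die():
    while True:
        try:
            yield 42
        except GeneratorExit:
            pass  # swallow the request and go round the loop again


gen = i_refuse_to_die()
next(gen)
try:
    gen.close()
except RuntimeError as exc:
    message = str(exc)  # describes the ignored GeneratorExit
```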
On 4/2/09, Steven D'Aprano <steve@pearwood.info> wrote:
On Fri, 3 Apr 2009 11:18:25 am Greg Ewing wrote:
Nick Coghlan wrote:
I think I'd prefer to see some arbitrary limit (500 seems like a nice round number) on the number of times that GeneratorExit would be thrown before giving up
Is it really worth singling out this particular way of writing an infinite loop?
generators are trickier, so I would say yes, except that ... Someone who is already intentionally catching and ignoring an Exception may not be in the mood to respond to subtle hints.
Perhaps I've missed something, but it seems to me that the right limit to use would be the recursion limit, and the right exception to raise would be RecursionError rather than RuntimeError.
The recursion limit is normally a way to prevent memory exhaustion. In this case, the stack doesn't grow; it is still a generic "while True: pass" that just happens to bounce between two frames instead of sticking to one. -jJ
Greg Ewing wrote:
Nick Coghlan wrote:
I think I'd prefer to see some arbitrary limit (500 seems like a nice round number) on the number of times that GeneratorExit would be thrown before giving up
Is it really worth singling out this particular way of writing an infinite loop?
If you're catching GeneratorExit then you presumably have the need to clean up and exit on your mind, so I don't think this is a likely mistake to make.
I came up with a different answer that I like better - don't mess with GeneratorExit and close() at all, and instead provide next_return(), send_return() and throw_return() methods that *expect* to get a GeneratorReturn exception in response (and complain if it doesn't happen). (I expand on this idea in a lot more detail in my reply to Jim) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
On Fri, Apr 3, 2009 at 6:18 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Greg Ewing wrote:
Nick Coghlan wrote:
I think I'd prefer to see some arbitrary limit (500 seems like a nice round number) on the number of times that GeneratorExit would be thrown before giving up
Is it really worth singling out this particular way of writing an infinite loop?
If you're catching GeneratorExit then you presumably have the need to clean up and exit on your mind, so I don't think this is a likely mistake to make.
I came up with a different answer that I like better - don't mess with GeneratorExit and close() at all, and instead provide next_return(), send_return() and throw_return() methods that *expect* to get a GeneratorReturn exception in response (and complain if it doesn't happen).
(I expand on this idea in a lot more detail in my reply to Jim)
Since there are so many threads, let me repeat that I'm -1 on adding more methods, but +1 on adding them to the docs as recipes. -- --Guido van Rossum (home page: http://www.python.org/~guido/)
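As a rough idea of what such a recipe could look like, here is a sketch of a send_return() helper in present-day Python 3. It assumes the return value arrives on StopIteration (the GeneratorReturn exception discussed in this thread is not used), and the `collector` generator is a hypothetical example.

```python
def send_return(gen, value):
    """Send value into gen and insist that gen finishes in response."""
    try:
        gen.send(value)
    except StopIteration as stop:
        return stop.value
    raise RuntimeError("generator yielded instead of returning")


def collector():
    items = []
    while True:
        item = yield
        if item is None:  # None is this sketch's "finish now" signal
            return items
        items.append(item)


c = collector()
next(c)  # start coroutine
c.send("a")
c.send("b")
collected = send_return(c, None)
```

next_return() and throw_return() would follow the same pattern with `next(gen)` and `gen.throw(...)` in place of `gen.send(value)`.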
Guido van Rossum wrote:
On Fri, Apr 3, 2009 at 6:18 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Greg Ewing wrote:
Nick Coghlan wrote:
I think I'd prefer to see some arbitrary limit (500 seems like a nice round number) on the number of times that GeneratorExit would be thrown before giving up Is it really worth singling out this particular way of writing an infinite loop?
If you're catching GeneratorExit then you presumably have the need to clean up and exit on your mind, so I don't think this is a likely mistake to make. I came up with a different answer that I like better - don't mess with GeneratorExit and close() at all, and instead provide next_return(), send_return() and throw_return() methods that *expect* to get a GeneratorReturn exception in response (and complain if it doesn't happen).
(I expand on this idea in a lot more detail in my reply to Jim)
Since there are so many threads, let me repeat that I'm -1 on adding more methods, but +1 on adding them to the docs as recipes.
If we end up adding a support library for coroutines (depending on how the discussion of the @coroutine problem goes), then that may be another place for them. Fattening the generator API even further bothered me a bit as well, so I'm actually happy to be overruled on that particular idea :) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
On Thu, Apr 2, 2009 at 4:28 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I've had another idea about this. Suppose the close() method of a generator didn't complain about reaching a yield after GeneratorExit is raised, but simply raised it again, and continued doing so until either a return occurred or an exception propagated out.
Seems to me this couldn't do any harm to a well-behaved generator, since it has to be prepared to deal with a GeneratorExit arising from any of its yield points.
The feature doesn't exist for the benefit of well-behaved generators. It exists to help people who don't understand generators well enough yet to only write well-behaved ones. This is an important goal to me -- generators are a complex enough topic that I prefer to be on the strict side rather than giving weird code a random meaning.
Yield-from would then no longer have the potential to create broken generators, we wouldn't have to treat GeneratorExit differently from any other exception, and Jacob could have his subgenerators that return values when you close them.
I need a longer description of the problems that you are trying to solve here -- I haven't been able to catch up with all the threads. How would yield-from create a broken generator? (As opposed to all the ways that allowing GeneratorExit to be ignored allows creating broken generators.) Is there an example shorter than a page that shows the usefulness of subgenerators returning values when closed? Please, please, please, we need to stop the bikeshedding and scope expansion, and start converging to a *simpler* proposal. -- --Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum wrote:
The feature doesn't exist for the benefit of well-behaved generators.
I know. I just mean that it won't break any existing correct generators, and should allow reasonably-written future generators to behave reasonably.
It exists to help people who don't understand generators well enough yet to only write well-behaved ones.
If you haven't delved into the details of generators, you won't know about GeneratorExit, so you won't be trying to catch it.
I need a longer description of the problems that you are trying to solve here
It's a bit subtle, but I'll try to recap. Suppose Fred writes the following generator:

    def fred():
        try:
            yield 1
            x = 17
        except GeneratorExit:
            x = 42
        print "x =", x

By current standards, this is a legitimate generator. Now, the refactoring principle suggests that it should be possible to rewrite it like this:

    def fred_outer():
        x = yield from fred_inner()
        print "x =", x

    def fred_inner():
        try:
            yield 1
            x = 17
        except GeneratorExit:
            x = 42
        return x

If we treat GeneratorExit just like any other exception and throw it into the subgenerator, this does in fact work.

Now for the problem: Suppose Mary comes along and wants to re-use Fred's inner generator. She writes this:

    def mary():
        y = yield from fred_inner()
        print "y =", y
        yield 2

If close() is called on Mary's generator while it's suspended inside the call to fred_inner(), a RuntimeError occurs, because the GeneratorExit got swallowed and Mary tried to do another yield. This is not reasonable behaviour, because Mary didn't do anything wrong. Neither did Fred do anything wrong when he wrote fred_inner() -- it's a perfectly well-behaved generator by current standards. But put the two together and a broken generator results.

One way to fix this is to place a small restriction on the refactoring principle: we state that you can't factor out a block of code that catches GeneratorExit and doesn't reraise it before exiting the block. This allows us to treat GeneratorExit as a special case, and always reraise it regardless of what the subiterator does. Mary's generator is then no longer broken. Fred's doesn't work any more, but he can't complain, because he performed an invalid refactoring.

My proposal for changing the way close() works is just an alternative way of tackling this problem that would remove the need for special-casing GeneratorExit either in the expansion or the statement of the refactoring principle, and allow generators such as Fred's above to work.
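For readers on a present-day Python 3, the Fred and Mary scenario can be run directly. The sketch below assumes the semantics that were eventually adopted for `yield from`: closing the delegating generator calls close() on the subiterator and then re-raises GeneratorExit in the delegating generator, i.e. the "special case" option described above. The `events` list and the try/finally in `mary` are additions made only so the outcome can be observed.

```python
events = []


def fred_inner():
    try:
        yield 1
        x = 17
    except GeneratorExit:
        events.append("inner saw GeneratorExit")
        x = 42
    return x


def mary():
    try:
        y = yield from fred_inner()
        events.append(("y", y))  # not reached when closed mid-delegation
        yield 2
    finally:
        events.append("mary finalized")


g = mary()
first = next(g)  # suspended inside fred_inner()
g.close()        # completes without RuntimeError
```

Under these semantics Mary's generator closes cleanly, and the value 42 returned by fred_inner() is discarded rather than assigned to `y`.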
Please, please, please, we need to stop the bikeshedding and scope expansion, and start converging to a *simpler* proposal.
I'm all in favour of simplicity, but it's not clear what is simpler here. There's a tradeoff between complexity in the yield-from expansion and complexity in the behaviour of close(). BTW, if you're after simplicity, I still think that using a different exception to return values from generators, and using a different syntax to do so, are both unnecessary complications. -- Greg
Jacob Holm wrote:
That might be the prevailing wisdom concerning GeneratorExit, at least partly based on the fact that the only way to communicate anything useful out of a closing generator is to raise another exception. Thinking a bit about coroutines, it would be nice to use "send" for the normal communication and "close" to shut it down and get a final result. Example:

    def averager():
        count = 0
        sum = 0
        while 1:
            try:
                val = (yield)
            except GeneratorExit:
                return sum/count
            else:
                sum += val
                count += 1

    avg = averager()
    avg.next()  # start coroutine
    avg.send(1.0)
    avg.send(2.0)
    print avg.close()  # prints 1.5
To do something similar today requires either a custom exception, or the use of special values to tell the generator to yield the result. I find this version a lot cleaner.
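The averager idea can be approximated in present-day Python 3 without relying on what close() itself returns, by throwing GeneratorExit by hand and reading the return value off StopIteration. This is a sketch under that assumption; `close_with_result` is a hypothetical helper, and `sum` is renamed `total` to avoid shadowing the builtin.

```python
def averager():
    count = 0
    total = 0.0
    while True:
        try:
            value = yield
        except GeneratorExit:
            return total / count  # final result on shutdown
        total += value
        count += 1


def close_with_result(gen):
    """Ask gen to finish and hand back whatever it returns."""
    try:
        gen.throw(GeneratorExit)
    except StopIteration as stop:
        return stop.value
    except GeneratorExit:
        return None  # finished without returning a value
    raise RuntimeError("generator ignored GeneratorExit")


avg = averager()
next(avg)  # start coroutine
avg.send(1.0)
avg.send(2.0)
average = close_with_result(avg)
```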
This doesn't seem any less clean than the above to me.

    def averager():
        sum = 0
        count = 0
        try:
            while 1:
                sum += yield
                count += 1
        finally:
            yield sum / count

    avg = averager()
    avg.next()
    avg.send(1.0)
    avg.send(2.0)
    print avg.next()  # prints 1.5
On Thu, Apr 2, 2009 at 5:35 PM, Ron Adam <rrr@ronadam.com> wrote:
Jacob Holm wrote:
That might be the prevailing wisdom concerning GeneratorExit, at least partly based on the fact that the only way to communicate anything useful out of a closing generator is to raise another exception. Thinking a bit about coroutines, it would be nice to use "send" for the normal communication and "close" to shut it down and get a final result. Example:

    def averager():
        count = 0
        sum = 0
        while 1:
            try:
                val = (yield)
            except GeneratorExit:
                return sum/count
            else:
                sum += val
                count += 1

    avg = averager()
    avg.next()  # start coroutine
    avg.send(1.0)
    avg.send(2.0)
    print avg.close()  # prints 1.5

To do something similar today requires either a custom exception, or the use of special values to tell the generator to yield the result. I find this version a lot cleaner.
This doesn't seem any less clean than the above to me.

    def averager():
        sum = 0
        count = 0
        try:
            while 1:
                sum += yield
                count += 1
        finally:
            yield sum / count

    avg = averager()
    avg.next()
    avg.send(1.0)
    avg.send(2.0)
    print avg.next()  # prints 1.5
But your version isn't clean -- it relies on "sum += yield" raising a TypeError when yield returns None (due to .next() being the same as .send(None)). That's not to say I like Jacob's version that much, but I now understand his use case. I note that Dave Beazley works around this carefully in his tutorial (dabeaz.com/coroutines/) by using examples that produce output on stdout -- and much later, in his multitasking schedule example, his trampoline actually interprets yielding a value that is neither a SystemCall instance nor a generator as a return from a generator. (This is similar to the abuse that your example is giving yield, actually.) I'll have to ponder this more. __________ PS. Somehow the headers in your email made my reply add this: Python-Ideas <public-python-ideas-+ZN9ApsXKcEdnm+yROfE0A@ciao.gmane.org>, Nick Coghlan <public-ncoghlan-Re5JQEeQqe8AvxtiuMwx3w@ciao.gmane.org> Whoever did that, and whatever they did to cause it, please don't do it again. -- --Guido van Rossum (home page: http://www.python.org/~guido/)
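The reliance on TypeError can be made visible by running the finally-based version in present-day Python 3. The sketch below is the same generator with `sum` renamed `total`; the two experiments after it are additions for illustration only.

```python
def averager():
    total = 0
    count = 0
    try:
        while True:
            total += yield  # next() sends None, so this raises TypeError
            count += 1
    finally:
        yield total / count


avg = averager()
next(avg)
avg.send(1.0)
avg.send(2.0)
result = next(avg)  # the TypeError is pending while the finally block yields

try:
    next(avg)  # the finally block completes and the TypeError escapes
except TypeError:
    pending_error_escaped = True
else:
    pending_error_escaped = False

other = averager()
next(other)
other.send(1.0)
try:
    other.close()  # GeneratorExit also lands in the finally block, which yields
except RuntimeError:
    close_failed = True
else:
    close_failed = False
```

So the result does come out as expected, but only via a suppressed TypeError, and an ordinary close() on the running coroutine is reported as an ignored GeneratorExit.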
Guido van Rossum wrote:
On Thu, Apr 2, 2009 at 5:35 PM, Ron Adam <rrr@ronadam.com> wrote:
Jacob Holm wrote:
That might be the prevailing wisdom concerning GeneratorExit, at least partly based on the fact that the only way to communicate anything useful out of a closing generator is to raise another exception. Thinking a bit about coroutines, it would be nice to use "send" for the normal communication and "close" to shut it down and get a final result. Example:

    def averager():
        count = 0
        sum = 0
        while 1:
            try:
                val = (yield)
            except GeneratorExit:
                return sum/count
            else:
                sum += val
                count += 1

    avg = averager()
    avg.next()  # start coroutine
    avg.send(1.0)
    avg.send(2.0)
    print avg.close()  # prints 1.5

To do something similar today requires either a custom exception, or the use of special values to tell the generator to yield the result. I find this version a lot cleaner.
This doesn't seem any less clean than the above to me.

    def averager():
        sum = 0
        count = 0
        try:
            while 1:
                sum += yield
                count += 1
        finally:
            yield sum / count

    avg = averager()
    avg.next()
    avg.send(1.0)
    avg.send(2.0)
    print avg.next()  # prints 1.5
But your version isn't clean -- it relies on "sum += yield" raising a TypeError when yield returns None (due to .next() being the same as .send(None)).
Something I noticed is that function and method calls use TypeError in cases where the argument count is mismatched. If there was a different exception for mismatched arguments and .next() sent no arguments, it could be rewritten in a somewhat cleaner way.

    def averager():
        sum = 0
        count = 0
        try:
            while 1:
                sum += yield
                count += 1
        except ArgumentCountError:
            yield sum / count

    avg = averager()
    avg.next()
    avg.send(1.0)
    avg.send(2.0)
    print avg.next()  # prints 1.5

It seems to me that a different exception for mismatched arguments might be useful in a more general way.
That's not to say I like Jacob's version that much, but I now understand his use case. I note that Dave Beazley works around this carefully in his tutorial (dabeaz.com/coroutines/) by using examples that produce output on stdout -- and much later, in his multitasking schedule example, his trampoline actually interprets yielding a value that is neither a SystemCall instance nor a generator as a return from a generator. (This is similar to the abuse that your example is giving yield, actually.) I'll have to ponder this more.
Yes, generators seem very limiting to me. Generators with more complex input and output requirements are bound to have added complexities to offset the limits of a single sequential i/o channel.
__________ PS. Somehow the headers in your email made my reply add this:
Python-Ideas <public-python-ideas-+ZN9ApsXKcEdnm+yROfE0A@ciao.gmane.org>, Nick Coghlan <public-ncoghlan-Re5JQEeQqe8AvxtiuMwx3w@ciao.gmane.org>
Whoever did that, and whatever they did to cause it, please don't do it again.
I'll manually remove the extra email addresses until I (or someone else) can explain why they do that. I hope that's enough for now. I read several python groups through gmane using Thunderbird on Ubuntu; Python-ideas is the only one where the email addresses are changed like that. (?) Ron
On Wed, Apr 1, 2009 at 4:28 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
You appear to be thinking of GeneratorExit as a way to ask a generator to finish normally such that it still makes sense to try to return a value after a GeneratorExit has been thrown in to the current frame, but that really isn't its role.
Instead, it's more of an "Abandon Ship! Abandon Ship! All hands to the lifeboats!" indication that gives the generator a chance to release any resources it might be holding and bail out. The reason that close() accepts a StopIteration as well as a GeneratorExit is that the former still indicates that the generator has finalised itself, so the objective of calling close() has been achieved and there is no need to report an error.
Any code that catches GeneratorExit without reraising it is highly suspect, just like code that suppresses SystemExit and KeyboardInterrupt.
Let's make that "without either returning from the generator without yielding any more values, raising StopIteration, or re-raising GeneratorExit." At least one example in PEP 342 catches GeneratorExit and returns. -- --Guido van Rossum (home page: http://www.python.org/~guido/)
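The two responses that remain straightforward in present-day Python 3 can be sketched as follows; both let close() complete quietly. The generator names and the `log` list are hypothetical. (The third option in the list, raising StopIteration explicitly inside the generator body, is not shown: in current Python 3 that is converted into a RuntimeError under PEP 479.)

```python
def returns_on_exit(log):
    try:
        yield
    except GeneratorExit:
        log.append("cleanup, then return")
        return  # finishing without another yield is fine


def reraises_on_exit(log):
    try:
        yield
    except GeneratorExit:
        log.append("cleanup, then re-raise")
        raise  # letting GeneratorExit propagate is fine too


log = []
for make in (returns_on_exit, reraises_on_exit):
    gen = make(log)
    next(gen)
    gen.close()  # neither generator triggers a RuntimeError
```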
Nick Coghlan wrote:
Any code that catches GeneratorExit without reraising it is highly suspect, just like code that suppresses SystemExit and KeyboardInterrupt.
As another perspective on this, I think Jacob's example is another case of bogus refactoring. If you think about it from the refactoring direction, you start with something that catches GeneratorExit, does some cleanup, and returns. That's fine. But then you try to chop out just the part that catches the GeneratorExit, without doing anything to ensure that the main generator still returns afterwards. This is analogous to taking a block of code containing a 'return' out of an ordinary function and putting it in another function. If that's all you do, you can't expect it to have the same result, because it only returns from the inner function, not the outer one. To correctly refactor Jacob's example, you need to maintain an 'except GeneratorExit' in the main generator somehow. Like 'return', it's not something you can freely move across a refactoring boundary. -- Greg
On Wed, Apr 1, 2009 at 2:30 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
As another perspective on this, I think Jacob's example is another case of bogus refactoring.
Be that as it may, I wonder if we shouldn't back off from the refactoring use case a bit and instead just ponder the different types of code you can write using generators. There's the traditional "pull" style (iterators), "push" style (like the averaging example), and then there are "tasks". (Have you read Dave Beazley's coroutines tutorial yet? Or am I the only one who likes it? :-) -- --Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum wrote:
I wonder if we shouldn't back off from the refactoring use case a bit and instead just ponder the different types of code you can write using generators. There's the traditional "pull" style (iterators), "push" style (like the averaging example), and then there are "tasks".
I'm not sure how to respond to that, because the whole issue at stake is whether a certain kind of refactoring should be considered legal. It's orthogonal to whether you're using push/pull/task style generators.
(Have you read Dave Beazley's coroutines tutorial yet? Or am I the only one who likes it? :-)
Yes, I've read it, and I quite like it too. As for where yield-from fits into it, mainly it would be in section 8, where it would eliminate the need for trampolining to handle calls/returns. It doesn't directly help with pipelines of coroutines, because you're processing the values at each step rather than just passing them through. But it would enable a single step of the pipeline to be spread over more than one function more easily (something he refrains from doing at that stage in the tutorial, because it would require the trampolining technique that he doesn't develop until later). -- Greg
Guido van Rossum wrote:
On Wed, Apr 1, 2009 at 2:30 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
As another perspective on this, I think Jacob's example is another case of bogus refactoring.
Be that as it may, I wonder if we shouldn't back off from the refactoring use case a bit and instead just ponder the different types of code you can write using generators. There's the traditional "pull" style (iterators), "push" style (like the averaging example), and then there are "tasks". (Have you read Dave Beazley's coroutines tutorial yet? Or am I the only one who likes it? :-)
I liked it so much that I posted the url to the Python list, where others gave it a positive response also. Unlike his previous intro talk, it was not something to breeze through in an evening, so I bookmarked it. Thanks for mentioning it. Terry
On Sun, Mar 29, 2009 at 10:45 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
The problem of how to handle GeneratorExit doesn't seem to have any entirely satisfactory solution.
On the one hand, the inlining principle requires that we never re-raise it if the subgenerator turns it into a StopIteration (or GeneratorReturn).
On the other hand, not re-raising it means that a broken generator can easily result from innocuously combining two things that are individually legitimate.
I think we just have to accept this, and state that refactoring only preserves semantics as long as the code block being factored out does not catch GeneratorExit without re-raising it. Then we're free to always re-raise GeneratorExit and prevent broken generators from occurring.
I'm inclined to think this situation is a symptom that the idea of being able to catch GeneratorExit at all is flawed. If generator finalization were implemented by means of a forced return, or something equally uncatchable, instead of an exception, we wouldn't have so much of a problem.
Earlier I said that I thought GeneratorExit was best regarded as an implementation detail of generators. I'd like to strengthen that statement and say that it should be considered a detail of the *present* implementation of generators, subject to change in future or alternate Pythons.
Related to that, I'm starting to come back to my original instinct that GeneratorExit should not be thrown into the subiterator at all. Rather, it should be taken as an indication that the delegating generator is being finalized, and the subiterator's close() method called if it has one. Then there's never any question about whether to re-raise it -- we should always do so.
This sounds fine -- though somehow I have a feeling nobody will really care either way, and when it causes a problem, it's going to cost an afternoon of debugging regardless. So do what's easiest to implement, we can always fix it later. BTW, I'd really like it if you (and others interested in PEP 380) read Dave Beazley's excellent coroutines tutorial (http://dabeaz.com/coroutines/), and commented on how yield-from can make his example code easier to write or faster. The tutorial comes with ample warnings about its mind-bending nature but I found it excellently written and very clear on the three different use cases for yield: iteration, receiving messages, and "traps" (cooperative multitasking). I cannot plug this enough. (Thanks Jeremy Hylton for mentioning it to me.) -- --Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum wrote:
BTW, I'd really like it if you (and others interested in PEP 380) read Dave Beazley's excellent coroutines tutorial (http://dabeaz.com/coroutines/), and commented on how yield-from can make his example code easier to write or faster.
The place where yield-from enters the picture would be in Part 8 ("The Problem with the Stack"), where it would eliminate the need for the scheduler to do trampolining of calls and returns. The idea of yield being like a system call is an interesting perspective. Imagine what it would be like if ordinary programs had to make system calls every time they wanted to call or return from a function! That's the situation we have now when using generators as coroutines, and it's the problem that yield-from addresses. -- Greg
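A very small sketch of that point, in present-day Python 3: with `yield from`, a task can call a sub-coroutine and receive its return value directly, so the scheduler only has to resume top-level tasks and needs no trampolining logic of its own. This is not code from the tutorial; `add`, `task` and `run` are hypothetical names, and the bare `yield` stands in for a "system call".

```python
from collections import deque


def add(x, y):
    yield  # give other tasks a turn (stand-in for a system call)
    return x + y


def task(name, log):
    # The call and return happen inside the generator machinery;
    # the scheduler never sees add() as a separate object.
    total = yield from add(1, 2)
    log.append((name, total))


def run(tasks):
    """Round-robin scheduler that only knows about top-level tasks."""
    queue = deque(tasks)
    while queue:
        current = queue.popleft()
        try:
            next(current)
        except StopIteration:
            continue  # task finished
        queue.append(current)


log = []
run([task("a", log), task("b", log)])
```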
I've just thought of another possible alternative name for yield-from: y = gcall f(x) -- Greg
On Wed, Apr 1, 2009 at 3:29 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I've just thought of another possible alternative name for yield-from:
y = gcall f(x)
Makes me think of a google hotline, or a typo of gcal. ±0 -- Cheers, Leif
Greg Ewing wrote:
I've just thought of another possible alternative name for yield-from:
y = gcall f(x)
However, you would lose the common mnemonic with yield for both turning the current function into a generator and indicating to the reader that the current frame may get suspended at a particular point. While using 'yield from' obscures the new 'generator calling' aspect of the new expression, it maintains the use of yield to indicate both "this is a generator" and "this frame may get suspended here for an arbitrarily long period of time". While "'yield from' is like calling a generator" may be a slightly odd spelling of the concept, it is at least still memorable and fairly easy to learn - while you aren't likely to guess all the details of what it does a priori, you're unlikely to forget what it does after you have learnt it the first time. That strikes me as a much better option than asking everyone to learn new rules for what can turn a normal function into a generator. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
On 4/1/09, Nick Coghlan <ncoghlan@gmail.com> wrote:
Greg Ewing wrote:
I've just thought of another possible alternative name for yield-from:
y = gcall f(x)
However, you would lose the common mnemonic with yield for both turning the current function into a generator and indicating to the reader that the current frame may get suspended at a particular point.
If the "gencall" exhausts the generator f (as "yield from" normally does), then the current frame shouldn't be suspended any more than it would be for an ordinary function call. If the "return value" of the generator really is important (and the intermediate values are discarded), then this probably is the right syntax. (Whether that case is worth syntax is a matter of taste, but it does seem to be a case Greg is trying to support.) -jJ
Jim Jewett wrote:
On 4/1/09, Nick Coghlan <ncoghlan@gmail.com> wrote:
Greg Ewing wrote:
I've just thought of another possible alternative name for yield-from:
y = gcall f(x)
However, you would lose the common mnemonic with yield for both turning the current function into a generator and indicating to the reader that the current frame may get suspended at a particular point.
If the "gencall" exhausts the generator f (as "yield from" normally does), then the current frame shouldn't be suspended any more than it would be for an ordinary function call. If the "return value" of the generator really is important (and the intermediate values are discarded), then this probably is the right syntax. (Whether that case is worth syntax is a matter of taste, but it does seem to be a case Greg is trying to support.)
The intermediate values aren't necessarily discarded by "yield from" though: they're passed out to whoever is consuming the values yielded by the outermost generator. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
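A small sketch of that distinction, assuming the semantics PEP 380 ended up with in Python 3.3 (the generator names are hypothetical): the values yielded by the subgenerator flow straight through to whoever iterates the outer generator, while the subgenerator's return value becomes the value of the yield from expression.

```python
def inner():
    yield "a"
    yield "b"
    return "inner result"   # becomes the value of `yield from inner()`

def outer():
    result = yield from inner()   # "a" and "b" pass through to our consumer
    yield "outer saw: " + result

values = list(outer())
```

Here the consumer of outer() sees "a" and "b" as well as the final string, so nothing yielded by inner() is discarded.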
On Wed, Apr 1, 2009 at 12:29 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I've just thought of another possible alternative name for yield-from:
 y = gcall f(x)
Nice April Fool. :-) -- --Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum wrote:
y = gcall f(x)
Nice April Fool. :-)
Actually, it wasn't meant to be -- it was a serious suggestion (it's not 1 April any more where I am). I suppose I'll have to post it again tomorrow before you'll believe that, though... -- Greg
On Wed, Apr 1, 2009 at 10:36 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Guido van Rossum wrote:
y = gcall f(x)
Nice April Fool. :-)
Actually, it wasn't meant to be -- it was a serious suggestion (it's not 1 April any more where I am).
I suppose I'll have to post it again tomorrow before you'll believe that, though...
Well, no matter what, I think it's a bad name. Let's stick with 'yield from'. I'm also returning to the view that return from a generator used as a task (in Dave Beazley's terms) should be spelled differently than a plain return. Phillip Eby's 'return from yield with' may be a little long, but how about 'return from'? -- --Guido van Rossum (home page: http://www.python.org/~guido/)
Bruce Frederiksen wrote:
Guido van Rossum wrote:
but how about 'return from'? or 'return finally'?(??) ...
Or what about "yield return"? That clearly marks the construct as belonging in a generator. It also mixes well with the idea of a "yield raise" that I mentioned in another mail (not a suggestion for this PEP). - Jacob
On Thu, Apr 2, 2009 at 11:37 AM, Jacob Holm <jh@improva.dk> wrote:
Bruce Frederiksen wrote:
Guido van Rossum wrote:
but how about 'return from'?
or 'return finally'?(??) ...
Or what about "yield return"? That clearly marks the construct as belonging in a generator. It also mixes well with the idea of a "yield raise" that I mentioned in another mail (not a suggestion for this PEP).
Not totally weird. After all Dave Beazley's trampoline uses "yield g()" to call a sub-generator and "yield x" to return a value x from a sub-generator to the calling generator via the trampoline's stack... -- --Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum wrote:
On Thu, Apr 2, 2009 at 11:37 AM, Jacob Holm <jh@improva.dk> wrote:
Bruce Frederiksen wrote:
Guido van Rossum wrote:
but how about 'return from'? or 'return finally'?(??) ...
Or what about "yield return"? That clearly marks the construct as belonging in a generator. It also mixes well with the idea of a "yield raise" that I mentioned in another mail (not a suggestion for this PEP).
Not totally weird. After all Dave Beazley's trampoline uses "yield g()" to call a sub-generator and "yield x" to return a value x from a sub-generator to the calling generator via the trampoline's stack...
Using 'yield return' rather than a bare return wouldn't get any objections from me. As has been said before, the current SyntaxError definitely makes it easier to learn some of the ins and outs of generators.
That would leave us with:
'yield': pass a value back to and receive a value from this generator's client
'yield from': pass control to a subgenerator and receive a value back from it
'yield return': finish this generator with GeneratorReturn
'return': finish this generator with StopIteration
I think that leaves us with one remaining question: should we save the return value on the generator iterator and make it available as the return value of the close() method?
My inclination is that finalising a generator in a way that allows the return value to be retrieved should be left out of the PEP for now, as it is something that can be:
a) easily done externally to the generator*
b) added to close() later if we decide it would be a good idea
In order to leave that avenue open in the future however, close() must be defined in the PEP to trap GeneratorReturn as well as StopIteration.
So +1 to having close() accept GeneratorReturn as a legitimate reaction to being thrown GeneratorExit, but -0 on saving the return value on the generator iterator object (at least in the initial version of the PEP)
Cheers, Nick.
* For example:
    def get_result_send(self, g, sentinel=None):
        # Using a sentinel to tell the generator to finish
        try:
            while 1:
                g.send(sentinel)
            return None
        except GeneratorReturn as gr:
            return gr.value

    def get_result_throw(self, g, sentinel=GeneratorExit):
        # Using GeneratorExit to tell the generator to finish
        try:
            while 1:
                try:
                    g.throw(sentinel)
                except sentinel:
                    break
            return None
        except GeneratorReturn as gr:
            return gr.value
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
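The GeneratorReturn exception discussed here never shipped. In the design released with Python 3.3, a generator's return value travels on StopIteration.value instead. Under that assumption, the sentinel-based helper above could be sketched roughly as follows. The averager generator is a hypothetical example, and like the original helper this loops forever if the generator never reacts to the sentinel.

```python
def averager():
    """Accumulate sent numbers; finish and return their mean when sent None."""
    total = 0.0
    count = 0
    while True:
        item = yield
        if item is None:      # sentinel: stop accumulating
            break
        total += item
        count += 1
    return total / count if count else None

def get_result_send(g, sentinel=None):
    # Keep sending the sentinel until the generator finishes,
    # then pull the return value off the StopIteration instance.
    try:
        while True:
            g.send(sentinel)
    except StopIteration as stop:
        return stop.value

g = averager()
next(g)          # prime the generator so it is waiting at the first yield
g.send(10)
g.send(20)
result = get_result_send(g)
```

This retrieves the return value entirely from outside the generator, which is the "easily done externally" option, without close() having to return anything.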
Nick Coghlan wrote:
Using 'yield return' rather than a bare return wouldn't get any objections from me. As has been said before, the current SyntaxError definitely makes it easier to learn some of the ins and outs of generators.
That would leave us with:
'yield': pass a value back to and receive a value from this generator's client 'yield from': pass control to a subgenerator and receive a value back from it 'yield return': finish this generator with GeneratorReturn 'return': finish this generator with StopIteration
FWIW, I still don't see the need for a GeneratorReturn exception. I don't understand why it should be an error to ignore the return value, or to loop over a generator that returns a value. I assume it makes sense to someone since it is being discussed, so perhaps one of you someones would care to explain it?
I think that leaves us with one remaining question: should we save the return value on the generator iterator and make it available as the return value of the close() method?
I think so, yes. It makes a big difference to some of the examples I have shown.
My inclination is that finalising a generator in a way that allows the return value to be retrieved should be left out of the PEP for now, as it is something that can be: a) easily done externally to the generator*
Yes, you can hack your way around most limitations. In this case if you need the feature it makes quite a big difference to both the calling and the called code.
b) added to close() later if we decide it would be a good idea
That is true, but I think the semantics of "yield-from" becomes more coherent if we do it now. Alternatively, we could drop the "yield return" idea from the proposal and make "yield from" a statement. I would hate to see it go because coupled with returning the value from close it has some really nice uses, but that would be the other way I see to make the proposal coherent. Having "yield return" without returning the value from close just feels wrong.
In order to leave that avenue open in the future however, close() must be defined in the PEP to trap GeneratorReturn as well as StopIteration.
But if we do that without storing the value and returning it on the next close, you cannot use "yield return" as a response to GeneratorExit in a subiterator without losing the returned value. (This of course depends on how we end up handling GeneratorExit and close in the yield-from expression). Instead you will need to manually raise a different exception in the subiterator. And if you do that, the resulting generator can no longer be closed *without* some wrapper to catch the exception.
So +1 to having close() accept GeneratorReturn as a legitimate reaction to being thrown GeneratorExit, but -0 on saving the return value on the generator iterator object (at least in the initial version of the PEP)
And I am +1/+1 on this, although I would rather see the "yield return" statement just storing the value directly on the generator, raising a normal StopIteration, and not using a GeneratorReturn exception at all. - Jacob
On Thu, Apr 2, 2009 at 4:14 PM, Jacob Holm <jh@improva.dk> wrote:
Nick Coghlan wrote:
Using 'yield return' rather than a bare return wouldn't get any objections from me. As has been said before, the current SyntaxError definitely makes it easier to learn some of the ins and outs of generators.
That would leave us with:
'yield': pass a value back to and receive a value from this generator's client 'yield from': pass control to a subgenerator and receive a value back from it 'yield return': finish this generator with GeneratorReturn 'return': finish this generator with StopIteration
FWIW, I still don't see the need for a GeneratorReturn exception. I don't understand why it should be an error to ignore the return value, or to loop over a generator that returns a value. I assume it makes sense to someone since it is being discussed, so perhaps one of you someones would care to explain it?
I've explained this more than once in some of the many yield-from threads, but since I am myself asking for a summary of previous threads I'll explain it again. Generators are a mind-bendingly complex issue and it's easy for someone who is just starting to write a generator for the first time to get a detail or two wrong. We intentionally decided to make "return <value>" invalid syntax in a generator to help those people. Surely it would have been easier to code if we just ignored the value. But we went the extra mile to help people negotiate the steep learning curve. I don't want to lose this.
I think that leaves us with one remaining question: should we save the return value on the generator iterator and make it available as the return value of the close() method?
I think so, yes. It makes a big difference to some of the examples I have shown.
I must have missed that.
My inclination is that finalising a generator in a way that allows the return value to be retrieved should be left out of the PEP for now, as it is something that can be: a) easily done externally to the generator*
Yes, you can hack your way around most limitations. In this case if you need the feature it makes quite a big difference to both the calling and the called code.
b) added to close() later if we decide it would be a good idea
That is true, but I think the semantics of "yield-from" becomes more coherent if we do it now. Alternatively, we could drop the "yield return" idea from the proposal and make "yield from" a statement. I would hate to see it go because coupled with returning the value from close it has some really nice uses, but that would be the other way I see to make the proposal coherent. Having "yield return" without returning the value from close just feels wrong.
In order to leave that avenue open in the future however, close() must be defined in the PEP to trap GeneratorReturn as well as StopIteration.
But if we do that without storing the value and returning it on the next close, you cannot use "yield return" as a response to GeneratorExit in a subiterator without losing the returned value. (This of course depends on how we end up handling GeneratorExit and close in the yield-from expression). Instead you will need to manually raise a different exception in the subiterator. And if you do that, the resulting generator can no longer be closed *without* some wrapper to catch the exception.
So +1 to having close() accept GeneratorReturn as a legitimate reaction to being thrown GeneratorExit, but -0 on saving the return value on the generator iterator object (at least in the initial version of the PEP)
And I am +1/+1 on this, although I would rather see the "yield return" statement just storing the value directly on the generator, raising a normal StopIteration, and not using a GeneratorReturn exception at all.
-- --Guido van Rossum (home page: http://www.python.org/~guido/)
Jacob Holm wrote:
Bruce Frederiksen wrote:
Guido van Rossum wrote:
but how about 'return from'? or 'return finally'?(??) ...
Or what about "yield return"? That clearly marks the construct as belonging in a generator. It also mixes well with the idea of a "yield raise" that I mentioned in another mail (not a suggestion for this PEP).
Another strange one: 'close with X'.
This hinges on the 'close' method returning X and also that this could be done syntactically *without* making 'close' a reserved word by relying on 'with' already being a reserved word with very limited usage in the grammar:
 generator_return_stmt: NAME 'with' testlist
And then verify later that NAME is 'close' (or raise SyntaxError).
I'm not that familiar with Python's parser to know if it could handle this or not (LL vs. LR)... -bruce frederiksen
On Thu, Apr 2, 2009 at 12:35 PM, Bruce Frederiksen <dangyogi@gmail.com> wrote:
Another strange one: 'close with X'.
This hinges on the 'close' method returning X and also that this could be done syntactically *without* making 'close' a reserved word by relying on 'with' already being a reserved word with very limited usage in the grammar:
 generator_return_stmt: NAME 'with' testlist
And then verify later that NAME is 'close' (or raise SyntaxError).
I'm not that familiar with Python's parser to know if it could handle this or not (LL vs. LR)...
The current parser generator cannot deal with this -- when it sees a NAME at the start of the line it has to decide which non-terminal to pick to parse the rest of the line. Besides (before you put effort into trying to fix this or prove me wrong) this syntax looks too weird -- we don't normally refer to leaving a stack frame as "closing" anything. Close is a verb we apply to other things, e.g. files -- or generators. But not the current frame or generator. -- --Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum wrote:
I'm also returning to the view that return from a generator used as a task (in Dave Beazley's terms) should be spelled differently than a plain return.
Oh, no... I thought you wanted to avoid a bikeshed discussion on that...
Phillip Eby's 'return from yield with' may be a little long, but how about 'return from'?
That makes no sense. The thing after the 'from' isn't what you're returning from! -- Greg
Jacob Holm wrote:
I currently don't think that a special case for GeneratorExit is needed. Can you give me an example showing that it is?
Someone said something that made me think it was needed, but I think you're right, it shouldn't be there. -- Greg
Nick Coghlan wrote:
That is, I now believe the 'normal' case for 'yield from' should be modelled on basic iteration, which means no implicit finalisation.
Now, keep in mind that in parallel with this I am now saying that *all* exceptions, *including GeneratorExit* should be passed down to the subiterator if it has a throw() method.
But those two things are contradictory. In a refcounting Python implementation, dropping the last reference to the delegating generator will cause it to close() itself, thus throwing a GeneratorExit into the subiterator. If other references to the subiterator still exist, this means it gets prematurely finalized.
With an expansion of that form, you can easily make arbitrary iterators (including generators) shareable by wrapping them in an iterator with no throw or send methods:
But if you need explicit wrappers to prevent finalization, then you hardly have "no implicit finalization". So I'm a bit confused about what behaviour you're really asking for. -- Greg
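The premature-finalisation scenario can be sketched as follows, assuming the semantics that shipped in Python 3.3. An explicit close() stands in for dropping the last reference, since finalisation on deletion is only prompt in a refcounting implementation. The names inner, outer and events are hypothetical.

```python
events = []

def inner():
    try:
        yield 1
        yield 2
    finally:
        events.append("inner finalised")

def outer(sub):
    yield from sub

shared = inner()          # a second reference to the subiterator is kept here
g = outer(shared)
first = next(g)           # g is now suspended inside `yield from`
g.close()                 # GeneratorExit is passed down into `shared`
leftover = list(shared)   # nothing left: `shared` was finalised along with g
```

Even though shared is still referenced from outside, closing the delegating generator finalises it, which is exactly the tension between the two statements quoted above.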
Greg Ewing wrote:
But if you need explicit wrappers to prevent finalization, then you hardly have "no implicit finalization". So I'm a bit confused about what behaviour you're really asking for.
I should have said no *new* mechanism for implicit finalisation. Deletion of the outer generator would, as you say, still call close() and throw GeneratorExit in. I like it because the rules are simple: either an exception is thrown in and passed down to the subiterator (which may have the effect of finalising it), or else the subiterator is left alone (to be finalised either explicitly or implicitly when it is deleted). There's then no special case along the lines of "if GeneratorExit is passed in we just drop our reference to the subiterator instead of passing the exception down", or "if you iterate over a subiterator using 'yield from' instead of a for loop then the subiterator will automatically be closed at the end of the expression". No matter what you do with regards to finalisation, you're going to demand extra work from somebody. The simple rule means that subiterators will see all exceptions (even GeneratorExit), allowing them to handle their own finalisation needs, while shareable subiterators are also possible so long as they don't have throw() methods. The idea of a shareable iterator that *does* support send() or throw() just doesn't make any sense to me. Splitting up a data feed amongst multiple peer consumers, OK, that's fairly straightforward and I can easily imagine uses for it in a generator based coding style (e.g. having multiple clients pulling requests from a job queue). But having multiple peer writers attempting to feed values or exceptions back into that single iterator that can neither tell which writer a particular value or exception came from, nor direct results to particular consumers? That sounds like utter insanity. If you want to create a shareable iterator, preventing use of send() and throw() strikes me as a *very* good idea. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
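A sketch of the shareable wrapper idea, again assuming the Python 3.3 semantics, under which a subiterator is only touched on close() if it offers close() and only receives thrown exceptions if it offers throw(). The NextOnly class and the other names are hypothetical.

```python
events = []

class NextOnly:
    """Expose only the iterator protocol: no send(), throw() or close()."""
    def __init__(self, it):
        self._it = iter(it)
    def __iter__(self):
        return self
    def __next__(self):
        return next(self._it)

def numbers():
    try:
        for i in range(5):
            yield i
    finally:
        events.append("numbers finalised")

def consumer(sub):
    yield from sub

shared = numbers()
g = consumer(NextOnly(shared))
first = next(g)
g.close()                          # nothing is passed down: the wrapper has no close()/throw()
events_after_close = list(events)  # snapshot: `shared` has not been finalised
second = next(shared)              # the shared generator carries on where it left off
```

The wrapper gives up send() and throw() delegation entirely, which matches the argument that a shareable iterator should not support them anyway.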
Nick Coghlan wrote:
I like it because the rules are simple: either an exception is thrown in and passed down to the subiterator (which may have the effect of finalising it), or else the subiterator is left alone (to be finalised either explicitly or implicitly when it is deleted).
Okay, so you're in favour of accepting the risk of prematurely finalizing shared subiterators, on the grounds that it can be prevented using a wrapper in the rare cases where it matters. I can live with that, and in fact it's more or less where my most recent thinking has been leading me.
I like it because the rules are simple: either an exception is thrown in and passed down to the subiterator (which may have the effect of finalising it), or else the subiterator is left alone (to be finalised either explicitly or implicitly when it is deleted).
We might still want one special case. If GeneratorExit is thrown and the subiterator has no throw() or the GeneratorExit propagates back out of the throw(), I think an attempt should be made to close() it. Otherwise, explicitly closing the delegating generator wouldn't be guaranteed to finalize the subiterator unless it had a throw() method, whereas one would expect having close() to be sufficient for this. -- Greg
Greg Ewing wrote:
We might still want one special case. If GeneratorExit is thrown and the subiterator has no throw() or the GeneratorExit propagates back out of the throw(), I think an attempt should be made to close() it. Otherwise, explicitly closing the delegating generator wouldn't be guaranteed to finalize the subiterator unless it had a throw() method, whereas one would expect having close() to be sufficient for this.
I'm not so sure about that - we don't do it for normal iteration, so why would we do it for the new expression? However, I've been pondering the shareable iterator case a bit more, and in trying to come up with even a toy example, I couldn't think of anything that wouldn't be better handled just by actually *iterating* over the shared iterator with a for loop. Since the main advantage that the new expression has over simple iteration is delegating send() and throw() correctly, and I'm suggesting that shared iterators and those two methods don't mix, perhaps this whole issue can be set aside? Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
Nick Coghlan wrote:
I'm not so sure about that - we don't do it for normal iteration, so why would we do it for the new expression?
Because of the inlining principle. If you inline a subgenerator, the result is just a single generator, and closing it finalizes the whole thing.
Since the main advantage that the new expression has over simple iteration is delegating send() and throw() correctly, and I'm suggesting that shared iterators and those two methods don't mix, perhaps this whole issue can be set aside?
Sounds good to me. -- Greg
Greg Ewing wrote:
Nick Coghlan wrote:
I'm not so sure about that - we don't do it for normal iteration, so why would we do it for the new expression?
Because of the inlining principle. If you inline a subgenerator, the result is just a single generator, and closing it finalizes the whole thing.
That makes perfect sense to me as a justification for treating GeneratorExit the same as any other exception (i.e. delegating it to the subgenerator). It doesn't lead me to think that the semantics ever need to involve calling close(). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
Nick Coghlan wrote:
That makes perfect sense to me as a justification for treating GeneratorExit the same as any other exception (i.e. delegating it to the subgenerator). It doesn't lead me to think that the semantics ever need to involve calling close().
I'm also treating close() and throw(GeneratorExit) on the delegating generator as equivalent for finalization purposes. So if throw(GeneratorExit) doesn't fall back to close() on the subiterator, closing the delegating generator won't finalize the subiterator unless it pretends to be a generator by implementing throw(). Since the inlining principle strictly only applies to subgenerators, it doesn't *require* this behaviour, but to my mind it strongly suggests it. -- Greg
Greg Ewing wrote:
Nick Coghlan wrote:
That makes perfect sense to me as a justification for treating GeneratorExit the same as any other exception (i.e. delegating it to the subgenerator). It doesn't lead me to think that the semantics ever need to involve calling close().
I'm also treating close() and throw(GeneratorExit) on the delegating generator as equivalent for finalization purposes. So if throw(GeneratorExit) doesn't fall back to close() on the subiterator, closing the delegating generator won't finalize the subiterator unless it pretends to be a generator by implementing throw().
Since the inlining principle strictly only applies to subgenerators, it doesn't *require* this behaviour, but to my mind it strongly suggests it.
I believe I already said this at some point, but after realising that shareable subiterators are almost always going to be better handled by iterating over them rather than delegating to them, I'm actually not too worried one way or the other. While I do still have a slight preference for limiting the methods involved in generator delegation to just next(), send() and throw(), I won't object strenuously to accepting close() as an alternative spelling of throw(exc) that will always reraise the passed in exception. As you say, it does make it easier to write a non-generator delegation target, since implementing close() for finalisation means not having to deal with the vagaries of correctly reraising exceptions. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
Nick Coghlan wrote:
As you say, it does make it easier to write a non-generator delegation target, since implementing close() for finalisation means not having to deal with the vagaries of correctly reraising exceptions.
It also means that existing things with a close method, such as files, can be used without change. Having a close method is a fairly well-established way to make an iterator explicitly finalizable, whereas having a throw method isn't. -- Greg
Greg Ewing wrote:
Nick Coghlan wrote:
As you say, it does make it easier to write a non-generator delegation target, since implementing close() for finalisation means not having to deal with the vagaries of correctly reraising exceptions.
It also means that existing things with a close method, such as files, can be used without change.
Having a close method is a fairly well-established way to make an iterator explicitly finalizable, whereas having a throw method isn't.
But then we're back to the point that if someone *wants* deterministic finalisation, then that's why the with statement exists.
The part that isn't clicking for me is that I still don't understand *why* 'yield from' should include implicit finalisation as part of its definition. The full delegation of next(), send() and throw() I get completely (since that's the whole point of the new expression). The fact that that *also* ends up delegating the close() method of generators in particular also makes sense (as it's a natural consequence of delegating the first three methods). It's the generalisation of that to all other iterators that happen to offer a close() method that seems somewhat arbitrary. Other than the fact that generators happen to provide a close() method that invokes throw(), it appears to have nothing to do with generator delegation and hence seems like a fairly random addition to the PEP.
Using a file as the subiterator is an interesting case in point (and perhaps an interesting exploration as to when a shareable subiterator may make sense: if a subiterator offers separate reading and writing APIs, then those can be exposed as separate generators):
    class YieldingFile:
        # Mixing reads and writes with this strawman
        # version would be a rather bad idea :)
        EOF = object()
        def __init__(self, f):
            self.f = f
        def read_all(self):
            self.f.seek(0)
            yield from self.f
        def append_lines(self):
            self.f.seek(0, 2)
            lines_written = 0
            while 1:
                line = yield
                if line == self.EOF:
                    break
                self.f.write(line)
                lines_written += 1
            return lines_written
The problem I see with the above is that with the current specification in the PEP, the read_all() implementation is outright broken rather than merely redundant (it is obviously wasteful, since it could just return self.f instead of yielding from it - but it is far from clear that it should be broken rather than just pointlessly slow).
The first use of read_all() will implicitly close the file when it is finished - that seems totally nonobvious to me. It strikes me as simpler all round to leave the deterministic finalisation to the tool that was designed for the task, and let the new expression focus solely on correct delegation to subgenerators without worrying too much about other iterators. Sure, there are plenty of ways to avoid the implicit finalisation if you want to, but I'm still not convinced the "oh, you don't support throw() so I will fall back to close() instead" fallback behaviour is a particularly good idea. (It isn't a dealbreaker for me though - I still support the PEP overall, even though I'm -0 on this particular aspect of it). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
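For comparison, here is how a file-like subiterator behaves under the semantics that eventually shipped in Python 3.3, which may differ from the draft being discussed at this point in the thread. An in-memory io.StringIO stands in for a real file, and read_all is a hypothetical free function version of the method above.

```python
import io

def read_all(f):
    f.seek(0)
    yield from f

f = io.StringIO("one\ntwo\n")

# Exhausting the delegating generator normally leaves the file open.
lines = list(read_all(f))
closed_after_exhaustion = f.closed

# Closing the delegating generator while it is suspended in `yield from`
# calls close() on the subiterator, even though it is not a generator.
g = read_all(f)
first = next(g)
g.close()
closed_after_close = f.closed
```

So in the released design the file is not closed merely because read_all() ran to completion, but it is closed if the delegating generator is itself closed part-way through.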
Nick Coghlan wrote:
The part that isn't clicking for me is that I still don't understand *why* 'yield from' should include implicit finalisation as part of its definition.
It's the generalisation of that to all other iterators that happen to offer a close() method that seems somewhat arbitrary.
It's a matter of opinion. I would find it surprising if generators behaved differently from all other iterators in this respect. It would be un-ducktypish. I think we need a BDFL opinion to settle this one. -- Greg
Greg Ewing wrote:
Nick Coghlan wrote:
The part that isn't clicking for me is that I still don't understand *why* 'yield from' should include implicit finalisation as part of its definition.
It's the generalisation of that to all other iterators that happen to offer a close() method that seems somewhat arbitrary.
It's a matter of opinion. I would find it surprising if generators behaved differently from all other iterators in this respect. It would be un-ducktypish.
I think we need a BDFL opinion to settle this one.
It's still your PEP, so unless Guido objects to your preference, I'll cope - I suspect either approach can be explained easily enough in the documentation. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
On Fri, Mar 27, 2009 at 6:58 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Nick Coghlan wrote:
The part that isn't clicking for me is that I still don't understand *why* 'yield from' should include implicit finalisation as part of its definition.
It's the generalisation of that to all other iterators that happen to offer a close() method that seems somewhat arbitrary.
It's a matter of opinion. I would find it surprising if generators behaved differently from all other iterators in this respect. It would be un-ducktypish.
I think we need a BDFL opinion to settle this one.
To be honest, I don't follow this in detail yet, but I believe I don't really care that much either way, and I'd like to recommend that you do whatever makes the specification (and hence hopefully the implementation) have the least special cases. There are several Python Zen rules about this. :-) -- --Guido van Rossum (home page: http://www.python.org/~guido/)
Greg Ewing wrote:
Nick Coghlan wrote:
I'm not so sure about that - we don't do it for normal iteration, so why would we do it for the new expression?
Because of the inlining principle. If you inline a subgenerator, the result is just a single generator, and closing it finalizes the whole thing.
Since the main advantage that the new expression has over simple iteration is delegating send() and throw() correctly, and I'm suggesting that shared iterators and those two methods don't mix, perhaps this whole issue can be set aside?
Sounds good to me.
Just a thought... If the subgenerator does not interact with the generator it is in after it is started, then wouldn't it be as if it replaces the calling generator for the life of the sub generator? So instead of in-lining, can it be thought of more like switching-to another generator? Ron
On Mon, Mar 23, 2009 at 4:24 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
We have a decision to make. It appears we can have *one* of the following, but not both:
(1) In non-refcounting implementations, subiterators are finalized promptly when the delegating generator is explicitly closed.
(2) Subiterators are not prematurely finalized when other references to them exist.
Since in the majority of intended use cases the subiterator won't be shared, (1) seems like the more important guarantee to uphold. Does anyone disagree with that?
Guido, what do you think?
Gee, I'm actually glad I waited a while, because the following discussion shows that this is a really hairy issue... I think (1) means propagating GeneratorExit into the subgenerator (and recursively if that's also waiting in a yield-from), while (2) would mean not propagating it, right? I agree that (1) seems to make more sense unless you can think of a use case for (2) -- and it seems from Nick's last post that such a use case would have to be rather horrendously outrageous... -- --Guido van Rossum (home page: http://www.python.org/~guido/)
participants (10)
- Bruce Frederiksen
- Greg Ewing
- Guido van Rossum
- Jacob Holm
- Jim Jewett
- Leif Walsh
- Nick Coghlan
- Ron Adam
- Steven D'Aprano
- Terry Reedy