PEP 479: Change StopIteration handling inside generators

PEP: 479
Title: Change StopIteration handling inside generators
Version: $Revision$
Last-Modified: $Date$
Author: Chris Angelico <rosuav@gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 15-Nov-2014
Python-Version: 3.5
Post-History: 15-Nov-2014

Abstract
========

This PEP proposes a semantic change to ``StopIteration`` when raised
inside a generator, unifying the behaviour of list comprehensions and
generator expressions somewhat.

Rationale
=========

The interaction of generators and ``StopIteration`` is currently
somewhat surprising, and can conceal obscure bugs.  An unexpected
exception should not result in subtly altered behaviour, but should
cause a noisy and easily-debugged traceback.  Currently,
``StopIteration`` can be absorbed by the generator construct.

Proposal
========

If a ``StopIteration`` is about to bubble out of a generator frame, it
is replaced with some other exception (maybe ``RuntimeError``, maybe a
new custom ``Exception`` subclass, but *not* deriving from
``StopIteration``) which causes the ``next()`` call (which invoked the
generator) to fail, passing that exception out.  From then on it's
just like any old exception. [3]_

Consequences to existing code
=============================

This change will affect existing code that depends on
``StopIteration`` bubbling up.  The pure Python reference
implementation of ``groupby`` [2]_ currently has comments "Exit on
``StopIteration``" where it is expected that the exception will
propagate and then be handled.  This will be unusual, but not unknown,
and such constructs will fail.

(Nick Coghlan comments: """If you wanted to factor out a helper
function that terminated the generator you'd have to do "return
yield from helper()" rather than just "helper()".""")

As this can break code, it is proposed to utilize the ``__future__``
mechanism to introduce this, finally making it standard in Python 3.6
or 3.7.

Alternate proposals
===================

Supplying a specific exception to raise on return
--------------------------------------------------

Nick Coghlan suggested a means of providing a specific
``StopIteration`` instance to the generator; if any other instance of
``StopIteration`` is raised, it is an error, but if that particular
one is raised, the generator has properly completed.

Making return-triggered StopIterations obvious
----------------------------------------------

For certain situations, a simpler and fully backward-compatible
solution may be sufficient: when a generator returns, instead of
raising ``StopIteration``, it raises a specific subclass of
``StopIteration`` which can then be detected.  If it is not that
subclass, it is an escaping exception rather than a return statement.

Criticism
=========

Unofficial and apocryphal statistics suggest that this is seldom, if
ever, a problem. [4]_  Code does exist which relies on the current
behaviour, and there is the concern that this would be unnecessary
code churn to achieve little or no gain.

References
==========

.. [1] Initial mailing list comment
   (https://mail.python.org/pipermail/python-ideas/2014-November/029906.html)

.. [2] Pure Python implementation of groupby
   (https://docs.python.org/3/library/itertools.html#itertools.groupby)

.. [3] Proposal by GvR
   (https://mail.python.org/pipermail/python-ideas/2014-November/029953.html)

.. [4] Response by Steven D'Aprano
   (https://mail.python.org/pipermail/python-ideas/2014-November/029994.html)

Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:
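To make the Rationale concrete: under the current semantics, a StopIteration raised anywhere inside a generator body silently ends iteration instead of producing a traceback. A minimal sketch (the names are placeholders for illustration):

    def unguarded():
        # Imagine a next() call somewhere in here raising StopIteration.
        raise StopIteration

    def gen():
        yield 1
        unguarded()   # a bug, but it looks exactly like a clean return
        yield 2

    print(list(gen()))   # -> [1]; the second yield is silently never reached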

On 15 November 2014 19:29, Chris Angelico <rosuav@gmail.com> wrote:
Thanks for the write-up! Proposal
[snip]
I think you can skip mentioning this particular idea in the PEP - I didn't like it even when I posted it, and both of Guido's ideas are much better :)
There's an additional subtlety with this idea: if we add a new GeneratorReturn exception as a subclass of StopIteration, then generator iterators would likely also have to change to replace GeneratorReturn with a regular StopIteration (chaining appropriately via __cause__, and copying the return value across).
With such a change, we would actually likely modify the following code in contextlib._GeneratorContextManager.__exit__:

    try:
        self.gen.throw(exc_type, value, traceback)
        raise RuntimeError("generator didn't stop after throw()")
    except StopIteration as exc:
        # Generator suppressed the exception
        # unless it's a StopIteration instance we threw in
        return exc is not value
    except:
        if sys.exc_info()[1] is not value:
            raise

to be the slightly more self-explanatory:

    try:
        self.gen.throw(exc_type, value, traceback)
        raise RuntimeError("generator didn't stop after throw()")
    except GeneratorReturn:
        # Generator suppressed the exception
        return True
    except:
        if sys.exc_info()[1] is not value:
            raise

The current proposal in the PEP actually doesn't let us simplify this contextlib code, but rather means we would have to make it more complicated to impedance-match generator semantics with the context management protocol. To handle that change, we'd have to make the code something like the following (for clarity, I've assumed a new RuntimeError subclass, rather than RuntimeError itself):

    try:
        self.gen.throw(exc_type, value, traceback)
        raise RuntimeError("generator didn't stop after throw()")
    except StopIteration as exc:
        # Could become "return True" once the __future__ becomes the default
        return exc is not value
    except UnexpectedStopIteration as exc:
        if exc.__cause__ is not value:
            raise
    except:
        if sys.exc_info()[1] is not value:
            raise

I definitely see value in adding a GeneratorReturn subclass to be able to tell the "returned" vs "raised StopIteration" cases apart from outside the generator (the current dance in contextlib only works because we have existing knowledge of the exact exception that was thrown in). I'm substantially less convinced of the benefit of changing generators to no longer suppress StopIteration. Yes, it's currently a rather odd corner case, but changing it *will* break code (at the very least, for anyone using an old version of contextlib2, or anyone otherwise relying on their own copy of contextlib rather than the standard library one).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Sun, Nov 16, 2014 at 1:13 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Doesn't hurt to have some rejected alternates there :)
Would it have to do so automatically, meaning this is no simpler than the current proposal? Or would it always have to be explicitly written to handle it?
This is why it's proposed to use __future__ to protect it. If anyone's still using an old version of contextlib2 once 3.7 comes along, it'll break; but is there any reason to use Python 3.7 with a contextlib from elsewhere than its standard library? (I'm not familiar with contextlib2 or what it offers.) ChrisA

On 16 November 2014 00:37, Chris Angelico <rosuav@gmail.com> wrote:
When GeneratorReturn escaped a generator frame, the interpreter would automatically convert it into an ordinary StopIteration instance. It's still simpler because it won't need the __future__ dance (as it doesn't involve any backwards incompatible changes).
Using __future__ still imposes a large cost on the community - docs need updating, code that relies on the existing behaviour has to be changed, developers need to adjust their mental models of how the language works. There needs to be a practical payoff for those costs - and at the moment, it's looking like we can actually get a reasonably large fraction of the gain without most of the pain by instead pursuing Guido's idea of a separate StopIteration subclass to distinguish returning from the outermost generator frame from raising StopIteration elsewhere in the generator.
Same reason folks use it now: consistent behaviour and features across a range of Python versions. However, that's not the key point - the key point is that working through the exact changes that would need to be made in contextlib persuaded me that I was wrong when I concluded that contextlib wouldn't be negatively affected. It's not much more complicated, but if we can find a fully supported example like that in the standard library, what other things might folks be doing with generators that *don't* fall into the category of "overly clever code that we don't mind breaking"?
(I'm not familiar with contextlib2 or what it offers.)
contextlib2 ~= 3.3-era contextlib that runs as far back as 2.6 (I initially created it as a proving ground for the idea that eventually became contextlib.ExitStack).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Sun, Nov 16, 2014 at 2:21 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Okay, let me see if I have this straight. When a 'return' statement (including an implicit one at end-of-function) is encountered in any function which contains a 'yield' statement, it is implemented as "raise GeneratorReturn(value)" rather than as "raise StopIteration(value)", which is the current behaviour. However, if any GeneratorReturn would be raised in any way other than the 'return' statement, it would magically become a StopIteration instead. Is that correct?

This does sound simpler. All the magic is in the boundary of the generator itself, nothing more. If a __next__ method raises either StopIteration or GeneratorReturn, or if any other function raises them, there's no special handling.

Question: How does it "become" StopIteration? Is a new instance of StopIteration formed which copies in the other's ``value``? Is the type of this exception magically altered? Or is it a brand new exception with the __cause__ or __context__ set to carry the original?
Fair enough. The breakage is a known problem, though; whatever's done is likely to cause at least some issues. If the alternate you describe above will break less (or almost none), then it'll be the best option.
Thanks, I figured it'd be like that. Since contextlib exists in 2.7, is contextlib2 meant to be legacy support only? ChrisA

On 16 November 2014 01:56, Chris Angelico <rosuav@gmail.com> wrote:
That's not quite how generators work. While the "returning from a generator is equivalent to raise StopIteration" model is close enough that it's functionally equivalent to the actual behaviour in most cases (with the main difference being in how try/except blocks and context managers inside the generator react), this particular PEP is a situation where it's important to have a clear picture of the underlying details.

When you have a generator iterator (the thing you get back when calling a generator function), there are two key components:

* the generator iterator object itself
* the generator frame where the code is running

When you call next(gi), you're invoking the __next__ method on the *generator iterator*. It's that method which restarts evaluation of the generator frame at the point where it last left off, and interprets any results. Now, there are three things that can happen as a result of that frame evaluation:

1. It hits a yield point. In that case, gi.__next__ returns the yielded value.
2. It can return from the frame. In that case, gi.__next__ creates a *new* StopIteration instance (with an appropriate return value set) and raises it.
3. It can throw an exception. In that case, gi.__next__ just allows it to propagate out (including if it's StopIteration).

The following example illustrates the difference between cases 2 and 3 (in both cases, there's a StopIteration that terminates the hidden loop inside the list() call; the difference is in where that StopIteration is raised):
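A minimal sketch consistent with that description:

    def gen_return():
        yield 1
        return               # case 2: __next__ raises a fresh StopIteration

    def gen_raise():
        yield 1
        raise StopIteration  # case 3: the exception propagates out of __next__

    print(list(gen_return()))  # -> [1]
    print(list(gen_raise()))   # -> [1]; indistinguishable from the outside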
(The possible outcomes of gi.send() and gi.throw() are the same as those of next(gi); gi.throw() has the novel variant where the exception thrown in may propagate back out.)

The two change proposals being discussed are as follows:

* Current PEP (backwards incompatible): change outcome 3 to convert StopIteration to RuntimeError (or a new exception type). Nothing else changes.

* Alternative (backwards compatible): change outcome 2 to raise GeneratorReturn instead of StopIteration, and outcome 3 to convert GeneratorReturn to StopIteration.

The alternative *doesn't* do anything about the odd discrepancy between comprehensions and generator expressions that started the previous thread. It just adds a new capability where code that knows it's specifically dealing with a generator (like contextlib or asyncio) can more easily tell the difference between outcomes 2 and 3.
All the magic is actually at the generator boundary regardless. The key differences between the two proposals are the decisions to keep StopIteration as a common parent exception, and to allow it to continue propagating out of generator frames unmodified.
I'd suggest using the exception chaining machinery and creating a new exception with __cause__ and the generator return value set appropriately.
contextlib has actually been around since 2.5, but some features (most notably ExitStack) weren't added until much later. Like unittest2, contextlib2 allows access to newer stdlib features on older versions (I haven't used it as a testing ground for new ideas since ExitStack). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Sun, Nov 16, 2014 at 3:51 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Thank you for explaining. -- Cameron

In case others were also oversimplifying in their heads, I've summarized the above into the PEP.
Should that variant affect this proposal? What should happen if you throw StopIteration or GeneratorReturn into a generator?
Text along these lines added to PEP, thanks!
Makes sense. If the __cause__ is noticed at all (ie this doesn't just quietly stop a loop), it wants to be very noisy.
If there is breakage from this, it would simply mean "older versions of contextlib2 are not compatible with Python 3.7, please upgrade your contextlib2" - several of the variants make it perfectly possible to write cross-version-compatible code. I would hope that this remains the case.

Latest version of PEP text incorporating the above changes:
https://raw.githubusercontent.com/Rosuav/GenStopIter/master/pep-0479.txt

(My apologies if this email has gone through more than once. I'm having major issues with my internet connection at the moment, and delivery is failing and being retried. Hopefully it really *is* failing, and not just saying so.)

ChrisA

Since this changes the behavior of an object instance, how can __future__ help? If the generator definition is in a library but the code that raises StopIteration to terminate it is passed in from the user's code, how is the user supposed to select the behavior they want? (This sounds to me like a similar problem to adding 'from __future__ import py3_string' to Py2, which we discussed a while ago. Happy to be shown that it isn't.)

Cheers,
Steve

Exactly! The current behavior is not only likely undesirable, but it is also undocumented. Even if parts of the stdlib rely on the current behavior, there's no need for a deprecation (read: __future__) period. Undocumented features may change at any time, because they are mostly implementation quirks. (Isn't that rule documented somewhere in the Python docs?)

In short:

-1 deprecation (__future__); no need, because nothing documented gets broken
+1 fix it now (3.5); the fix may be a change in the docs to validate the current behavior, and deprecate it (Yuk!)
+1 Nick's design, which kind of leaves it the same and kind of fixes it

p.s. What about 2.7? This fix is *not* a new feature.

Cheers,
-- Juanca

On Sat Nov 15 2014 at 1:50:06 PM Steve Dower <Steve.Dower@microsoft.com> wrote:

On Sun, Nov 16, 2014 at 8:00 AM, Juancarlo Añez <apalala@gmail.com> wrote:
I'm not sure about that. As Steven said, the current behaviour is simple:

1) When 'yield' is reached, a value is yielded.
2) When 'return' is reached, StopIteration is raised.
3) When an exception is raised, it is permitted to bubble up.

Whether that is *correct* or not is the point of this PEP, but it is at least simple; and while it may not be documented per se, changing it is likely to break code.
Maybe not, but let's get the proposal settled before figuring out how much deprecation period is needed.
That would be pretty much what happens if the PEP is rejected: the current behaviour will be effectively validated (at least to the extent of "it's not worth the breakage").
p.s. What about 2.7? This fix is *not* a new feature.
That ultimately depends on the release manager, but I would not aim this at 2.7. Nick's proposal introduces a new exception type, which I think cuts this out of 2.7 consideration right there; both active proposals involve distinct changes to behaviour. I believe both of them require *at a minimum* a feature release, and quite probably a deprecation period (although that part may be arguable, as mentioned above). ChrisA

On Sun, Nov 16, 2014 at 5:20 AM, Steve Dower <Steve.Dower@microsoft.com> wrote:
The behaviour selection would have to be based on the generator's definition. This proposal, in all its variants, is about what happens as the generator terminates; if you call on someone else's generator, and that someone hasn't applied the __future__ directive, you'll be in the current situation of not being able to distinguish 'return' from 'raise StopIteration'. But for your own generators, you can guarantee that they're distinct. ChrisA

On Mon, Nov 17, 2014 at 11:05:01AM +1300, Greg Ewing wrote:
I don't see how that is different from any other __future__ directive. They are all per-module, and if you gain access to an object from another module, it will behave as specified in the module that created it, not the module that imported it. How is this different? -- Steven

On Mon, Nov 17, 2014 at 11:03 AM, Steven D'Aprano <steve@pearwood.info> wrote:
Well, let's see. For feature in sorted(__future__.all_feature_names):

    absolute_import: Affects implementation of a keyword
    barry_as_FLUFL: Not entirely sure what this one actually accomplishes. :)
    division: Changes the meaning of one operator.
    generators: Introduces a keyword
    nested_scopes: Alters the compilation of source to byte-code(?)
    print_function: Removes a keyword
    unicode_literals: Alters the type used for literals
    with_statement: Introduces a keyword

Apart from the joke, it seems that every __future__ directive is there to affect the compilation, not execution, of its module: that is, once a module has been compiled to .pyc, it shouldn't matter whether it used __future__ or not. Regardless of unicode_literals, you can create bytes literals with b'asdf' and unicode literals with u'asdf'. I'm not entirely sure about division (can you call on true-division without the future directive?), but in any case, it's all done at compilation time, as can be seen interactively:

    Python 2.7.3 (default, Mar 13 2014, 11:03:55)
    [GCC 4.7.2] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
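A sketch of the kind of session that would demonstrate this, under Python 2 semantics (operator.truediv also answers the true-division question, with no directive needed):

    >>> 1 / 2                             # classic division
    0
    >>> from operator import truediv
    >>> truediv(1, 2)                     # true division without the directive
    0.5
    >>> from __future__ import division   # affects statements compiled after it
    >>> 1 / 2
    0.5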
So to make this consistent with all other __future__ directives, there would need to be some kind of safe way to define this: perhaps an attribute on the generator object. Something like this:
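A sketch of what that might look like (the attribute name here is purely hypothetical; nothing specific was pinned down):

    def gen():
        yield 1
        return 2

    # Hypothetical marker: the __future__ directive would set this on every
    # generator function in the module, and code could also set it by hand.
    gen.__replaces_stopiteration__ = True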
The attribute on the function would be what affects behaviour; the __future__ directive applies that attribute to all generator functions in its module (including genexprs). Once the __future__ directive becomes automatic, the attribute can and will be dropped - any code which interrogates it MUST be prepared to stop interrogating it once the feature applies to all modules. Does that sound reasonable? Should it be added to the PEP? ChrisA

(I'm catching up on this thread from the end.) On Sun, Nov 16, 2014 at 5:29 PM, Chris Angelico <rosuav@gmail.com> wrote:
I agree with you and Steven that this is a fine use of __future__. What a generator does with a StopIteration that is about to bubble out of its frame is up to that generator. I don't think it needs to be a flag on the *function* though -- IMO it should be a flag on the code object. (And the flag should somehow be transferred to the stack frame when the function is executed, so the right action can be taken when an exception is about to bubble out of that frame.) One other small point: let's change the PEP to just propose RuntimeError, and move the "some other exception" to the "rejected ideas" section. -- --Guido van Rossum (python.org/~guido)

On Mon, Nov 17, 2014 at 12:58 PM, Guido van Rossum <guido@python.org> wrote:
Changes incorporated, thanks! I'm not familiar with the details of stack frame handling, so I've taken the cop-out approach and just quoted you directly into the PEP. PEP draft: https://raw.githubusercontent.com/Rosuav/GenStopIter/master/pep-0479.txt GitHub hosted repo, if you want to follow changes etc: https://github.com/Rosuav/GenStopIter ChrisA

On Mon, Nov 17, 2014 at 8:26 PM, Georg Brandl <g.brandl@gmx.net> wrote:
Thanks Georg! This means today's version is now visible here: http://legacy.python.org/dev/peps/pep-0479/ ChrisA

On Mon, Nov 17, 2014 at 2:27 AM, Chris Angelico <rosuav@gmail.com> wrote:
Off-topic: the new python.org site now supports PEPs, so please switch to URLs like this: https://www.python.org/dev/peps/pep-0479/ (if you don't like the formatting send a pull request to https://github.com/python/pythondotorg). -- --Guido van Rossum (python.org/~guido)

On Tue, Nov 18, 2014 at 4:28 AM, Guido van Rossum <guido@python.org> wrote:
Oh, nice. Google searches for PEPs still find legacy rather than that, so it may be worth setting legacy to redirect to www. Formatting looks fine, except that the subheadings look bolder than the main; I'll check for issues and post one, though probably not a PR. ChrisA

On 17/11/2014 2:29 p.m., Chris Angelico wrote:
barry_as_FLUFL: Not entirely sure what this one actually accomplishes. :)
It determines whether "not equal" is spelled "!=" or "<>", so it fits the pattern of being compile-time-only. -- Greg

On 17 November 2014 13:34, Chris Angelico <rosuav@gmail.com> wrote:
True division (in Python 2) is a nice simple one to look at, since it just swaps one bytecode for another (BINARY_DIVIDE -> BINARY_TRUE_DIVIDE)
The compiler actually stores a whole pile of useful info on code objects that doesn't show up in the disassembly output (switching to Python 3 for more up to date dis module goodness):
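A sketch of the kind of inspection being described (dis.show_code prints, among other details, a "Flags:" line that includes GENERATOR for generator functions):

    import dis, inspect

    def g():
        yield 1

    dis.show_code(g)
    print(bool(g.__code__.co_flags & inspect.CO_GENERATOR))   # -> True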
So conveying to the generator iterator whether or not "from __future__ import generator_return" was in effect would just be a matter of the compiler setting a new flag on the generator code object.

For *affected generators* (i.e. those defined in a module where the new future statement was in effect), StopIteration escaping would be considered a RuntimeError. For almost all code, such RuntimeErrors would look like any other RuntimeError raised by a broken generator implementation.

The only code which would *have* to change immediately as a "Porting to Python 3.5" requirement is code like that in contextlib, which throws StopIteration into generators and currently expects to get it back out unmodified. Such code will need to be updated to also handle RuntimeError instances where the direct cause is the StopIteration exception that was thrown in.

Other affected code (such as the "next() bubbling up" groupby example) would keep working unless the __future__ statement was in effect.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Mon, Nov 17, 2014 at 4:56 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Thanks, this is exactly what I was thinking of. The new flag could be named REPLACE_STOPITERATION. Then the __future__ import could be named replace_stopiteration_in_generators (it needs more description than the flag name because the flag is already defined in the context of a generator, while the __future__ import must still establish that context). -- --Guido van Rossum (python.org/~guido)

On Sun, Nov 16, 2014 at 11:49 AM, Rob Cliffe <rob.cliffe@btinternet.com> wrote:
Agreed. And agreed on the analysis; I can't add examples till I know for sure what I'm adding examples _of_. The latest edit expanded on the details of the proposals, so it now may be possible to consider examples, but possibly we're still bikeshedding the nature of the proposals themselves. Correction. We're DEFINITELY still bikeshedding etc etc, but possibly we're still doing so to the extent that it's not worth adding examples yet. :) ChrisA

On Sat, Nov 15, 2014 at 1:29 AM, Chris Angelico <rosuav@gmail.com> wrote:
Specifically it's absorbed by the caller of the generator, because the caller doesn't know the difference between next(x) raising StopIteration because the iterator specifically wants to stop, vs because of an accident.

As another alternative, how about a new iterator protocol that is defined without this ambiguity? Code at the bottom of my post to help explain: define a new method __nextx__ which doesn't use StopIteration for any signalling; instead, it returns None if there are no values to return, and returns a special value Some(v) if it wants to return a value v. Both next(it) and nextx(it) are made to work for any iterator that is defined using either protocol, but for loops and Python builtins all use nextx internally. Generators define __next__ unless you from __future__ import iterators, in which case they define __nextx__ instead.

In this way, old code can't tell the difference between accidental StopIteration and deliberate StopIteration, but new code (using nextx instead of next, and using __future__ import'd generators) can. No backwards incompatibility is introduced, and you can still insert StopIteration into a generator and get it back out -- using both next() where it is ambiguous and nextx() where it is not.

Yes, it's ugly to have two different iterator protocols, but not that ugly. In fact, this would be Python's third (I have omitted that third protocol in the below example, for the sake of clarity). I find the proposed solution more scary, in that it's sort of a "hack" to get around an old mistake, rather than a correction to that mistake, and it introduces complexity that can't be removed in principle. (Also, it's very unusual.)

    class Some:
        def __init__(self, value):
            self.value = value

    def next(it):
        v = nextx(it)
        if v is None:
            raise StopIteration
        return v.value

    def nextx(it):
        if hasattr(it, '__nextx__'):
            v = it.__nextx__()
            if v is None or isinstance(v, Some):
                return v
            raise TypeError("__nextx__ must return Some(...) or None, not %r" % (v,))
        if hasattr(it, '__next__'):
            try:
                return Some(it.__next__())
            except StopIteration:
                return None
        raise TypeError

-- Devin
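For concreteness, a usage sketch of the helpers defined above (g is a placeholder generator; it defines only __next__, so nextx() falls back to wrapping it):

    def g():
        yield 1

    gi = g()
    first = nextx(gi)
    print(first.value)    # -> 1
    print(nextx(gi))      # -> None: exhausted, no exception involved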

On Tue, Nov 18, 2014 at 12:50 AM, Devin Jeanpierre <jeanpierreda@gmail.com> wrote:
I had actually contemplated adding a "what if __next__ returned a sentinel instead of raising an exception" possibility to the PEP, if only for completeness. Since someone else has suggested it too now, it may be worth doing.

Rather than a wrapper around every returned value, what I'd be inclined toward is a special sentinel that otherwise cannot be returned. This could be a dedicated, arbitrary object, or something globally unique, or something locally unique. One option that comes to mind is to have the generator return _itself_ to signal that it's returned. I don't think this option will be better than the current front runners, but would you like me to add it for completeness?

The biggest downside is that it might give a false positive; you can't, for instance, have an iterator "all_objects()" which returns, like the name says, every object currently known to Python. (I don't know that CPython is capable of implementing that, but there's no reason another Python couldn't, and it might be useful.) I expect that's why the exception system was used instead; can anyone confirm that?

ChrisA

I don't want to contemplate a new __next__ protocol. The existing protocol was carefully designed and tuned to have minimal memory overhead (even to the point where the exception instance returned may be reused). Wrapping each result would just result in an extra allocation + deallocation per iteration, unless you play games with reference counts or do something else to complicate the semantics. Introducing __nextx__ would require thousands of libraries implementing this to incur churn as they feel the pressure to switch to the new protocol, and the compatibility issue would be felt everywhere. The problem we're trying to fix is unique to generators (thereby also implicating generator expressions).

On Mon, Nov 17, 2014 at 6:04 AM, Chris Angelico <rosuav@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido)

On Mon, Nov 17, 2014 at 9:40 AM, Guido van Rossum <guido@python.org> wrote:
This sounds totally reasonable to me.
The problem we're trying to fix is unique to generators (thereby also implicating generator expressions).
I suppose since you're only fixing generators, then that is literally the only problem you are trying to fix, but it is more general than that. I have encountered this sort of problem writing __next__ by hand in Python -- that is, that bugs inside code I call result in silent control flow changes rather than a visible exception. -- Devin

On Mon, Nov 17, 2014 at 9:53 AM, Devin Jeanpierre <jeanpierreda@gmail.com> wrote: [Guido]
I assume this is something where the __next__() method on your iterator class calls next() on some other iterator and accidentally doesn't catch the StopIteration coming out of it (or, more likely, this happens several calls deep, making it more interesting to debug). That particular problem is not unique to __next__ and StopIteration -- the same thing can (and does!) happen with __getitem__ and KeyError or IndexError, and with __getattr[ibute]__ and AttributeError. In all these cases I think there isn't much we can do apart from adding lint rules. If you are writing __next__ as a method on an iterator class, one way or another you are going to have to raise StopIteration when there isn't another element, and similarly __getitem__ has to raise KeyError or IndexError, etc. In the generator case, we have a better way to signal the end -- a return statement (or falling off the end). And that's why we can even contemplate doing something different when StopIteration is raised in the generator. -- --Guido van Rossum (python.org/~guido)

On Tue, Nov 18, 2014 at 4:53 AM, Devin Jeanpierre <jeanpierreda@gmail.com> wrote:
If you're writing __next__ by hand, there's nothing anyone else can do about a bubbled-up StopIteration. What you can do is wrap your code in try/except:

    def __next__(self):
        try:
            ...  # whatever: compute and return the next value here
        except StopIteration:
            raise RuntimeError
        raise StopIteration

If your "whatever" section returns a value, that's what the result will be. If it fails to return a value, StopIteration will be raised. And if StopIteration is raised, it'll become RuntimeError. But this has to be inside __next__. This can't be done externally. (Note, untested code. May have bugs.)

ChrisA

On Tue, Nov 18, 2014 at 4:40 AM, Guido van Rossum <guido@python.org> wrote:
Wrapping each result would just result in an extra allocation + deallocation per iteration...
Which is why I would be more inclined to use a sentinel of some sort... but that has its own problems. There's no perfect solution, so status quo wins unless a really compelling case can be made. I could toss something into the Alternate Proposals section, but I wouldn't be personally supporting it. ChrisA

On Mon, Nov 17, 2014 at 10:09 AM, Chris Angelico <rosuav@gmail.com> wrote:
Trust me, we went down this rabbit hole when we designed the iterator protocol. The problem is that that sentinel object must have a name (otherwise how would you know when you had seen the sentinel), which means that there is at least one dict (the namespace defining that name, probably the builtins module) that has the sentinel as one of its values, which means that iterating over that particular dict's values would see the sentinel as a legitimate value (and terminate prematurely). You could fix this by allocating a unique sentinel for every iteration, but there would be some additional overhead for that too (the caller and the iterator both need to hold on to the sentinel object). In any case, as I tried to say before, redesigning the iterator protocol is most definitely out of scope here (you could write a separate PEP and I'd reject it instantly). -- --Guido van Rossum (python.org/~guido)
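A minimal sketch of the failure mode described (DONE is a hypothetical module-level sentinel):

    DONE = object()                    # hypothetical "end of iteration" sentinel

    namespace = {"a": 1, "b": DONE, "c": 2}   # any dict may hold the sentinel

    def naive_loop(iterable):
        it = iter(iterable)
        while True:
            v = next(it, DONE)         # sentinel-based termination check
            if v is DONE:              # false positive when DONE is a real value
                break
            print(v)

    naive_loop(namespace.values())     # prints 1, then stops; 2 is silently lost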

On Tue, Nov 18, 2014 at 5:17 AM, Guido van Rossum <guido@python.org> wrote:
That's a much more solid argument against it than I had. (Also amusing to try to contemplate.) The iterator protocol absolutely demands a completely out-of-band means of signalling "I have nothing to return now".
That's a PEP I'll leave for someone else to write. :) ChrisA

On 11/17/2014 10:23 AM, Chris Angelico wrote:
On the other hand, if you did write it (now), and Guido rejected it (of course), then that would mean PEP 479 is sure to be accepted! Third time's the charm! ;)

--
~Ethan~

FWIW, I spent some time this morning close-reading the PEP, and made a somewhat significant set of updates -- adding more specifics about the proposed new __future__ statement and the new code object flag, tracking down a few more examples of code that would be affected, and some other minor edits.

Here's the diff: https://hg.python.org/peps/rev/8de949863677

Hopefully the new version will soon be here: https://www.python.org/dev/peps/pep-0479

Note that I am definitely not yet deciding on this PEP. I would love it if people sent in examples of code using generator expressions that would be affected by this change (either by highlighting a bug in the code or by breaking what currently works).

If this PEP gets rejected, we could resurrect the GeneratorReturn proposal currently listed as an alternative -- although the more I think about that the less I think it's worth it, except for the very specific case of asyncio (thinking of which, I should add something to the PEP about that too).

--
--Guido van Rossum (python.org/~guido)

On Mon, Nov 17, 2014 at 11:38:27AM -0800, Guido van Rossum wrote:
Over a week ago I raised this issue on the python-list mailing list. I expected a storm of bike-shedding, because that's the sort of place p-l is :-) but got just two people commenting. The thread, for anyone interested:

https://mail.python.org/pipermail/python-list/2014-November/680757.html

One response suggested that it is not generators which do the wrong thing, but comprehensions, and that comprehensions should be changed to behave like generators:

https://mail.python.org/pipermail/python-list/2014-November/680758.html

That should probably be put in the PEP; even if it is not an option being considered, it is at least evidence that "some people" find the behaviour of generators more natural than that of comprehensions.

--
Steve

Nick,

I think we've gone through enough clarifications of the PEP now to be clear on the proposal. I saw in one of your earliest replies (right after Chris posted his first draft) that you're hesitant to support the PEP because of what would have to change to contextlib. What I couldn't quite read is whether you think that the proposal by itself is not an improvement, or whether you're just worried about compatibility.

Apparently you know of a large group of users who use an older 3rd party version of contextlib, and for whom that older, 3rd party contextlib should keep working with future versions of Python 3 without updating their version of contextlib -- did I get that right? What exactly is the constraint there that makes their version of contextlib immutable even though the version of Python they are using may move forward?

Separate from this special case, I am also worried about backward compatibility, and I have yet to get a good idea for how widespread code is that depends on StopIteration bubbling out from generators. I also don't have a good idea how often this issue bites users, but I have a feeling it does bite. E.g. this quote from c.l.py (https://mail.python.org/pipermail/python-list/2014-November/680775.html):

"""
I did find it annoying occasionally that raising StopIteration inside a generator expression conveys a different behavior than elsewhere. It did take me quite a while to understand why that is so, but after that it did not cause me much of a headache anymore.
"""

--
--Guido van Rossum (python.org/~guido)

On 11/18/2014 05:50 AM, Guido van Rossum wrote:
I just remembered one use of the current behavior. Two years ago or so, I was suggesting on this list a possibility for early termination of comprehensions when a particular value is encountered. In other words, an equivalent to:

    l = []
    for x in seq:
        if x == y:
            break
        l.append(x)

At the time, somebody suggested (roughly):

    def stop():
        raise StopIteration

    l = list(x for x in seq if x != y or stop())

which, for the very reasons discussed in this thread, works only as a generator expression and not in comprehension form. I used this solution in some not particularly important piece of code so I wouldn't despair if it wasn't compatible with the next release of the language. Also, I have a feeling that some of you may consider this sort of a hack in the first place. Just thought I'd mention it here for completeness.

Wolfgang
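The genexp/comprehension discrepancy is easy to see directly; a sketch with placeholder values for seq and y, under the current (pre-PEP) semantics:

    def stop():
        raise StopIteration

    seq, y = [1, 2, 0, 3], 0

    print(list(x for x in seq if x != y or stop()))   # -> [1, 2]; absorbed
    try:
        [x for x in seq if x != y or stop()]
    except StopIteration:
        print("list comprehension: the exception escapes")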

On 11/18/2014 12:40 PM, Wolfgang Maier wrote:
I believe I thought then that one should write the explicit loop rather than overload the 'comprehension' concept.
If stop is defined in another file, such as 'utility', this is a bit nasty. A new maintainer comes along and changes that to a list comprehension, or perhaps decides a set rather than a list is needed and changes it to a set comprehension instead of a set() call, and bingo: a bug. Or someone imitates the pattern, but with [] instead of list.
which, for the very reasons discussed in this thread, works only as a generator expression and not in comprehension form.
With this example, where the StopIteration source could be much more obscure than next(), I now understand Guido's concern about hard-to-understand bugs. From a maintainability view, it should not matter if one calls a function on a naked comprehension (making it a genexp) or uses the corresponding comprehension syntax directly.
-- Terry Jan Reedy

On Wed, Nov 19, 2014 at 9:54 AM, Terry Reedy <tjreedy@udel.edu> wrote:
I'm not sure about that. Comprehensions can already be filtered; is it such a jump from there to a "filter" that aborts on a certain condition? It may not be language-supported, but I don't see that it's illogical; and any use of a loop that appends to a list is rightly considered code smell. ChrisA

On Wed, Nov 19, 2014 at 10:08:29AM +1100, Chris Angelico wrote:
It certainly isn't. It's an obvious extension to the concept: terminate the loop rather than filter it. At least two languages support early termination:

http://clojuredocs.org/clojure_core/clojure.core/for
http://docs.racket-lang.org/guide/for.html

and it keeps getting asked for:

http://www.reddit.com/r/Python/comments/ciec3/is_there_anything_like_a_list_...
http://stackoverflow.com/questions/5505891/using-while-in-list-comprehension...
http://stackoverflow.com/questions/16931214/short-circuiting-list-comprehens...
https://www.daniweb.com/software-development/python/threads/293381/break-a-l...
https://mail.python.org/pipermail/python-ideas/2014-February/026036.html
https://mail.python.org/pipermail/python-ideas/2013-January/018969.html

There's a rejected PEP:

https://www.python.org/dev/peps/pep-3142/

and alternative solutions (write an explicit generator function, use itertools.takewhile). So there's obviously a need for this sort of thing, and

    (expr for x in iterable if cond() or stop())

seems to be a common solution. I'm not sure if that's a neat trick or a deplorable hack :-) but either way this PEP will break code using it.
and any use of a loop that appends to a list is rightly considered code smell.
I'm afraid I don't understand that comment. Why is appending to a list inside a loop a code smell? That's exactly what list comps do. -- Steven
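Steven mentions itertools.takewhile above as one of the supported alternatives; a minimal sketch (seq and y are placeholders):

    from itertools import takewhile

    seq, y = [1, 2, 0, 3], 0

    # Take items until the first one equal to y; no StopIteration tricks needed.
    l = list(takewhile(lambda x: x != y, seq))
    print(l)   # -> [1, 2]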

On Wed, Nov 19, 2014 at 12:15 PM, Steven D'Aprano <steve@pearwood.info> wrote:
That's precisely why. If I write code like this:

    l = []
    for i in something:
        l.append(func(i))

then I should rework it into a comprehension. Having a filter doesn't change that:

    l = []
    for i in something:
        if i:
            l.append(func(i))

That's still possible with a list comp, and should be rewritten as one. But having a break in there *does* change it, because there's no way in the language to do that. The question is: Is it better to abuse StopIteration or to turn the list comp back into an explicit loop? And if anyone chose the former, their code will break.

ChrisA

On Tue, Nov 18, 2014 at 6:01 PM, Chris Angelico <rosuav@gmail.com> wrote:
Not everything you do with an explicit loop can be done with a comprehension, and that's by design. Comprehensions should be easier to reason about than code using for-loops. And generator expressions should work the same way, except for producing results in a lazy fashion. The StopIteration hack breaks this equivalence and hampers the ability to reason, since you can't tell whether a predicate might raise StopIteration. It was never my intention that generator expressions behaved this way -- it was an accidental feature that surprised me when it was first shown to me, and I've never gotten used to it. (And I don't care whether you say it is "obvious", call it "stop()", and only use it in an "idiomatic" fashion -- it's still a surprise for anyone who has to debug code involving it.) The only thing standing in the way of fixing this is the recognition that there may be a fair amount of code out there that depends on this hack, and which will have to be rewritten. -- --Guido van Rossum (python.org/~guido)

On Wed, Nov 19, 2014 at 1:56 PM, Guido van Rossum <guido@python.org> wrote:
Has anyone come across any more non-trivial examples? We have contextlib (in the standard library) and contextlib2 (third-party), plus a number of StackOverflow posts and such. Are there any other known pieces of code that would be seriously hampered by this change? ChrisA

On Tue, Nov 18, 2014 at 7:37 PM, Chris Angelico <rosuav@gmail.com> wrote:
One possible way to find out would be to write a simple version of a patch (maybe one that doesn't use __future__ but just always converts StopIteration to RuntimeError when it's bubbling out of a generator frame) and run the stdlib tests, then see how many tests this breaks. (I understand if you don't want to write it. But maybe someone does. :-)

--
--Guido van Rossum (python.org/~guido)

On Wed, Nov 19, 2014 at 3:22 PM, Guido van Rossum <guido@python.org> wrote:
I poked around a bit in the code and managed to come up with this. It doesn't chain the previous exception, so the traceback is a little scanty, but it does turn a StopIteration into a RuntimeError. (It might also leak the original StopIteration. I'm not sure.) Prior to this patch, I had 377 of 390 tests passing flawlessly and no failures (just skips and warnings); with this applied, six failures.

    diff -r 23ab1197df0b Objects/genobject.c
    --- a/Objects/genobject.c       Wed Nov 19 13:21:40 2014 +0200
    +++ b/Objects/genobject.c       Thu Nov 20 13:43:44 2014 +1100
    @@ -130,6 +130,14 @@
             }
             Py_CLEAR(result);
         }
    +    else if (!result)
    +    {
    +        if (PyErr_ExceptionMatches(PyExc_StopIteration))
    +        {
    +            PyErr_SetString(PyExc_RuntimeError,
    +                            "generator raised StopIteration");
    +        }
    +    }
         if (!result || f->f_stacktop == NULL) {
             /* generator can't be rerun, so release the frame */

However, I'm not sure about setting the context. In errors.c is a function _PyErr_ChainExceptions which appears to do a similar job, so I imitated its code. Here's the result:

    else if (!result)
    {
        if (PyErr_ExceptionMatches(PyExc_StopIteration))
        {
            PyObject *exc, *val, *val2, *tb;
            PyErr_Fetch(&exc, &val, &tb);
            PyErr_NormalizeException(&exc, &val, &tb);
            Py_DECREF(exc);
            Py_XDECREF(tb);
            PyErr_SetString(PyExc_RuntimeError,
                            "generator raised StopIteration");
            PyErr_Fetch(&exc, &val2, &tb);
            PyErr_NormalizeException(&exc, &val2, &tb);
            PyException_SetContext(val2, val);
            PyErr_Restore(exc, val2, tb);
        }
    }

The context is being set, but without a traceback.

    #############
    def good_gen():
        yield 1
        return 2

    def evil_gen():
        yield 1
        raise StopIteration(2)

    # In absence of PEP 479 changes, the above two should be
    # virtually indistinguishable.

    print("Starting.")
    good = tuple(good_gen())
    print("Good:", good, good == (1,))
    evil = tuple(evil_gen())
    print("Evil:", evil, evil == (1,))
    #############

    Starting.
    Good: (1,) True
    StopIteration: 2

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "../test_pep0479.py", line 14, in <module>
        evil = tuple(evil_gen())
    RuntimeError: generator raised StopIteration

What am I missing here? Do I need to force something to construct a full traceback before it can show the line number that actually raised StopIteration?

ChrisA

On Thu, Nov 20, 2014 at 2:44 PM, Chris Angelico <rosuav@gmail.com> wrote:
With the attached demo patch, all tests pass except test_generators, which explicitly tests stuff about the correlation between return and StopIteration. There's the contextlib changes, a couple of places that were raising StopIteration and should be returning, and a couple that were letting StopIteration bubble and now need to catch it and return. I've deliberately not followed PEP 8 here, in the interests of minimizing diff size; in several cases, blocks of code ought to be indented a level, but I cheated and created a half-indentation to show how little actually changes. If anyone would care to try this on their own codebases, that'd be helpful. ChrisA


On 11/17/2014 08:50 PM, Guido van Rossum wrote:
One argument for making the change*: When we're writing __next__, or __getattr__, etc., it is obvious that we are playing with internals and have to be extra careful of what other exceptions might be raised in that code. Contrariwise, the only indication of something special about a generator is the presence of the yield keyword -- for ordinary use (such as in for loops) it doesn't matter whether the called function returns a list, tuple, iterator, generator, or whatever, as long as it can be iterated over, and so when writing a generator, or converting an iterable-returning function into a generator, there's nothing obvious saying, "Hey! Watch out for a StopIteration somewhere else in this block of code!" * I make no statement as to how strong this argument is, but there you have it. :) -- ~Ethan~

On 18 November 2014 14:50, Guido van Rossum <guido@python.org> wrote:
I think it's an improvement - I really like the fact that it brings generators into line with your reasoning in the with statement PEP that flow control constructs should be locally visible. At the moment, "raise StopIteration" in a generator context is effectively non-local flow control, as it means any function call (explicit or implicit) or yield point may gracefully stop generator execution, rather than only return statements.

(In contextlib's case, the existing dance works by checking an escaping StopIteration against the instance you threw in. You can construct scenarios where such a check will give a false positive, but they're getting seriously contrived at that point.)

That's obscure enough that I think it's on par with other behavioural tweaks we've included in the "Porting to Python X.Y" guides in the past.
I don't even remember how that came up now, but it was entirely hypothetical, and I think it can be ignored as a concern. As you say, being able to update to Python 3.5 without also being able to update to a new version of contextlib2 would just be weird. Even if such a strange scenario did somehow come up, it would still be possible to switch to a conditional import where they used the stdlib version if available, and only fell back to contextlib2 on earlier versions of Python.
One advantage of needing a __future__ import on the generator author side is that you always have the option of changing your mind, and *never* making the new behaviour the default. That wouldn't be a *good* outcome, but I don't think it would be intolerable. OTOH, I'm also not sure the status quo is sufficiently problematic to be worth changing. Yes, it's a little weird, but is it *that* much weirder than the unavoidable issues with exceptions thrown in __next__, __getitem__, __getattr__ and other special methods where a particular type of exception is handled directly by the interpreter? Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Thu, Nov 20, 2014 at 3:03 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
If you write __next__, you write in a "raise StopIteration" when it's done. If you write __getattr__, you write in "raise AttributeError" if the attribute shouldn't exist. Those are sufficiently explicit that it should be reasonably clear that the exception is the key. But when you write a generator, you don't explicitly raise:

    def gen():
        yield 1
        yield 2
        yield 3
        return 4

The distinction in __next__ is between returning something and raising something. The distinction in a generator is between "yield" and "return". Why should a generator author have to be concerned about one particular exception having magical meaning?

Imagine this scenario:

    def producer():
        """Return user input, or raise KeyboardInterrupt"""
        return input("Enter the next string: ")

    def consumer():
        """Process the user's input"""
        while True:
            try:
                command = producer()
            except KeyboardInterrupt:
                break
            dispatch(command)

Okay, now let's make a mock producer:

    strings = ["do stuff", "do more stuff", "blah blah"]

    def mock_producer():
        if strings:
            return strings.pop(0)
        raise KeyboardInterrupt

That's how __next__ works, only with a different exception, and I think people would agree that this is NOT a good use of KeyboardInterrupt. If you put a few extra layers in between the producer and consumer, you'd be extremely surprised that an unexpected KeyboardInterrupt just quietly terminated a loop. Yet this is exactly what the generator-and-for-loop model creates: a situation in which StopIteration, despite not being seen at either end of the code, now has magical properties. Without the generator, *only* __next__ has this effect, and that's exactly where it's documented to be.

Does that make for more justification? Unexpected exceptions bubbling up is better than unexpected exceptions quietly terminating loops?

ChrisA

On 20 November 2014 02:24, Chris Angelico <rosuav@gmail.com> wrote:
Does that make for more justification? Unexpected exceptions bubbling up is better than unexpected exceptions quietly terminating loops?
The part I found most compelling was when you pointed out that in the special method implementations, the normal return path was always spelled with "return", while the "value missing" result was indicated with a special kind of exception (StopIteration, AttributeError, IndexError or KeyError), and then any other exception was considered unexpected.

Generators add the third notion of being able to suspend execution via "yield", which then left them with two different ways of spelling termination inside the frame: "return" OR "raise StopIteration". The second spelling ("raise StopIteration") is then inherently surprising, as it's entirely redundant, *except* in that it allows you to effectively have a "hidden return" in a generator frame that can't be done anywhere else.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Thu, Nov 20, 2014 at 02:45:27AM +1000, Nick Coghlan wrote:
I'm not sure that many people outside of this and the python-dev mailing lists would find the use of "raise StopIteration" surprising. Rather, I expect that they will find the use of an explicit "return" inside a generator surprising. People are informally taught that generators use yield *instead of* return, so seeing both in the same function is a surprise. (Most generators quietly fall out the bottom with no explicit end.) I don't claim that doing so is Pythonic or even good practice, but I am sure that there are a lot of people who believe that raising StopIteration to exit a generator is (1) supported and (2) preferred.

Examples of code in the wild using StopIteration to exit:

http://code.openhub.net/file?fid=ezlejSoT2q7PWrhgNkpdU55MWOA&cid=jVcYOxnQhvU&s=raise%20StopIteration&fp=301369&mp&projSelected=true#L0
http://code.openhub.net/file?fid=M0gWWCpn-avqHO_jnsYcG2T81lg&cid=VKn_M0_GgKM&s=raise%20StopIteration&fp=301283&mp&projSelected=true#L0
http://code.openhub.net/file?fid=pDrrTI8lyh0LO_6rTCk9npC96SE&cid=Y8jg8v1AyqU&s=raise%20StopIteration&fp=41191&mp&projSelected=true#L0
http://code.openhub.net/file?fid=PTjGrE_5rOhyZhL1CUrPBtRk7n8&cid=tWtPpAs4E1g&s=raise%20StopIteration&fp=210789&mp&projSelected=true#L0
http://code.openhub.net/file?fid=WzkucGktJhjsP8cj4BO6Wcnbx-0&cid=fsj7E8vdVMA&s=raise%20StopIteration&fp=401086&mp&projSelected=true#L0
http://stackoverflow.com/questions/6784934/python-yield-and-stopiteration-in...
http://stackoverflow.com/questions/14183803/in-pythons-generators-what-is-th...

That last example not only uses raise to exit the generator, but the author actually guesses that it is the more Pythonic way to do so. Here is a description of the generator protocol which could easily lead the reader to conclude that raising StopIteration is the correct way to exit a generator:

    To support this protocol, functions with yield statement are compiled
    specially as generators. They return a generator object when they are
    called. The returned object supports the iteration interface with an
    automatically created __next__() method to resume execution. Generator
    functions may have a return simply terminates the generation of values
    by raising a StopIteration exceptions after any normal function exit.

    http://www.bogotobogo.com/python/python_generators.php

At this point, I'm convinced that there is a good argument for a __future__ import changing this behaviour. But I suspect that making this the default behaviour in the future will break a lot of code.

-- Steven

On Thu, Nov 20, 2014 at 11:25 AM, Steven D'Aprano <steve@pearwood.info> wrote:
Interesting. But "yield instead of return" doesn't automatically say "and then raise StopIteration to early-abort"; I'd say the informal description is fine, it just needs to be modified differently once people actually want an early abort. ("You can still use 'return' for its other purpose, terminating a function before reaching the end.")
Examples of code in the wild using StopIteration to exit:
Trivially unnecessary, and as soon as there's a bug report, the "What's New In 3.7" page will explain that it needs to be removed.
I have no idea what this one is doing, but it looks like it's halfway to what's wanted here. Catch the exception and deal with it... this proposal just means the "deal with it" part needs to be reworded into a return statement. All it needs is for "What's New in 3.5" to recommend use of 'return' instead of 'raise StopIteration', and all these cases will be easily compatible with all [1] versions of Python.
http://stackoverflow.com/questions/6784934/python-yield-and-stopiteration-in...
The accepted answer correctly advises the function be written to simply return. This will work fine. The other answer has a local "raise StopIteration", which can be translated into a simple "return".
The question's author does, but the accepted answer recommends "return". This may result in the odd question here or there, but it's not a major problem. Any time a generator has "raise StopIteration" in its own body, it can simply become "return". That's easy. The issue comes up when it's not raising that itself, but is letting it bubble up - maybe from a next() call.

    def takewhiletrue(iter):
        while True:  # coincidental with the function name
            # try:
            val = next(iter)
            # except StopIteration: return
            if not val: break
            yield val

This won't come up in a simple search for "raise StopIteration", and if you have something like this where the condition is almost always going to be reached eventually, you might not notice the problem for a long time. How would you know to add the commented-out lines? What kind of code search would you use to detect this?
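A minimal, runnable sketch (not from the thread) of the failure mode being described: a StopIteration escaping from a next() call inside a generator body silently ends the consuming loop under pre-PEP-479 semantics:

    def take_three(it):
        for _ in range(3):
            yield next(it)   # if `it` runs dry here, StopIteration escapes

    source = iter([1, 2])            # shorter than the generator expects
    print(list(take_three(source))) # pre-PEP-479: [1, 2], silently truncated
                                    # under the PEP: RuntimeError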
Even that does recommend 'return'. If anyone reads that, writes "raise StopIteration", sees code bombing with RuntimeError, and then comes to python-list, we can explain that the recommended method is "return". I have no problem with this. There are plenty of much-worse practices that people pick up - mixing bytes and text, using backslashes in Windows path names without doubling them or using raw literals, etc, etc, etc. In lots of cases they'll seem to work ("C:\Program Files\New Stuff\Testing" will work, until you lower-case the name), but when they break, you just have to fix them. This wouldn't be the first Python minor version to tighten up requirements to remove bug magnets.
I suspect that a huge percentage of the code so broken can be trivially fixed just by search/replacing "raise StopIteration" with "return". There'll be only a very few where the StopIteration is raised from some other function and needs to be caught and dealt with - and fixing those is just as likely to reveal bugs needing fixing. ChrisA

On Thu, Nov 20, 2014 at 12:34:08PM +1100, Chris Angelico wrote:
On Thu, Nov 20, 2014 at 11:25 AM, Steven D'Aprano <steve@pearwood.info> wrote:
The point isn't that it is easy to fix. I'm sure that there will be cases of code that are not easy to fix. The point is that we're breaking working code and causing code churn. We're not fixing a bug. We're changing behaviour people rely on. That ought to make us more conservative about breaking their code.
This may result in the odd question here or there, but it's not a major problem.
And neither is the existing behaviour. We're weighing up whether the small benefit in fixing this wart is worth the pain. The PEP isn't approved yet, and right from the beginning Guido said that he feared that fixing this might be too disruptive. I'm trying to get a feel for how disruptive it will be. I did a quick and informal survey of the developers I work with. The dreaded lurgy has gone through our office, so most of them are away ill, but of those still here (all two of them) one of them couldn't remember whether you exit a generator with "yield nil" or "return nil" (he's a Ruby and Objective-C guy when we're not paying him to write Python) and the other one said that the whole problem is that generators exist in the first place, Python should get rid of them and allow people to define their own using macros (he likes to think of himself as a Lisp and Scheme guru :-) Make of that what you will. -- Steven

On Thu, Nov 20, 2014 at 4:30 PM, Steven D'Aprano <steve@pearwood.info> wrote:
There's a language that lets you define anything you like. It's called "file on disk". If you don't like how it runs, you just put a little shebang at the top and the whole rest of the file is interpreted differently... On one side of the balance is code breakage. On the other side is currently broken code where bugs will be found. Which is the stronger argument? I'm inclined toward the latter, but neither has a huge body of code to back it. ChrisA

On 11/19/2014 04:25 PM, Steven D'Aprano wrote:
We are not, however, responsible for third-party documentation.
Isn't that the case with every __future__ directive that becomes the standard? Folks have an entire minor release to make the adjustment. -- ~Ethan~

On Wed, Nov 19, 2014 at 08:19:41PM -0800, Ethan Furman wrote:
We are not, however, responsible for third-party documentation.
No, of course not, but we should be aware that:

* some people believe that raising StopIteration is an acceptable way to exit a generator; and
* doing so has worked fine since generators were introduced back in Python 2.2.

I wonder whether people who learned about generators back in the 2.2 days will have stronger opinions about raising StopIteration than more recent users? I remember learning that an explicit raise was the way to exit a generator, and sure enough the 2.2 What's New says this:

    Inside a generator function, the return statement can only be used
    without a value, and signals the end of the procession of values;
    afterwards the generator cannot return any further values. return with
    a value, such as return 5, is a syntax error inside a generator
    function. The end of the generator’s results can also be indicated by
    raising StopIteration manually, or by just letting the flow of
    execution fall off the bottom of the function.

    https://docs.python.org/3/whatsnew/2.2.html#pep-255-simple-generators

That's not to say that we can't change the behaviour, but neither can we say it is undocumented or blame third parties.
Huh, you say that like it's a long time :-) -- Steven

On 11/19/2014 11:24 AM, Chris Angelico wrote:
Which, as I said a week ago, is why there is no need for "raise StopIteration" in a generator function. The doc clearly states the limited intended use of StopIteration:

    exception StopIteration
        Raised by built-in function next() and an iterator's __next__()
        method to signal that there are no further items produced by the
        iterator.

StopIteration is exposed so it can be raised in user coded __next__() and caught when using explicit next(). If it was only used for builtins and for loops, it would not need to be visible.
Why should a generator author have to be concerned about one particular exception having magical meaning?
I am not sure of your intent with this rhetorical (?) question.
The prompt should be "Enter the next string or hit ^C to quit: ".
It is avoidable because the return type of producer is limited to strings. Therefore, producer could (and perhaps should) itself catch KeyboardInterrupt and return None, which is intended for such use. Consumer would then be simplified by replacing 3 lines with "if command is None: break".
-- Terry Jan Reedy
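Terry's None-sentinel suggestion, sketched out (a hypothetical reworking of Chris's producer/consumer example above; dispatch() is assumed from that example):

    def producer():
        """Return user input, or None if the user hit ^C."""
        try:
            return input("Enter the next string or hit ^C to quit: ")
        except KeyboardInterrupt:
            return None

    def consumer():
        """Process the user's input."""
        while True:
            command = producer()
            if command is None:
                break
            dispatch(command)

This works because producer's legitimate return type is limited to strings, leaving None free to act as the "no more input" signal.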

On Thu, Nov 20, 2014 at 9:46 AM, Terry Reedy <tjreedy@udel.edu> wrote:
Yes, rhetorical. Basically saying the same as you are: that StopIteration is a part of __next__, not generators.
Yeah, the point is about its interaction with the rest of the program, not the human.
Sure it does. But suppose it does some parsing on the string first, and that parsing might return literally any object. The structure of the program is the same, but now it really does need to signal "no more stuff" in some way other than return value. Just trying to concoct a situation similar to generators/for loops, using a different exception. I'm fairly sure there's no way to make the above system seem truly plausible, because KeyboardInterrupt is a bad exception for the purpose; but it's still broadly similar, and I think the same applies: StopIteration should live *only* inside __next__() and next(). Since generators can distinguish yield from return, they don't need to distinguish return from raise. ChrisA

On Thu, Nov 20, 2014 at 9:57 AM, Chris Angelico <rosuav@gmail.com> wrote:
Since generators can distinguish yield from return, they don't need to distinguish return from raise.
Bad grammar, should edit before posting. Since generators can distinguish value from no value by using yield and return, they don't need to use yield and raise. ChrisA

On Thu, Nov 20, 2014 at 03:24:07AM +1100, Chris Angelico wrote:
That's not true in practice. See my reply to Nick, there is lots of code out there which uses StopIteration to exit generators. Some of that code isn't very good code -- I've seen "raise StopIteration" immediately before falling out the bottom of the generator -- but until now it has worked and the impression some people have gotten is that it is actually preferred.
Until 3.2, that was a syntax error. For the majority of people who are still using Python 2.7, it is *still* a syntax error. To write this in a backwards-compatible way, you have to exit the generator with:

    raise StopIteration(2)
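For the record, a sketch (not from the thread) of the two spellings being contrasted; "return" with a value inside a generator became legal with PEP 380 in Python 3.3:

    def gen():
        yield 1
        return 2                # Python 3.3+ only

    def gen_compat():
        yield 1
        raise StopIteration(2)  # accepted back to 2.x -- but note that
                                # under PEP 479 this spelling itself
                                # becomes a RuntimeError

    # On Python 3.3+, in both cases the value 2 ends up as
    # StopIteration.value, which is what "yield from" evaluates to.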
I would put it another way: informally, the distinction between a generator and a function is that generators use yield where functions use return. Most people are happy with that informal definition; a full pedantic explanation of co-routines will just confuse them or bore them. The rule they will learn is:

* use return in functions
* use yield in generators

That makes generators that use both surprising. Since most generators either run forever or fall out the bottom when they are done, I expect that seeing a generator with a return in it is likely to surprise a lot of people. I've known that return works for many years, and I still give a double-take whenever I see it in a generator.
Why not? How else are you going to communicate something out of band to the consumer except via an exception? We can argue about whether KeyboardInterrupt is the right exception to use or not, but if you insist that this is a bad protocol then you're implicitly saying that the iterator protocol is also a bad protocol.
You might be, but since I've paid attention to the protocol rules, I won't be. Sorry to be harsh, but how clear do we have to be? StopIteration terminates iterators, and generators are iterators. That rule may or may not be inconvenient, it might be annoying (but sometimes useful), it might hide bugs, it might even be something that we can easily forget until reminded, but if it comes as a "surprise" that just means you don't know how the iterator protocol works. There are good reasons for changing this behaviour, but pandering to people who don't know how the iterator protocol works is not one of them.
That's exactly how the protocol works. Even if you write "return" in your generator, it still raises StopIteration.
Without the generator, *only* __next__ has this effect, and that's exactly where it's documented to be.
The documentation states that __next__ raises StopIteration, it doesn't say that *only* __next__ should raise StopIteration.

https://docs.python.org/3/library/stdtypes.html#iterator.__next__

I trust that we all expect to be able to factor out the raise into a helper function or method, yes? It truly would be surprising if this failed:

    class MyIterator:
        def __iter__(self):
            return self
        def __next__(self):
            return something()

    def something():
        # Toy helper function.
        if random.random() < 0.5:
            return "Spam!"
        raise StopIteration

Now let's write this as a generator:

    def gen():
        while True:
            yield something()

which is much nicer than:

    def gen():
        while True:
            try:
                yield something()
            except StopIteration:
                return  # converted by Python into raise StopIteration

-- Steven

On Thu, Nov 20, 2014 at 1:06 PM, Steven D'Aprano <steve@pearwood.info> wrote:
Yes, I thought it was rare. I stand corrected. Reword that to "you don't *need to* explicitly raise", since you can simply return, and it becomes true again, though.
In most cases you won't need to put a value on it, so bare "return" will work just fine. I just put a return value onto it so it wouldn't look trivially useless.
But it's just as surprising to put "raise StopIteration" into it. It's normal to put that into __next__, it's not normal to need it in a generator. Either way, it's something unusual; so let's go with the unusual "return" rather than the unusual "raise".
Well, that's exactly what I do mean. KeyboardInterrupt is not a good way for two parts of a program to communicate with each other, largely because it can be raised unexpectedly. Which is the point of this PEP: raising StopIteration unexpectedly should also result in a noisy traceback.
Sure. There was a suggestion that "return yield from something()" would work, though, which - I can't confirm that this works, but assuming it does - would be a lot tidier. But there's still a difference. Your first helper function was specifically a __next__ helper. It was tied intrinsically to the iterator protocol. If you want to call a __next__ helper (or actually call next(iter) on something) inside a generator, you'll have to - if this change goes through - cope with the fact that generator protocol says "return" where __next__ protocol says "raise StopIteration". If you want a generator helper, it'd look like this:

    def something():
        # Toy helper function.
        if random.random() < 0.5:
            yield "Spam!"

    def gen():
        yield from something()

Voila! Now it's a generator helper, following generator protocol. Every bit as tidy as the original. Let's write a __getitem__ helper:

    def something(x):
        # Toy helper function.
        if random.random() < 0.5:
            return "Spam!"
        raise KeyError(x)

    class X:
        def __getitem__(self, x):
            return something(x)

Same thing. As soon as you get into raising these kinds of exceptions, you're tying your helper to a specific protocol. All that's happening with PEP 479 is that generator and iterator protocol are being distinguished slightly.

ChrisA
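On the "return yield from something()" suggestion: a quick sketch (not from the thread) suggests the parenthesized form is the one that parses, since an unparenthesized yield expression is not allowed in a return statement:

    def helper():
        yield "Spam!"
        return "done"        # becomes the value of the yield-from expression

    def gen():
        return (yield from helper())   # yields helper's items, then gen()
                                       # terminates with helper's return value

    list(gen())   # ["Spam!"]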

On 20.11.2014 03:24, Chris Angelico wrote:
Hmm, I'm not convinced by these toy examples, but I did inspect some of my own code for incompatibility with the proposed change. I found that there really is only one recurring pattern I use that I'd have to change, and that is how I've implemented several file parsers. I tend to write them like this:

    def parser(file_object):
        while True:
            title_line = next(file_object)  # will terminate after the last record
            try:
                ...  # read and process the rest of the record here
            except StopIteration:
                # this record is incomplete
                raise OSError('Invalid file format')
            yield processed_record

So I'm catching StopIteration raised by the underlying IOWrapper only if it occurs in illegal places (with regard to the file format the parser expects), but not when it indicates the end of a correct file. I always thought of letting the error bubble up as a way to keep the parser transparent. Now in this case, I think, I would have to change this to:

    def parser(io_object):
        while True:
            try:
                title_line = next(io_object)
            except StopIteration:
                return
            ...

which I could certainly do without too much effort, but could this be one of the more widespread sources of incompatibility that Steve imagines?

Wolfgang

On Thu, Nov 20, 2014 at 2:39 PM, Wolfgang Maier < wolfgang.maier@biologie.uni-freiburg.de> wrote:
There's probably something important missing from your examples. The above while-loop is equivalent to

    for title_line in io_object:
        ...

If you're okay with getting RuntimeError instead of OSError for an undesirable StopIteration, you can just drop the except clause altogether.

--Guido van Rossum (python.org/~guido)

On 21.11.2014 00:51, Guido van Rossum wrote:
My reason for not using a for loop here is that I'm trying to read from a file where several lines form a record, so I'm reading the title line of a record (and if there is no record in the file any more I want the parser generator to terminate/return). If a title line is read successfully then I'm reading the record's body lines inside a try/except, i.e. where it says "# read and process the rest of the record here" in my shortened code I am actually calling next several times again to retrieve the body lines (and while reading these lines an unexpected StopIteration in the IOWrapper is considered a file format error). I realize that I could also use a for loop and still call next(file_object) inside it, but I find this a potentially confusing pattern that I'm trying to avoid by using the while loop and all explicit next(). Compare:

    for title_line in file_object:
        record_body = next(file_object)
        # in reality record_body is generated using several next calls
        # depending on the content found in the record body while it's read
        yield (title_line, record_body)

vs

    while True:
        title_line = next(file_object)
        body = next(file_object)
        yield (title_line, body)

To me, the for loop version suggests that the content of file_object is read in line by line by the loop (even though the name title_line tries to hint at this being not true). Only when I inspect the loop body do I see that further items are retrieved with next() and, thus, skipped in the for iteration. The while loop, on the other hand, makes the number of iterations very clear by showing all of them in the loop body. Would you agree that this is justification enough for while instead of for, or is it only me who thinks that a for loop makes the code read awkwardly?
If you're okay with getting RuntimeError instead of OSError for an undesirable StopIteration, you can just drop the except clause altogether.
Right, I could do this if the PEP-described behavior was in effect today.

On Fri, Nov 21, 2014 at 9:19 PM, Wolfgang Maier <wolfgang.maier@biologie.uni-freiburg.de> wrote:
I agree. The last example in the PEP is a cut-down form of your parser, and I raise the exact same concern: https://www.python.org/dev/peps/pep-0479/#examples The use of the for loop strongly implies that the loop body will be executed once for each thing in the iterable, which isn't true if you next() it in the body. Legal? Sure. Confusing? Definitely. ChrisA

On 20.11.2014 03:06, Steven D'Aprano wrote:
I find this example a compelling argument against the PEP. Personally, I'm dealing a lot more often with refactoring a generator function into an iterator class than I'm rewriting generator expressions into comprehensions (at least the exotic kinds that would reveal their inequality). So for me at least, the burden of having to remember that I can let (and should let) StopIteration bubble up inside __next__, but not in generator functions, weighs heavier than the equality argument and the protection against hard-to-diagnose (but rarely occurring) bugs in nested generator functions.

On Fri, Nov 21, 2014 at 9:58 AM, Wolfgang Maier <wolfgang.maier@biologie.uni-freiburg.de> wrote:
Compare my earlier response to Steven, though: it's not difficult to refactor a generator into a helper-generator, rather than refactor a generator into a helper-__next__. This proposal would force a decoupling of generator protocol from __next__ protocol. The ugliness in Steven's examples comes from trying to use a __next__ helper in a generator. It'd be just as ugly trying to refactor __getitem__ to make use of a __getattr__ helper - you'd have to catch AttributeError and turn it into KeyError at the boundary between the two protocols. ChrisA

Please let me know if I'm reading the PEP correctly. Does the proposal break all existing code in generators that uses next() to raise StopIteration or that raises StopIteration explicitly? For example, here is the pure python recipe for itertools.accumulate() shown in the docs at https://docs.python.org/3/library/itertools.html#itertool-functions :

    def accumulate(iterable, func=operator.add):
        'Return running totals'
        # accumulate([1,2,3,4,5]) --> 1 3 6 10 15
        # accumulate([1,2,3,4,5], operator.mul) --> 1 2 6 24 120
        it = iter(iterable)
        total = next(it)
        yield total
        for element in it:
            total = func(total, element)
            yield total

Or would it break the traditional examples of how to write something like izip() using a generator?

    def izip(iterable1, iterable2):
        it1 = iter(iterable1)
        it2 = iter(iterable2)
        while True:
            v1 = next(it1)
            v2 = next(it2)
            yield v1, v2

    assert list(izip('ab', 'cde')) == [('a', 'c'), ('b', 'd')]
    assert list(izip('abc', 'cd')) == [('a', 'c'), ('b', 'd')]

My initial reading of the PEP was a bit unsettling because the listed examples (such as unwrap() and parser()) were a series of cases where code that has been working just fine for the last decade would break and need to be changed to less pleasant looking code. Also, the PEP motivation seemed somewhat weak. Instead of listing known bugs or real-world development difficulties, it seems to hinge almost entirely on some "being surprised" that list comprehensions and generator expressions aren't the same in every regard (they aren't). AFAICT, the suggestion is that an incorrect expectation of perfect symmetry warrants a number of what the author calls "consequences for existing code". It seems that if the real problem is one of false expectations or surprises, the direct solution would be to provide clearer examples of how things actually work and to disabuse the idea that list comprehensions and generator expressions are more interchangeable than they actually are.

Raymond

P.S. On a more general note, I think that our biggest problem in the Python world is getting people to switch to Python 3. If we really want that to happen, we should develop a strong aversion to proposals that further increase the semantic difference between Python 2 and Python 3.

On Fri, Nov 21, 2014 at 10:24 PM, Raymond Hettinger <raymond.hettinger@gmail.com> wrote:
The case where the iterable is empty would now raise, yes.
Yes, this would be affected. This proposal causes a separation of generators and iterators, so it's no longer possible to pretend that they're the same thing.
The main point is one of exceptions being silently suppressed. Iterator protocol involves the StopIteration exception; generator protocol doesn't, yet currently a generator that raises StopIteration will quietly terminate. It's as if every generator is wrapped inside "try: ..... except StopIteration: pass". Would you accept any function being written with that kind of implicit suppression of any other exception?
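Chris's "as if every generator is wrapped" analogy, spelled out as a hypothetical sketch (not from the thread) using a different exception, to show how odd the implicit suppression would look anywhere else:

    import functools

    def suppress_keyerror(func):
        # Hypothetical decorator doing for KeyError what generators
        # currently do for StopIteration: swallow it silently.
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except KeyError:
                return None   # the exception vanishes without a trace
        return wrapper

    @suppress_keyerror
    def lookup(d, key):
        return d[key]   # a mistyped key now yields None, not a traceback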
The recommended form of the code will work exactly the same way in both versions: explicitly catching StopIteration and using it as a signal that the function should terminate. The only difference is the behaviour of the non-recommended practice of allowing an exception to bubble part-way and then be implicitly caught. ChrisA

On 21 November 2014 21:50, Chris Angelico <rosuav@gmail.com> wrote:
Raymond's point is that for a long time, the equivalence between "return" and "raise StopIteration" in a generator function has been explicit. The dissatisfaction with the "non-local flow control" aspects of the latter really only started to creep in around Python 2.5 (based on the explicit decision to avoid non-local flow control behaviour in the definition of the with statement in PEP 343), and this PEP is the first time this longstanding behaviour of generators has been seriously questioned at the python-dev level. Guido also didn't add himself as a co-author on the PEP, so it isn't clear on first reading that *he's* the one considering the change, rather than it being an independent suggestion on your part :) I suspect enough evidence of breakage is accumulating to tip the balance back to "not worth the hassle", but it would also be possible to just *add* the "from __future__ import generator_stop" feature, and postpone a decision on making that the only available behaviour. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
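The per-module opt-in Nick mentions would be spelled like this (a sketch of the spelling the PEP proposes, with the behaviour it specifies):

    from __future__ import generator_stop   # opt in, module by module

    def gen():
        yield 1
        raise StopIteration   # without the import: silently ends the
                              # generator; with it: RuntimeError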

On Nov 21, 2014, at 10:55 PM, Nick Coghlan wrote:
I have no opinion on the actual PEP, but I'm not sure the above is a good resolution. future imports should be for things that have a clear path to default behavior in some future release. I don't think we should incur technical debt to future-ize a feature that won't eventually get adopted. Such a thing will just be another wart that will be difficult to remove for backward compatibility. Cheers, -Barry

On Fri, Nov 21, 2014 at 10:50:52PM +1100, Chris Angelico wrote:
On Fri, Nov 21, 2014 at 10:24 PM, Raymond Hettinger <raymond.hettinger@gmail.com> wrote:
But generators and iterators *are the same thing*. (Generator functions are not iterators, but generators themselves are.) Iterators don't have a specific type, but they obey the iterator protocol:

    py> def gen():
    ...     yield 1
    ...     yield 2
    ...
    py> it = gen()
    py> iter(it) is it
    True
    py> hasattr(it, '__next__')
    True

`it` is an iterator.
Yes. That's how the classic pre-iterator iteration protocol works:

    py> class K:
    ...     def __getitem__(self, i):
    ...         if i == 5: raise IndexError
    ...         return i
    ...
    py> x = K()
    py> list(x)
    [0, 1, 2, 3, 4]

Context managers support suppressing any exception which occurs:

    If the suite was exited due to an exception, and the return value from
    the __exit__() method was false, the exception is reraised. If the
    return value was true, the exception is suppressed, and execution
    continues with the statement following the with statement.

    https://docs.python.org/3/reference/compound_stmts.html#the-with-statement

So there's two examples, one of the oldest going back to Python 1 days, and one of the newest. There may be others.

-- Steven

On Sat, Nov 22, 2014 at 3:30 AM, Steven D'Aprano <steve@pearwood.info> wrote:
I can write many other factory functions which return iterators. They are not, themselves, iterators, and therefore should not be expected to follow iterator protocol.

    def gen():
        return iter([1,2])
The above function works with those tests, too. Generator functions are functions that return iterators, and the __next__ method of the returned object is what follows iterator protocol.
That's following getitem protocol, and it's part of that protocol for the raising of IndexError to be the way of not returning any value. But what's more surprising is that raising StopIteration will also silently halt iteration, which I think is not good:
    py> list(K())
    [0, 1, 2, 3, 4]
Context managers get a chance to function like a try/except block. If one silently and unexpectedly suppresses an exception, it's going to be surprising; but more likely, it's as clear and explicit as an actual try/except block. This isn't "as soon as you use a 'with' block, any XyzError will jump to the end of the block and keep going". ChrisA

On Fri, Nov 21, 2014 at 08:52:59AM -0800, Ethan Furman wrote:
"Must not support send()" has never been part of the definition of iterators. The `Iterator` ABC also recognises generators as iterators: py> def gen(): ... yield 1 ... py> from collections import Iterator py> isinstance(gen(), Iterator) True and they are documented as iterators: Python’s generators provide a convenient way to implement the iterator protocol. If a container object’s __iter__() method is implemented as a generator, it will automatically return an iterator object (technically, a generator object) supplying the __iter__() and __next__() methods. https://docs.python.org/3/library/stdtypes.html#generator-types I don't understand where this idea that generators aren't iterators has come from, unless it is confusion between the generator *function* and the generator object itself. -- Steven

On Nov 21, 2014, at 8:52, Ethan Furman <ethan@stoneleaf.us> wrote:
Generators are a subtype of iterators. They support the iterator protocol completely, and in the same way as any other iterator. They also support extensions to that protocol--e.g., send(). But they also have a relationship to a generator function or generator expression, which you could call a "protocol" but if so it's not one expressible at the level of the language. I think that leads to a bit of confusion when speaking loosely. When someone says "the generator protocol vs. the iterator protocol" the "obviously correct" meaning is send and throw, but it's not what people always mean. Then again, the word "generator" itself leads to confusion when speaking loosely. Maybe it would be clearer if "generator" had no meaning; generator functions return generator iterators. But I don't think this confusion has caused serious problems over the decades, so I doubt the more minor confusion at issue here is likely to be serious.

On Fri, Nov 21, 2014 at 9:18 AM, Andrew Barnert < abarnert@yahoo.com.dmarc.invalid> wrote:
interesting -- I've always called those "generator comprehensions" -- but anyway -- do they have a special relationship? I can put any iterable in a generator expression:

    gen_exp = (i for i in [3,4,5,6])

the result is a generator:

    In [5]: type(gen_exp)
    Out[5]: generator

so I guess you could call that a "special relationship" -- but it looks to me kind of like an alternate constructor. But in any case, you can use a generator created by a generator expression or a generator function the same way you can use an iterable or an iterator class.

Then again, the word "generator" itself leads to confusion when speaking loosely. Maybe it would be clearer if "generator" had no meaning; generator functions return generator iterators.
not sure how that would help -- a generator is a type, and one is created either by calling a generator function or by evaluating a generator expression. if there is confusion, it's when folks call a generator function a "generator". Anyway, I just went back and read the PEP, and I'm still confused -- would the PEP make generators behave more like iterator classes, or less like them? -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

On Sat, Nov 22, 2014 at 4:51 AM, Chris Barker <chris.barker@noaa.gov> wrote:
Neutral. A generator function, an iterator class, etc, etc, etc, exists solely to construct an iterator. That iterator has a __next__ method, which either returns a value or raises StopIteration, or raises some other exception (which bubbles up). There are two easy ways to write iterators. One is to construct a class:

    class Iter:
        def __init__(self):
            self.x = 0
        def __iter__(self):
            return self
        def __next__(self):
            if self.x == 3:
                raise StopIteration
            self.x += 1
            return self.x

Another is to write a generator function:

    def gen():
        yield 1
        yield 2
        yield 3

Both Iter and gen are callables which return iterators. Both of them will produce three integers and then raise StopIteration. Both will, as is good form for iterators, continue to raise StopIteration thereafter. And neither Iter nor gen is, itself, an iterator. One is a class which constructs iterators. The other is a generator function, which also constructs iterators. That's all.

In Iter.__next__, I wrote code which chose between "return" and "raise StopIteration" to define its result; in gen(), I wrote code which chose between "yield" and "return" (in this case, the implicit return at the end of the function) to define its result. The only change made by this proposal is that StopIteration becomes, in a generator, like any other unexpected exception. It creates a separation between "iterator protocol" (which is implemented by __next__) and "generator protocol" (which is written in the body of a function with 'yield' in it).

ChrisA

As someone who has written maybe one generator expression in production code, I have little opinion on the PEP. But as someone who teaches Python, I have a comment on: On Fri, Nov 21, 2014 at 10:50:52PM +1100, Chris Angelico wrote:
As pointed out by Steven, they _are_ the same thing. When I teach iterators and generators, I get a bit tangled up explaining what the difference is, and why Python has both. This is what I say:

Conceptually (outside of language constructs): A "generator" is something that, well, generates values on the fly, as requested, until there are no more to generate, and then terminates. An "iterator", on the other hand, is something that produces the values in a pre-existing sequence of values, until there are no more.

In practice, python uses the exact same protocol (the iterator protocol -- __iter__, __next__) for both, so that you can write, e.g. a for loop, and not have to know whether the underlying object you are looping through is iterating or generating...

Since you can write a "generator" in the sense above as a class that supports the iterator protocol (and can, in fact, write an "iterator" with a generator function), I say that generator functions really are only syntactic sugar -- they are short and sweet and do much of the book keeping for you.

But given all that, keeping the protocols as similar as possible is a *good* thing, not a bad one -- they should behave as much the same as possible. If StopIteration bubbles up from inside an iterator, wouldn't that silently terminate as well? Honestly, I'm a bit lost -- but my point is this -- generators and iterators should behave as much the same as possible.

-Chris

On Sat, Nov 22, 2014 at 3:53 AM, Chris Barker <chris.barker@noaa.gov> wrote:
If you want to consider them that way, then sure - but part of the bookkeeping they do for you is the management of the StopIteration exception. That becomes purely an implementation detail. You don't actually use it when you write a generator function. ChrisA

On Fri, Nov 21, 2014 at 8:53 AM, Chris Barker <chris.barker@noaa.gov> wrote:
I'm sorry you see it that way; we must have done a terrible job explaining this in the past. :-(

The behavior for the *consumer* of the iteration is unchanged (call next() until it raises StopIteration -- or let a for-loop take care of the details for you). The interface for the *producer* has never been all that similar: In a generator you *yield* subsequent values until you are done; but if you are not using a generator, you must define a __next__() method (next() in Python 2) that *returns* a single value each time, until it's done, and then it has to raise StopIteration. There is no need to raise StopIteration from a generator, you just return when you are done. Insisting that raising StopIteration in a generator makes it more similar to a __next__() method ignores the fact that producing values is done in completely different ways.

So, again, the PEP does not change anything about iterators, and generators will continue to follow the iterator protocol. The change is only for generator authors (and, most importantly, for people using a certain hack in generator expressions).

--Guido van Rossum (python.org/~guido)

On Fri, Nov 21, 2014 at 2:29 PM, Guido van Rossum <guido@python.org> wrote:
well, others have found examples in old docs that mingle StopIteration and generators...so I guess so, but I'm not sure I'm that misinformed. It still seems to me that there are two ways to write the same thing. The behavior for the *consumer* of the iteration is unchanged
got it -- the issue at hand is what happens to a StopIteration that is raised by something the generator calls. I think the point of this PEP is that the author of a generator function is thinking about using "yield" to provide the next value, and return (explicit or implicit) to stop the generation of objects. That return raises a StopIteration, but the author isn't thinking about it. So why would they want to think about having to trap StopIteration when calling other functions? While the author of an iterator class is thinking about the __next__ method and raising a StopIteration to terminate. So s/he would naturally think about trapping StopIteration when calling functions?

I suppose that makes some sense, but to me it seems like a generator function is a different syntax for creating what is essentially the same thing -- why shouldn't it have the same behavior? And if you are writing a generator, presumably you know how it's going to get used -- i.e. by something that expects a StopIteration -- it's not like you're ignorant of the whole idea.

Consider this far fetched situation: Either an iterator class or a generator function could take a function object to call to do part of its work. If that function happened to raise a StopIteration -- now the user would have to know which type of object they were working with, so they would know how to handle the termination of the iter/gener-ation.

OK -- far more far fetched than the preceding example of confusion, but the point is this: AFAIU, the current distinction between generators and iterators is how they are written -- i.e. syntax, essentially. But this PEP would change the behavior of generators in some small way, creating a distinction that doesn't currently exist.
will continue to follow the iterator protocol. The change is only for generator authors
I guess this is where I'm not sure -- it seems to me that the behavior of generators is being changed, not the syntax -- so while mostly of concern to generator authors, it is, in fact, a change in behavior that can be seen by the consumer of (maybe only an oddly designed) generator. In practice, that difference may only matter to folks using that particular hack in generator expressions, but it is indeed a change.

-Chris

On Sat, Nov 22, 2014 at 10:06 AM, Chris Barker <chris.barker@noaa.gov> wrote:
Let's suppose you use a Python-like macro language to generate Python code. In the macro language, you can write stuff like this:

    class X:
        iterator:
            return 5
            return 6
            return 7
        iterator_done

And it will get compiled into something like this:

    class X:
        def __init__(self):
            self._iter_state = 0
        def __iter__(self):
            return self
        def __next__(self):
            self._iter_state += 1
            if self._iter_state == 1: return 5
            if self._iter_state == 2: return 6
            if self._iter_state == 3: return 7
            raise StopIteration

This is a reasonably plausible macro language, right? It's basically still a class definition, but it lets you leave out a whole bunch of boilerplate. Now, the question is: As you write the simplified version, should you ever need to concern yourself with StopIteration? I posit no, you should not; it's not a part of the macro language at all. Of course, if this *were* how things were done, it would probably be implemented as a very thin wrapper, exposing all its details to your code; but there's no reason that it *needs* to be so thin. The language you're writing in doesn't need to have any concept of a StopIteration exception, because it doesn't need to use an exception to signal "no more results".
Not necessarily. Can we get someone here who knows asyncio and coroutines, and can comment on the use of such generators?
Either a __getattr__ or a __getitem__ could use a helper function to do part of its work, too, but either the helper needs to know which, or it needs some other way of signalling. They're different protocols, so they're handled differently. If Python wanted to conflate all of these, there could be a single "NoReturnValue" exception, used by every function which needs to be able to return absolutely any object and also to be able to signal "I don't have anything to return". But no, Python has separate exceptions for signalling "I don't have any such key", "I don't have any such attribute", and "I don't have any more things to iterate over". Generators don't need any of them, because - like my macro language above - they have two different keywords and two different byte-codes (yield vs return).

In many cases, the helper function doesn't actually need the capability to return *absolutely any object*. In that case, the obvious solution would be to have it return None to say "nothing to return", and then the calling function can either translate that into the appropriate exception, or return rather than yielding, as appropriate. That would also make the helper more useful to other stand-alone functions. But even if your helper has to be able to return absolutely anything, you still have a few options:

1) Design the helper as part of __next__, and explicitly catch the exception.

    def nexthelper():
        if condition:
            return value
        raise StopIteration

    def __next__(self):
        return nexthelper()

    def gen():
        try:
            yield nexthelper()
        except StopIteration:
            pass

2) Write the helper as a generator, and explicitly next() it if you need that:

    def genhelper():
        if condition:
            yield value

    def __next__(self):
        return next(genhelper())

    def gen():
        yield from genhelper()

3) Return status and value. I don't like this, but it does work.

    def tuplehelper():
        if condition:
            return True, value
        return False, None

    def __next__(self):
        ok, val = tuplehelper()
        if ok:
            return val
        raise StopIteration

    def gen():
        ok, val = tuplehelper()
        if ok:
            yield val

All these methods work perfectly, because they have a clear boundary between protocols. If you want to write a __getitem__ that calls on the same helper, you can do that, and have __getitem__ itself raise appropriately if there's nothing to return.
Generators are currently a leaky abstraction for iterator classes. This PEP plugs a leak that's capable of masking bugs.
The only way a consumer will see a change of behaviour is if the generator author used this specific hack (in which case, instead of the generator quietly terminating, a RuntimeError will bubble up - hopefully all the way up until a human sees it). In terms of this PEP, that's a bug in the generator. Bug-free generators will not appear any different to the consumer. ChrisA
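To make the consumer-visible difference concrete, a small sketch (not from the thread) contrasting a bug-free generator with one using the hack the PEP targets:

    def fine():
        yield 1
        return                 # consumer sees no difference under the PEP

    def hack():
        yield 1
        raise StopIteration    # the PEP turns this into RuntimeError

    list(fine())   # [1] before and after the PEP
    list(hack())   # [1] before the PEP; RuntimeError after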

On Sat, Nov 22, 2014 at 10:40:37AM +1100, Chris Angelico wrote:
Let's suppose you use a Python-like macro language to generate Python code. In the macro language, you can write stuff like this:
Is there really any point in hypothesising imaginary macro languages when we have a concrete and existing language (Python itself) to look at? [snip made-up example]
Sure, why not? It is part of the concrete protocol: iterators raise StopIteration to halt. That's not a secret, and it is not an implementation detail, it is a concrete, public part of the API.
I posit no, you should not; it's not a part of the macro language at all.
This is why talking about imaginary macro languages is pointless. You say it is not part of the macro language. I say it is. Since the language doesn't actually exist, who is to say which is right? In real Python code, "raise StopIteration" does exist, and does work in generators. Sometimes the fact that it works is a nuisance, when you have an unexpected StopIteration. Sometimes the fact that it works is exactly what you want, when you have an expected StopIteration. You seem to think that allowing a generator function to delegate the decision to halt to a helper function is a Bad Thing. I say it is a Good Thing, even if it occasionally makes buggy code a bit harder to debug. [...]
What about them? I don't understand your question. -- Steven

On Sat, Nov 22, 2014 at 9:48 PM, Steven D'Aprano <steve@pearwood.info> wrote:
A generator function is exactly the same thing: it's a way to create an iterator, but it's not a class with a __next__ function. I could write an iterator-creation function in many ways, none of which involve StopIteration:

    def gen():
        if condition:
            raise StopIteration  # Wrong
        return iter([1,2,3])
If you have a lengthy nested chain of coroutines, and one of them unexpectedly raises StopIteration, is it right for something to quietly terminate, or should the exception bubble up and be printed to console? ChrisA

On Sat, Nov 22, 2014 at 2:57 AM, Chris Angelico <rosuav@gmail.com> wrote:
Couldn't you have a nested pile of iterator classes as well that would exhibit the exact same behavior?

-Chris

On Tue, Nov 25, 2014 at 3:41 AM, Chris Barker <chris.barker@noaa.gov> wrote:
Potentially, but that would be a different thing. Also, I don't know of cases where a __next__ function chains to a next() call through arbitrary numbers of levels, but "yield from" gets a solid work-out in asyncio and related, so it's more likely to come up. But I don't personally use asyncio, so I'd like to hear from someone who does. ChrisA

I think this is a good point. Maybe a way to obtain equivalency to the generator functions in this case is to "break" this example for the iterator object as well, in that StopIteration has to be raised in the frame of the generator object; if it is raised in a different context, e.g., a function called by __next__, that StopIteration should also be converted to a RuntimeError, similar to what is proposed in the PEP for the generator functions. Maybe this is not what Chris intends to happen, but it would make things consistent. -Alexander

On Fri, Nov 21, 2014 at 4:56 PM, Alexander Heger <python@2sn.net> wrote:
I"mn not sure which Chris you are refering to, but for my part, yes and no: Yes, that would keep iterator classes and generator functions consistent, which would be a good thing. No: I don't think we should do that -- StopIteration is part of the iterator protocol -- generators are another way to write something that complies with the iterator protocol -- generators should handle StopIteration the same way that iterator classes do. Yes, there are some cases that can be confusing and hard to debug -- but them's the breaks. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

On Tue, Nov 25, 2014 at 3:38 AM, Chris Barker <chris.barker@noaa.gov> wrote:
That's one of the perils of geeky communities - there'll often be multiple people named Chris. I have a brother named Michael who's a rail enthusiast, and that world has the same issue with his name - he was once in a car with three other people named Michael.
I've done my "explain it twice, then shut up" on this subject, so I'll just point you to the list archive, where it's been stated clearly that generators are like __iter__, not like __next__. Please, could you respond to previously-given explanations, rather than simply restating that generators should be like __next__? I'd like to move forward with that discussion, rather than reiterating the same points. ChrisA

On Mon, Nov 24, 2014 at 9:06 AM, Chris Angelico <rosuav@gmail.com> wrote:
I'm not sure if I've responded or not to previously given explanations -- but you're right, it's time for me to shut up having made my point, too.

-Chris

On Tue, Nov 25, 2014 at 4:18 AM, Chris Barker <chris.barker@noaa.gov> wrote:
Well, there is probably more to be said about this - along the lines of *why* generators ought to be more like iterators. (They're iterables, not iterators.) It's just that we seem to be rehashing the same arguments - or maybe that's just my impression, as there's been discussion on three different channels (-ideas, -dev, and the issue tracker - mercifully very little on the latter). ChrisA

On Mon, Nov 24, 2014 at 10:02 AM, Alexander Belopolsky < alexander.belopolsky@gmail.com> wrote:
I think Chris A was overzealous here. The word "generator" is ambiguous; it can refer to either a generator function (a function definition containing at least one "yield") or to the object you obtain by calling a generator function. The latter is definitely an iterator (it has a __next__ method). You can't really call a generator function an iterable (since calling iter() on it raises TypeError) but it's not an iterator either. For the rest see my explanation in response to Mark Shannon in python-dev: http://code.activestate.com/lists/python-dev/133428/ -- --Guido van Rossum (python.org/~guido)

On Tue, Nov 25, 2014 at 5:02 AM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
My apologies. As Guido said, "generator" is ambiguous; though I was inaccurate as well. A generator *object* is, as you show above, an iterator; a generator *function* is not actually iterable, but it is an iterator factory. An iterator class is also an iterator factory. ChrisA

On Mon, Nov 24, 2014 at 5:48 PM, Chris Angelico <rosuav@gmail.com> wrote:
This is correct, and I don't think there is any ambiguity:
As explained in PEP 255, "a Python generator is a kind of Python iterator[1], but of an especially powerful kind." The other term introduced by PEP 255 is "generator function": "A function that contains a yield statement is called a generator function." In my view, PEP 479 naturally follows from careful reading of PEP 255. All one needs to understand is the difference between a function that returns an iterator and the iterator it returns.

yes, this was a reply to your post
Yes, that would keep iterator classes and generator functions consistent, which would be a good thing.
I think the main goal was to have a consistent interface that is easy to debug and deals with StopIteration bubbling up - hence a StopIteration originating in another scope should be converted to RuntimeError when it crosses the iterator interface boundary.
You'd keep StopIteration in the protocol, but only allow it in the local scope. -Alexander

On 21.11.2014 12:24, Raymond Hettinger wrote:
Since I already learnt quite a lot from following this thread: I checked yesterday what the docs have to say about the pure-python equivalent of python3's zip() because I expected it to look like the above izip recipe (making it incompatible with the PEP behavior). However, I found that the given equivalent code is:

    def zip(*iterables):
        # zip('ABCD', 'xy') --> Ax By
        sentinel = object()
        iterators = [iter(it) for it in iterables]
        while iterators:
            result = []
            for it in iterators:
                elem = next(it, sentinel)
                if elem is sentinel:
                    return
                result.append(elem)
            yield tuple(result)

i.e., there is no unprotected next call in this example. What surprised me though is that the protection here is done via the default argument of next(), while more typically you'd use a try/except clause. So what's the difference between the two? Specifically, with a default value given will next just catch StopIteration, which you could do less verbosely yourself, and/or is there some speed gain from the fact that the error has to be propagated up one level less? Is there a guideline when to use try/except vs. next with a default value?

Thanks,
Wolfgang
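A sketch of the two forms being compared (behaviourally equivalent; in CPython, next() with a default catches the StopIteration inside the builtin rather than in a Python-level except block):

    it = iter([])
    sentinel = object()

    # Form 1: default argument
    elem = next(it, sentinel)
    if elem is sentinel:
        print("exhausted")

    # Form 2: explicit try/except
    try:
        elem = next(it)
    except StopIteration:
        print("exhausted")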

On 11/21/2014 03:24 AM, Raymond Hettinger wrote:
I believe the motivation is more along the lines of the difficulty and time wasted in debugging a malfunctioning program when a generator stops early because a StopIteration escaped instead of having some other exception raised. This would be along the same lines as not allowing sum to work with str -- a more valid case, IMO, because the sum restriction is performance based, while this change would actually prevent breakage... or more accurately, put the breakage at the cause and make it much easier to fix. -- ~Ethan~

On 15.11.2014 10:29, Chris Angelico wrote:
Now that this PEP is going to be accepted, I'm not sure how much sense it still makes to suggest an amendment to it, but anyway: As stated in the abstract, one of the goals of the PEP is to unify further the behaviour of comprehensions and generator expressions. With the PEP in place the following example (taken from Steven D'Aprano's post on python-list):

    iterable = [iter([])]
    list(next(x) for x in iterable)

would raise an error just like

    [next(x) for x in iterable]

already does today. However the comprehension currently raises StopIteration, while the proposed error for the generator expression would be of a different class (supposedly RuntimeError) - so comprehensions and generator expressions would still behave a bit (though much less) differently.

In addition, the PEP leaves an iterator's __next__() method as the only reasonable place where user-code should raise StopIteration. So I would like to argue that instead of just turning StopIteration into some other error when it's about to bubble out of a generator frame, it should be converted whenever it bubbles out of *anything except an iterator's __next__()*. This would include comprehensions, but also any other code. (On the side, I guess the current form of the PEP does address hard-to-debug bugs caused by nested generators, but what about nested __next__ in iterators? Shouldn't it, using the same logic, also be an error if a next call inside a __next__ method raises an uncaught StopIteration?)

I think such general behavior would make it much clearer that StopIteration is considered special and reserved for the iterator protocol. Of course, it might mean more broken code if people use StopIteration or a subclass for error signaling outside generators/iterators, but this PEP will mean backwards incompatibility anyway, so why not go all the way and do it consistently.

I'm not sure I'd like the pretty general RuntimeError for this (even though Guido favors it for simplicity); instead one could call it UnhandledStopIteration? I imagine that a dedicated class would help in porting, for example, python2 code to python3 (which this PEP does not really simplify otherwise) since people/scripts could watch out for something specific?

Thoughts?
Wolfgang
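The asymmetry Wolfgang describes, as a runnable sketch (pre-PEP-479 semantics):

    iterable = [iter([])]

    [next(x) for x in iterable]
    # raises StopIteration: the comprehension body runs in an ordinary
    # frame, so the exception propagates to the caller

    list(next(x) for x in iterable)
    # returns []: the StopIteration escapes the generator expression and
    # list() takes it as "iteration finished"; under the PEP this becomes
    # a RuntimeError instead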

On Tue, Nov 25, 2014 at 9:53 AM, Wolfgang Maier <wolfgang.maier@biologie.uni-freiburg.de> wrote:
There'd have to be a special case for next(), where StopIteration is part of the definition of the function. The question then becomes, what's the boundary where StopIteration is converted? The current proposal is quite simple. All the conversion happens in the one function that (re)starts a generator frame, gen_send_ex() in Objects/genobject.c. To do this for other functions, there'd need to be some way of saying which ones are allowed to raise StopIteration and which aren't. Additionally, the scope for problems is smaller. A StopIteration raised anywhere outside of a loop header won't cause silent behavioral change; with generators, anywhere in the body of the function will have that effect. So, while I do agree in principle that it'd be nice, I don't know that it's practical; however, it might be worth raising a dependent proposal to extend this exception conversion. ChrisA

On 11/25/2014 12:03 AM, Chris Angelico wrote:
Well, I'm not familiar with every implementation detail of the interpreter, so I can't judge how difficult certain things would be to implement, but one solution I could think of is: allow StopIteration to be raised anywhere, but let it bubble up only *one* frame. So if the next outer frame does not deal with it, the exception would be converted to UnhandledStopIteration (or something else) when it's about to bubble out of that outer frame. The builtin next() would simply reset the frame count by catching and reraising StopIteration raised inside its argument (whether that's an iterator's __next__ or a generator; note that in this scenario, using raise StopIteration instead of return inside a generator would remain possible). Examples of what would happen:

- using next() on a generator that raises StopIteration explicitly:
  => next() catches the error and reraises StopIteration
- using next() on a generator that returns:
  => next() behaves as it does currently, raising StopIteration
- using next() on the __next__ method of an iterator:
  => next() catches the error and reraises StopIteration
- every direct call of an iterator's __next__ method:
  => has to be guarded by a try/except StopIteration

In the first three cases, the calling frame (the one that resumes when next() returns), and only this frame, is given a chance to handle the error. If that doesn't happen (i.e., the error would bubble out), it gets converted. So, different from the current PEP, where a StopIteration must be dealt with explicitly using try/except only inside generators but bubbles up everywhere else, here StopIteration would be special everywhere, i.e., it must be passed upwards explicitly through all frames or will get converted. Back to Steven's generator expression vs comprehension example:

    iterable = [iter([])]
    list(next(x) for x in iterable)

would raise UnhandledStopIteration, since there is no way inside the generator expression to catch the StopIteration raised by next(x). ... and if that's all complete nonsense because of some technical detail I'm not aware of, then please excuse my ignorance. Wolfgang

On Tue, Nov 25, 2014 at 9:30 AM, Wolfgang Maier < wolfgang.maier@biologie.uni-freiburg.de> wrote:
I also have no idea if this is practical from an implementation perspective, but I like how it supports my goal of keeping the behavior of iterator classes and generators consistent. -Chris -- Christopher Barker, Ph.D., NOAA/NOS/OR&R, Chris.Barker@noaa.gov

On Tue, Nov 25, 2014 at 9:47 AM, Chris Barker <chris.barker@noaa.gov> wrote:
[...] I like how it supports my goal of keeping the behavior of iterator classes and generators consistent.
This is a great summary of the general confusion I am trying to clear up. The behavior of all types of iterators (including generators) from the *caller's* perspective is not in question and is not changing. It is very simple: you call next(it) (or it.__next__()), and it either returns the next value or raises StopIteration (and any other exception is, indeed, an exception). On the generator's side, returning (or falling off the end, which produces an implicit None return value) is translated into a StopIteration which will be interpreted by the caller as the end of the series. -- --Guido van Rossum (python.org/~guido)

On Tue, Nov 25, 2014 at 9:59 AM, Guido van Rossum <guido@python.org> wrote:
Once you start nesting these things, the distinction between "implementor" and "caller" gets mingled. And I think this is all about how nested generators behave, yes? If I am implementing an iterator of some sort (generator function or iterator class), and I call next() inside my code, then I am both an implementor and a caller. And if I'm also writing helper functions, then I need to know how StopIteration will be handled, and it will be handled a bit differently by generators and iterator classes. But not a big deal, agreed; probably a much smaller deal than all the other stuff you'd better understand to write this kind of code anyway. -Chris

On Tue, Nov 25, 2014 at 12:31 PM, Chris Barker <chris.barker@noaa.gov> wrote:
Hm. An implementer of one protocol is likely the caller of many other protocols. It's not clear that calling something that implements the *same* protocol should deserve special status. For example, I could be implementing an iterator processing the lines of a CSV file. Inside this iterator I may be using another iterator that loops over the fields of the current line. (This doesn't sound too far-fetched.) But if I run out of fields in a line, why should that translate into terminating the outer iterator? And the outer iterator may itself be called by another, more outer iterator that iterates over a list of files.
And I think this is all about how nested generators behave, yes?
The inner iterator doesn't have to be a generator (apart from send() and throw(), they have the same interface). And the point of the PEP is that an exhausted inner iterator shouldn't be taken to automatically terminate the outer one. (You could point out that I don't do anything about the similar problem when the outer iterator is implemented as a class with a __next__() method. If I could, I would -- but that case is different because there you *must* raise StopIteration to terminate the iteration, so it becomes more similar to an accidental KeyError being masked when it occurs inside a __getitem__() method.)
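A hedged sketch of the CSV-flavoured trap described above (all names illustrative):

    def first_two_fields(lines):
        for line in lines:
            fields = iter(line.split(","))
            # If a line has fewer than two fields, the second next()
            # raises StopIteration, which escapes this generator frame
            # and -- pre-PEP 479 -- silently ends the *outer* iteration.
            yield next(fields), next(fields)

    print(list(first_two_fields(["a,b", "c", "d,e"])))
    # pre-PEP 479:  [('a', 'b')]  (the short line truncates everything)
    # post-PEP 479: RuntimeError pointing at the offending next() call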
A helper function also defines an interface. If you are writing a helper function for a generator (and the helper function is participating in the same iteration as the outer generator, i.e. not in the CSV files / lines / fields example), the best way to do it is probably to write it as a helper generator, and use "yield from" in the outer generator.
But not a big deal, agreed; probably a much smaller deal than all the other stuff you'd better understand to write this kind of code anyway.
Which I'm sorry to see is much less widely understood than I had assumed. -- --Guido van Rossum (python.org/~guido)

On Tue, Nov 25, 2014 at 1:05 PM, Guido van Rossum <guido@python.org> wrote:
(You could point out that I don't do anything about the similar problem when the outer iterator is implemented as a class with a __next__() method.
Indeed -- that is the key point here -- but you were pretty clear about how special casing StopIteration is a non-starter.
Well, I guess it's a good thing to make things easier/clearer where you can -- even if you can't do it everywhere. I suppose if you think of generator functions as an easier way to write an iterator (where it makes sense), then this is one more thing that makes it even easier / safer. It does even more of the bookkeeping for you. So a consistent win-win. Thanks for the time taken clarifying your point.

But not a big deal, agreed; probably a much smaller deal than all the other stuff you'd better understand to write this kind of code anyway.
Which I'm sorry to see is much less widely understood than I had assumed.
Well, this PEP does make for one less detail you need to understand (or more to the point, keep in mind) when writing generator functions -- so that's a good thing. -Chris

On Tue, Nov 25, 2014, at 15:31, Chris Barker wrote:
For something more concrete, we can consider a naive implementation of iteration over adjacent pairs:

    def pairs(x):
        i = iter(x)
        while True:
            yield next(i), next(i)
To work under the new paradigm, you need to catch StopIteration explicitly:

    def pairs(x):
        i = iter(x)
        while True:
            try:
                a = next(i)
                b = next(i)
            except StopIteration:
                return
            yield a, b

On 25/11/14 21:56, random832@fastmail.us wrote:
<snip>
You're right that you need to catch the StopIteration, but it seems to me the natural way to write your second example is:

    def pairs(x):
        i = iter(x)
        while True:
            try:
                yield next(i), next(i)
            except StopIteration:
                return

Adding the extra variables a and b is unnecessary and distracts from the change actually required. Regards, Ian F

On Wed, Nov 26, 2014 at 8:56 AM, <random832@fastmail.us> wrote:
Okay, it's simple and naive. How about this version:

    def pairs(x):
        i = iter(x)
        for val in i:
            yield val, next(i)

Also simple, but subtly different from your version. What's the difference? Will it be obvious to everyone who reads it? ChrisA

On 11/25/2014 03:31 PM, Chris Angelico wrote:
I don't see the difference being subtle enough -- if an odd number of items is tossed in, that `next(i)` is still going to raise a StopIteration, which under PEP 479 will become a RuntimeError. Or did you mean that even-length iterables will work fine, but odd-length ones will still raise? Nice. :) -- ~Ethan~

On Wed, Nov 26, 2014 at 11:46 AM, Ethan Furman <ethan@stoneleaf.us> wrote:
Presumably the even case is the correct one. It's intended to work that way. If you give it an odd number of items, pre-479 they'll both silently terminate. (Post-479, the next(),next() one will always raise RuntimeError, which indicates clearly that it's not written appropriately, but that's not subtle.) ChrisA

On Tue, Nov 25, 2014 at 5:26 PM, <random832@fastmail.us> wrote:
That's not too pythonic, and trying to support non-pythonic code while evolving the language is a dead-end street. The web documents pythonic ways to obtain pairs from an iterable:

    def pairs(x):
        i = iter(x)
        return zip(i, i)

Or even:

    def pairs(x):
        return zip(*[iter(x)]*2)

The usual way of dealing with an odd number of elements is to use zip_longest. I don't remember seeing it documented that raising StopIteration will cancel more than one iterator. If it's undocumented, then code that relies on the non-feature is broken. Cheers, -- Juancarlo *Añez*
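For the odd-length case, a sketch using itertools.zip_longest (padding rather than silently dropping the leftover element; illustrative only):

    from itertools import zip_longest

    def pairs(x):
        i = iter(x)
        return zip_longest(i, i, fillvalue=None)

    print(list(pairs([1, 2, 3])))   # [(1, 2), (3, None)]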

On Wed, Nov 26, 2014 at 5:59 AM, <random832@fastmail.us> wrote:
Out of curiosity, what explicit uses of next are pythonic?
Ones immediately enclosed in try/except StopIteration, e.g.:

    try:
        x = next(it)
        print(x)
    except StopIteration:
        print('nothing')

You could rewrite this particular one as follows:

    for x in it:
        print(x)
        break
    else:
        print('nothing')

But if you have multiple next() calls you might be able to have a single try/except catching StopIteration from all of them, so the first pattern is more general. -- --Guido van Rossum (python.org/~guido)

On Wed, Nov 26, 2014 at 4:30 AM, Wolfgang Maier <wolfgang.maier@biologie.uni-freiburg.de> wrote:
I don't know much about the internal details of CPython either, but let's just ignore that for the moment and consider specs for the Python language. AFAIK, not one of the concerns raised (by PEP 479 or your proposal here) is CPython-specific.
Interesting. Makes a measure of sense, and doesn't have much magic to it.
Downside of this is that it's harder to consciously chain iterators, but maybe that's a cost that has to be paid. Suggestion for this: have a new way of "raise-and-return". It's mostly like raise, except that (a) it can't be caught by a try/except block in the current function (because there's no point), and (b) it bypasses the "this exception must not pass unnoticed" conversion. It could then also be used for anything else that needs the "return any object, or signal lack of return value" option, covering AttributeError and so on. So it'd be something like this:

    class X:
        def __iter__(self):
            return self
        def __next__(self):
            if condition:
                return value
            signal StopIteration

The 'signal' statement would promptly terminate the function (not sure exactly how it'd interact with context managers and try/finally, but something would be worked out), and then raise StopIteration in the calling function. Any other StopIteration which passes out of a function would become a RuntimeError. Magic required: some way of knowing which exceptions should be covered by this ban on bubbling; also, preferably, some way to raise StopIteration in the calling function without losing the end of the backtrace. This could be a viable proposal. It'd be rather more complicated than PEP 479, though, and would require a minimum of five hundred bikeshedding posts before it comes to any kind of conclusion, but if you feel this issue is worth it, I'd certainly be an interested participant in the discussion. ChrisA

On Tue, Nov 25, 2014 at 9:48 AM, Chris Angelico <rosuav@gmail.com> wrote:
It's not viable. It will break more code than PEP 479, and it will incur a larger interpreter overhead: every time any exception bubbles out of any frame, we'd have to check whether it is (derived from) StopIteration and replace it, rather than only when exiting a generator frame. (The check for StopIteration is relatively expensive -- it's easy to determine that an exception *is* StopIteration, but in order to determine that it *doesn't* derive from StopIteration you have to walk the inheritance tree.) Please stop panicking. -- --Guido van Rossum (python.org/~guido)

On 15 November 2014 19:29, Chris Angelico <rosuav@gmail.com> wrote:
Thanks for the write-up! Proposal
[snip]
I think you can skip mentioning this particular idea in the PEP - I didn't like it even when I posted it, and both of Guido's ideas are much better :)
There's an additional subtlety with this idea: if we add a new GeneratorReturn exception as a subclass of StopIteration, then generator iterators would likely also have to change to replace GeneratorReturn with a regular StopIteration (chaining appropriately via __cause__, and copying the return value across).
With such a change, we would actually likely modify the following code in contextlib._GeneratorContextManager.__exit__:

    try:
        self.gen.throw(exc_type, value, traceback)
        raise RuntimeError("generator didn't stop after throw()")
    except StopIteration as exc:
        # Generator suppressed the exception
        # unless it's a StopIteration instance we threw in
        return exc is not value
    except:
        if sys.exc_info()[1] is not value:
            raise

to be the slightly more self-explanatory:

    try:
        self.gen.throw(type, value, traceback)
        raise RuntimeError("generator didn't stop after throw()")
    except GeneratorReturn:
        # Generator suppressed the exception
        return True
    except:
        if sys.exc_info()[1] is not value:
            raise

The current proposal in the PEP actually doesn't let us simplify this contextlib code, but rather means we would have to make it more complicated to impedance match generator semantics with the context management protocol. To handle that change, we'd have to make the code something like the following (for clarity, I've assumed a new RuntimeError subclass, rather than RuntimeError itself):

    try:
        self.gen.throw(exc_type, value, traceback)
        raise RuntimeError("generator didn't stop after throw()")
    except StopIteration as exc:
        # Could become "return True" once the __future__ becomes the default
        return exc is not value
    except UnexpectedStopIteration as exc:
        if exc.__cause__ is not value:
            raise
    except:
        if sys.exc_info()[1] is not value:
            raise

I definitely see value in adding a GeneratorReturn subclass to be able to tell the "returned" vs "raised StopIteration" cases apart from outside the generator (the current dance in contextlib only works because we have existing knowledge of the exact exception that was thrown in). I'm substantially less convinced of the benefit of changing generators to no longer suppress StopIteration. Yes, it's currently a rather odd corner case, but changing it *will* break code (at the very least, for anyone using an old version of contextlib2, or otherwise relying on their own copy of contextlib rather than the standard library one). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Sun, Nov 16, 2014 at 1:13 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Doesn't hurt to have some rejected alternates there :)
Would have to do so automatically, meaning this is no simpler than the current proposal? Or would have to be always explicitly written to handle it?
This is why it's proposed to use __future__ to protect it. If anyone's still using an old version of contextlib2 once 3.7 comes along, it'll break; but is there any reason to use Python 3.7 with a contextlib from elsewhere than its standard library? (I'm not familiar with contextlib2 or what it offers.) ChrisA

On 16 November 2014 00:37, Chris Angelico <rosuav@gmail.com> wrote:
When GeneratorReturn escaped a generator frame, the interpreter would automatically convert it into an ordinary StopIteration instance. It's still simpler because it won't need the __future__ dance (as it doesn't involve any backwards incompatible changes).
Using __future__ still imposes a large cost on the community - docs need updating, code that relies on the existing behaviour has to be changed, developers need to adjust their mental models of how the language works. There needs to be a practical payoff for those costs - and at the moment, it's looking like we can actually get a reasonably large fraction of the gain without most of the pain by instead pursuing Guido's idea of a separate StopIteration subclass to distinguish returning from the outermost generator frame from raising StopIteration elsewhere in the generator.
Same reason folks use it now: consistent behaviour and features across a range of Python versions. However, that's not the key point - the key point is that working through the exact changes that would need to be made in contextlib persuaded me that I was wrong when I concluded that contextlib wouldn't be negatively affected. It's not much more complicated, but if we can find a fully supported example like that in the standard library, what other things might folks be doing with generators that *don't* fall into the category of "overly clever code that we don't mind breaking"?
(I'm not familiar with contextlib2 or what it offers.)
contexlib2 ~= 3.3 era contextlib that runs as far back as 2.6 (I initially created it as a proving ground for the idea that eventually become contextlib.ExitStack). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Sun, Nov 16, 2014 at 2:21 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Okay, let me see if I have this straight. When a 'return' statement (including an implicit one at end-of-function) is encountered in any function which contains a 'yield' statement, it is implemented as "raise GeneratorReturn(value)" rather than as "raise StopIteration(value)" which is the current behaviour. However, if any GeneratorReturn would be raised in any way other than the 'return' statement, it would magically become a StopIteration instead. Is that correct? This does sound simpler. All the magic is in the boundary of the generator itself, nothing more. If a __next__ method raises either StopIteration or GeneratorReturn, or if any other function raises them, there's no special handling. Question: How does it "become" StopIteration? Is a new instance of StopIteration formed which copies in the other's ``value``? Is the type of this exception magically altered? Or is it a brand new exception with the __cause__ or __context__ set to carry the original?
Fair enough. The breakage is a known problem, though; whatever's done is likely to cause at least some issues. If the alternate you describe above will break less (or almost none), then it'll be the best option.
Thanks, I figured it'd be like that. Since contextlib exists in 2.7, is contextlib2 meant to be legacy support only? ChrisA

On 16 November 2014 01:56, Chris Angelico <rosuav@gmail.com> wrote:
That's not quite how generators work. While the "returning from a generator is equivalent to raise StopIteration" model is close enough that it's functionally equivalent to the actual behaviour in most cases (with the main difference being in how try/except blocks and context managers inside the generator react), this particular PEP is a situation where it's important to have a clear picture of the underlying details. When you have a generator iterator (the thing you get back when calling a generator function), there are two key components:

* the generator iterator object itself
* the generator frame where the code is running

When you call next(gi), you're invoking the __next__ method on the *generator iterator*. It's that method which restarts evaluation of the generator frame at the point where it last left off, and interprets any results. Now, there are three things that can happen as a result of that frame evaluation:

1. It hits a yield point. In that case, gi.__next__ returns the yielded value.
2. It can return from the frame. In that case, gi.__next__ creates a *new* StopIteration instance (with an appropriate return value set) and raises it.
3. It can throw an exception. In that case, gi.__next__ just allows it to propagate out (including if it's StopIteration).

The following example illustrates the difference between cases 2 and 3 (in both cases, there's a StopIteration that terminates the hidden loop inside the list() call; the difference is in where that StopIteration is raised):
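(A minimal reconstruction of such an example; the behaviour shown is pre-PEP 479:)

    def returns_normally():
        yield 1
        # falling off the end: outcome 2, gi.__next__ raises a
        # *new* StopIteration instance

    def raises_explicitly():
        yield 1
        raise StopIteration  # outcome 3: this instance propagates as-is

    # From the caller's side the two are indistinguishable today:
    print(list(returns_normally()))    # [1]
    print(list(raises_explicitly()))   # [1] (pre-PEP 479)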
(The possible outcomes of gi.send() and gi.throw() are the same as those of next(gi). gi.throw() has the novel variant where the exception thrown in may propagate back out.) The two change proposals being discussed are as follows:

Current PEP (backwards incompatible): change outcome 3 to convert StopIteration to RuntimeError (or a new exception type). Nothing else changes.

Alternative (backwards compatible): change outcome 2 to raise GeneratorReturn instead of StopIteration, and outcome 3 to convert GeneratorReturn to StopIteration.

The alternative *doesn't* do anything about the odd discrepancy between comprehensions and generator expressions that started the previous thread. It just adds a new capability where code that knows it's specifically dealing with a generator (like contextlib or asyncio) can more easily tell the difference between outcomes 2 and 3.
All the magic is actually at the generator boundary regardless. The key differences between the two proposals are the decision to keep StopIteration as a common parent exception, and allow it to continue propagating out of generator frames unmodified.
I'd suggest using the exception chaining machinery and creating a new exception with __cause__ and the generator return value set appropriately.
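A rough sketch of that chaining (names hypothetical; the real conversion would happen in C at the generator boundary):

    class GeneratorReturn(StopIteration):
        pass

    def convert(exc):
        # exc is the GeneratorReturn that escaped the generator frame
        new = StopIteration(exc.value)
        new.__cause__ = exc   # keep the origin visible in tracebacks
        return new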
contextlib has actually been around since 2.5, but some features (most notably ExitStack) weren't added until much later. Like unittest2, contextlib2 allows access to newer stdlib features on older versions (I haven't used it as a testing ground for new ideas since ExitStack). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Sun, Nov 16, 2014 at 3:51 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Thank you for explaining. -- Cameron In case others were also oversimplifying in their heads, I've summarized the above into the PEP.
Should that variant affect this proposal? What should happen if you throw StopIteration or GeneratorReturn into a generator?
Text along these lines added to PEP, thanks!
Makes sense. If the __cause__ is noticed at all (ie this doesn't just quietly stop a loop), it wants to be very noisy.
If there is breakage from this, it would simply mean "older versions of contextlib2 are not compatible with Python 3.7, please upgrade your contextlib2" - several of the variants make it perfectly possible to write cross-version-compatible code. I would hope that this remains the case. Latest version of PEP text incorporating the above changes: https://raw.githubusercontent.com/Rosuav/GenStopIter/master/pep-0479.txt (My apologies if this email has gone through more than once. I'm having major issues with my internet connection at the moment, and delivery is failing and being retried. Hopefully it really *is* failing, and not just saying so.) ChrisA

Since this changes the behavior of an object instance, how can __future__ help? If the generator definition is in a library but the code that raises StopIteration to terminate it is passed in from the user's code, how is the user supposed to select the behavior they want? (This sounds to me like a similar problem to adding 'from __future__ import py3_string' to Py2, which we discussed a while ago. Happy to be shown that it isn't.) Cheers, Steve

Exactly! The current behavior is not only likely undesirable, but it is also undocumented. Even if parts of the stdlib rely on the current behavior, there's no need for a deprecation (read: __future__) period. Undocumented features may change at any time, because they are mostly about implementation quirks. (Isn't that rule documented somewhere in the Python docs?) In short:

-1 deprecation (__future__); no need, because nothing documented gets broken
+1 fix it now (3.5); the fix may be a change in the docs to validate the current behavior, and deprecate it (Yuk!)
+1 Nick's design, which kind of leaves it the same and kind of fixes it

p.s. What about 2.7? This fix is *not* a new feature. Cheers, -- Juanca

On Sun, Nov 16, 2014 at 8:00 AM, Juancarlo Añez <apalala@gmail.com> wrote:
I'm not sure about that. As Steven said, the current behaviour is simple:

1) When 'yield' is reached, a value is yielded.
2) When 'return' is reached, StopIteration is raised.
3) When an exception is raised, it is permitted to bubble up.

Whether that is *correct* or not is the point of this PEP, but it is at least simple, and while it may not be documented per se, changing it is likely to break code.
Maybe not, but let's get the proposal settled before figuring out how much deprecation period is needed.
That would be pretty much what happens if the PEP is rejected: the current behaviour will be effectively validated (at least to the extent of "it's not worth the breakage").
p.s. What about 2.7? This fix is *not* a new feature.
That ultimately depends on the release manager, but I would not aim this at 2.7. Nick's proposal introduces a new exception type, which I think cuts this out of 2.7 consideration right there; both active proposals involve distinct changes to behaviour. I believe both of them require *at a minimum* a feature release, and quite probably a deprecation period (although that part may be arguable, as mentioned above). ChrisA

On Sun, Nov 16, 2014 at 5:20 AM, Steve Dower <Steve.Dower@microsoft.com> wrote:
The behaviour selection would have to be based on the generator's definition. This proposal, in all its variants, is about what happens as the generator terminates; if you call on someone else's generator, and that someone hasn't applied the __future__ directive, you'll be in the current situation of not being able to distinguish 'return' from 'raise StopIteration'. But for your own generators, you can guarantee that they're distinct. ChrisA

On Mon, Nov 17, 2014 at 11:05:01AM +1300, Greg Ewing wrote:
I don't see how that is different from any other __future__ directive. They are all per-module, and if you gain access to an object from another module, it will behave as specified in the module that created it, not the module that imported it. How is this different? -- Steven

On Mon, Nov 17, 2014 at 11:03 AM, Steven D'Aprano <steve@pearwood.info> wrote:
Well, let's see. For feature in sorted(__future__.all_feature_names):

- absolute_import: affects implementation of a keyword
- barry_as_FLUFL: not entirely sure what this one actually accomplishes. :)
- division: changes the meaning of one operator
- generators: introduces a keyword
- nested_scopes: alters the compilation of source to byte-code(?)
- print_function: removes a keyword
- unicode_literals: alters the type used for literals
- with_statement: introduces a keyword

Apart from the joke, it seems that every __future__ directive is there to affect the compilation, not execution, of its module: that is, once a module has been compiled to .pyc, it shouldn't matter whether it used __future__ or not. Regardless of unicode_literals, you can create bytes literals with b'asdf' and unicode literals with u'asdf'. I'm not entirely sure about division (can you call on true division without the future directive?), but in any case, it's all done at compilation time, as can be seen interactively:

    Python 2.7.3 (default, Mar 13 2014, 11:03:55)
    [GCC 4.7.2] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
So to make this consistent with all other __future__ directives, there would need to be some kind of safe way to define this: perhaps an attribute on the generator object. Something like this:
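(Something like this, perhaps -- a hedged guess; the attribute name is purely illustrative:)

    def gen():
        yield 1

    gen.__replace_stopiteration__ = True   # hypothetical marker attribute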
The attribute on the function would be what affects behaviour; the __future__ directive applies that attribute to all generator functions in its module (including genexprs). Once the __future__ directive becomes automatic, the attribute can and will be dropped - any code which interrogates it MUST be prepared to stop interrogating it once the feature applies to all modules. Does that sound reasonable? Should it be added to the PEP? ChrisA

(I'm catching up on this thread from the end.) On Sun, Nov 16, 2014 at 5:29 PM, Chris Angelico <rosuav@gmail.com> wrote:
I agree with you and Steven that this is a fine use of __future__. What a generator does with a StopIteration that is about to bubble out of its frame is up to that generator. I don't think it needs to be a flag on the *function* though -- IMO it should be a flag on the code object. (And the flag should somehow be transferred to the stack frame when the function is executed, so the right action can be taken when an exception is about to bubble out of that frame.) One other small point: let's change the PEP to just propose RuntimeError, and move the "some other exception" to the "rejected ideas" section. -- --Guido van Rossum (python.org/~guido)

On Mon, Nov 17, 2014 at 12:58 PM, Guido van Rossum <guido@python.org> wrote:
Changes incorporated, thanks! I'm not familiar with the details of stack frame handling, so I've taken the cop-out approach and just quoted you directly into the PEP. PEP draft: https://raw.githubusercontent.com/Rosuav/GenStopIter/master/pep-0479.txt GitHub hosted repo, if you want to follow changes etc: https://github.com/Rosuav/GenStopIter ChrisA

On Mon, Nov 17, 2014 at 8:26 PM, Georg Brandl <g.brandl@gmx.net> wrote:
Thanks Georg! This means today's version is now visible here: http://legacy.python.org/dev/peps/pep-0479/ ChrisA

On Mon, Nov 17, 2014 at 2:27 AM, Chris Angelico <rosuav@gmail.com> wrote:
Off-topic: the new python.org site now supports PEPs, so please switch to URLs like this: https://www.python.org/dev/peps/pep-0479/ (if you don't like the formatting send a pull request to https://github.com/python/pythondotorg). -- --Guido van Rossum (python.org/~guido)

On Tue, Nov 18, 2014 at 4:28 AM, Guido van Rossum <guido@python.org> wrote:
Oh, nice. Google searches for PEPs still find legacy rather than that, so it may be worth setting legacy to redirect to www. Formatting looks fine, except that the subheadings look bolder than the main; I'll check for issues and post one, though probably not a PR. ChrisA

On 17/11/2014 2:29 p.m., Chris Angelico wrote:
barry_as_FLUFL: Not entirely sure what this one actually accomplishes. :)
It determines whether "not equal" is spelled "!=" or "<>", so it fits the pattern of being compile-time-only. -- Greg

On 17 November 2014 13:34, Chris Angelico <rosuav@gmail.com> wrote:
True division (in Python 2) is a nice simple one to look at, since it just swaps one bytecode for another (BINARY_DIVIDE -> BINARY_TRUE_DIVIDE)
The compiler actually stores a whole pile of useful info on code objects that doesn't show up in the disassembly output (switching to Python 3 for more up to date dis module goodness):
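For instance (a small illustration; exact flag names vary across versions):

    import dis

    def g():
        yield 1

    dis.show_code(g)
    # The output includes a "Flags:" line, something like:
    #     Flags: OPTIMIZED, NEWLOCALS, GENERATOR, NOFREE
    # A __future__ import could simply set one more such compiler flag.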
So conveying to the generator iterator whether or not "from __future__ import generator_return" was in effect would just be a matter of the compiler setting a new flag on the generator code object. For *affected generators* (i.e. those defined in a module where the new future statement was in effect), StopIteration escaping would be considered a RuntimeError. For almost all code, such RuntimeErrors would look like any other RuntimeError raised by a broken generator implementation. The only code which would *have* to change immediately as a "Porting to Python 3.5" requirement is code like that in contextlib, which throws StopIteration into generators, and currently expects to get it back out unmodified. Such code will need to be updated to also handle RuntimeError instances where the direct cause is the StopIteration exception that was thrown in. Other affected code (such as the "next() bubbling up" groupby example) would keep working unless the __future__ statement was in effect. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Mon, Nov 17, 2014 at 4:56 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Thanks, this is exactly what I was thinking of. The new flag could be named REPLACE_STOPITERATION. Then the __future__ import could be named replace_stopiteration_in_generators (it needs more description than the flag name because the flag is already defined in the context of a generator, while the __future__ import must still establish that context). -- --Guido van Rossum (python.org/~guido)

On Sun, Nov 16, 2014 at 11:49 AM, Rob Cliffe <rob.cliffe@btinternet.com> wrote:
Agreed. And agreed on the analysis; I can't add examples till I know for sure what I'm adding examples _of_. The latest edit expanded on the details of the proposals, so it now may be possible to consider examples, but possibly we're still bikeshedding the nature of the proposals themselves. Correction. We're DEFINITELY still bikeshedding etc etc, but possibly we're still doing so to the extent that it's not worth adding examples yet. :) ChrisA

On Sat, Nov 15, 2014 at 1:29 AM, Chris Angelico <rosuav@gmail.com> wrote:
Specifically, it's absorbed by the caller of the generator, because the caller doesn't know the difference between next(x) raising StopIteration because the iterator specifically wants to stop, vs. by accident. As another alternative, how about a new iterator protocol that is defined without this ambiguity? Code at the bottom of my post to help explain: define a new method __nextx__ which doesn't use StopIteration for any signalling; instead, it returns None if there are no values to return, and returns a special value Some(v) if it wants to return a value v. Both next(it) and nextx(it) are made to work for any iterator that is defined using either protocol, but for loops and Python builtins all use nextx internally. Generators define __next__ unless you from __future__ import iterators, in which case they define __nextx__ instead. In this way, old code can't tell the difference between accidental StopIteration and deliberate StopIteration, but new code (using nextx instead of next, and using __future__-imported generators) can. No backwards incompatibility is introduced, and you can still insert StopIteration into a generator and get it back out -- using both next() where it is ambiguous and nextx() where it is not. Yes, it's ugly to have two different iterator protocols, but not that ugly. In fact, this would be Python's third (I have omitted that third protocol in the example below, for the sake of clarity). I find the proposed solution more scary, in that it's sort of a "hack" to get around an old mistake, rather than a correction to that mistake, and it introduces complexity that can't be removed in principle. (Also, it's very unusual.)

    class Some:
        def __init__(self, value):
            self.value = value

    def next(it):
        v = nextx(it)
        if v is None:
            raise StopIteration
        return v.value

    def nextx(it):
        if hasattr(it, '__nextx__'):
            v = it.__nextx__()
            if v is None or isinstance(v, Some):
                return v
            raise TypeError("__nextx__ must return Some(...) or None, not %r" % (v,))
        if hasattr(it, '__next__'):
            try:
                return Some(it.__next__())
            except StopIteration:
                return None
        raise TypeError

-- Devin

On Tue, Nov 18, 2014 at 12:50 AM, Devin Jeanpierre <jeanpierreda@gmail.com> wrote:
I had actually contemplated adding a "what if __next__ returned a sentinel instead of raising an exception" possibility to the PEP, if only for completeness. Since someone else has suggested it too now, it may be worth doing. Rather than a wrapper around every returned value, what I'd be inclined toward is a special sentinel that otherwise cannot be returned. This could be a dedicated, arbitrary object, or something globally unique, or something locally unique. One option that comes to mind is to have the generator return _itself_ to signal that it's returned. I don't think this option will be better than the current front runners, but would you like me to add it for completeness? The biggest downside is that it might give a false positive; you can't, for instance, have an iterator "all_objects()" which returns, like the name says, every object currently known to Python. (I don't know that CPython is capable of implementing that, but there's no reason another Python couldn't, and it might be useful.) I expect that's why the exception system was used instead; can anyone confirm that? ChrisA

I don't want to contemplate a new __next__ protocol. The existing protocol was carefully designed and tuned to have minimal memory overhead (even to the point where the exception instance returned may be reused). Wrapping each result would just result in an extra allocation + deallocation per iteration (unless you can play games with reference counts or do something else to complicate the semantics). Introducing __nextx__ would require thousands of libraries implementing this to incur churn as they feel the pressure to switch to the new protocol, and the compatibility issue would be felt everywhere. The problem we're trying to fix is unique to generators (thereby also implicating generator expressions). On Mon, Nov 17, 2014 at 6:04 AM, Chris Angelico <rosuav@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido)

On Mon, Nov 17, 2014 at 9:40 AM, Guido van Rossum <guido@python.org> wrote:
This sounds totally reasonable to me.
The problem we're trying to fix is unique to generators (thereby also implicating generator expressions).
I suppose since you're only fixing generators, then that is literally the only problem you are trying to fix, but it is more general than that. I have encountered this sort of problem writing __next__ by hand in Python -- that is, that bugs inside code I call result in silent control flow changes rather than a visible exception. -- Devin

On Mon, Nov 17, 2014 at 9:53 AM, Devin Jeanpierre <jeanpierreda@gmail.com> wrote: [Guido]
I assume this is something where the __next__() method on your iterator class calls next() on some other iterator and accidentally doesn't catch the StopIteration coming out of it (or, more likely, this happens several calls deep, making it more interesting to debug). That particular problem is not unique to __next__ and StopIteration -- the same thing can (and does!) happen with __getitem__ and KeyError or IndexError, and with __getattr[ibute]__ and AttributeError. In all these cases I think there isn't much we can do apart from adding lint rules. If you are writing __next__ as a method on an iterator class, one way or another you are going to have to raise StopIteration when there isn't another element, and similarly __getitem__ has to raise KeyError or IndexError, etc. In the generator case, we have a better way to signal the end -- a return statement (or falling off the end). And that's why we can even contemplate doing something different when StopIteration is raised in the generator. -- --Guido van Rossum (python.org/~guido)

On Tue, Nov 18, 2014 at 4:53 AM, Devin Jeanpierre <jeanpierreda@gmail.com> wrote:
If you're writing __next__ by hand, there's nothing anyone else can do about a bubbled-up StopIteration. What you can do is wrap your code in try/except:

    def __next__(self):
        try:
            # whatever
        except StopIteration:
            raise RuntimeError
        raise StopIteration

If your "whatever" section returns a value, that's what the result will be. If it fails to return a value, StopIteration will be raised. And if StopIteration is raised, it'll become RuntimeError. But this has to be inside __next__. This can't be done externally. (Note, untested code. May have bugs.) ChrisA

On Tue, Nov 18, 2014 at 4:40 AM, Guido van Rossum <guido@python.org> wrote:
Wrapping each result would just result in an extra allocation + deallocation per iteration...
Which is why I would be more inclined to use a sentinel of some sort... but that has its own problems. There's no perfect solution, so status quo wins unless a really compelling case can be made. I could toss something into the Alternate Proposals section, but I wouldn't be personally supporting it. ChrisA

On Mon, Nov 17, 2014 at 10:09 AM, Chris Angelico <rosuav@gmail.com> wrote:
Trust me, we went down this rabbit hole when we designed the iterator protocol. The problem is that that sentinel object must have a name (otherwise how would you know when you had seen the sentinel), which means that there is at least one dict (the namespace defining that name, probably the builtins module) that has the sentinel as one of its values, which means that iterating over that particular dict's values would see the sentinel as a legitimate value (and terminate prematurely). You could fix this by allocating a unique sentinel for every iteration, but there would be some additional overhead for that too (the caller and the iterator both need to hold on to the sentinel object). In any case, as I tried to say before, redesigning the iterator protocol is most definitely out of scope here (you could write a separate PEP and I'd reject it instantly). -- --Guido van Rossum (python.org/~guido)
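A quick sketch of that failure mode, assuming the sentinel lives in an ordinary namespace:

    DONE = object()   # module-level sentinel

    namespace = {"x": 1, "DONE": DONE, "y": 2}

    for value in namespace.values():
        if value is DONE:   # a legitimate value looks like exhaustion
            break
        print(value)
    # Iterating this dict's values can stop early, because the
    # sentinel is itself one of the values.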

On Tue, Nov 18, 2014 at 5:17 AM, Guido van Rossum <guido@python.org> wrote:
That's a much more solid argument against it than I had. (Also amusing to try to contemplate.) The iterator protocol absolutely demands a completely out-of-band means of signalling "I have nothing to return now".
That's a PEP I'll leave for someone else to write. :) ChrisA

On 11/17/2014 10:23 AM, Chris Angelico wrote:
On the other hand, if you did write it (now), and Guido rejected it (of course), then that would mean PEP 479 is sure to be accepted! Third time's the charm! ;) -- ~Ethan~

FWIW, I spent some time this morning close-reading the PEP, and made a somewhat significant set of updates -- adding more specifics about the proposed new __future__ statement and the new code object flag, tracking down a few more examples of code that would be affected, and some other minor edits. Here's the diff: https://hg.python.org/peps/rev/8de949863677 Hopefully the new version will soon be here: https://www.python.org/dev/peps/pep-0479 Note that I am definitely not yet deciding on this PEP. I would love it if people sent in examples of code using generator expressions that would be affected by this change (either by highlighting a bug in the code or by breaking what currently works). If this PEP gets rejected, we could resurrect the GeneratorReturn proposal currently listed as an alternative -- although the more I think about that, the less I think it's worth it, except for the very specific case of asyncio (thinking of which, I should add something to the PEP about that too). -- --Guido van Rossum (python.org/~guido)

On Mon, Nov 17, 2014 at 11:38:27AM -0800, Guido van Rossum wrote:
Over a week ago I raised this issue on the python-list mailing list. I expected a storm of bike-shedding, because that's the sort of place p-l is :-) but got just two people commenting. The thread, for anyone interested: https://mail.python.org/pipermail/python-list/2014-November/680757.html One response suggested that it is not generators which do the wrong thing, but comprehensions, and that comprehensions should be changed to behave like generators: https://mail.python.org/pipermail/python-list/2014-November/680758.html That should probably be put in the PEP; even if it is not an option being considered, it is at least evidence that "some people" find the behaviour of generators more natural than that of comprehensions. -- Steve

Nick, I think we've gone through enough clarifications of the PEP now to be clear on the proposal. I saw in one of your earliest replies (right after Chris posted his first draft) that you're hesitant to support the PEP because of what would have to change to contextlib. What I couldn't quite read is whether you think that the proposal by itself is not an improvement, or whether you're just worried about compatibility. Apparently you know of a large group of users who use an older 3rd party version of contextlib, and for whom that older, 3rd party contextlib should keep working with future versions of Python 3 without updating their version of contextlib -- did I get that right? What exactly is the constraint there that makes their version of contextlib immutable even though the version of Python they are using may move forward? Separate from this special case, I am also worried about backward compatibility, and I have yet to get a good idea for how widespread code is that depends on StopIteration bubbling out from generators. I also don't have a good idea how often this issue bites users, but I have a feeling it does bite. E.g. this quote from c.l.py ( https://mail.python.org/pipermail/python-list/2014-November/680775.html): """ I did find it annoying occasionally that raising StopIteration inside a generator expression conveys a different behavior than elsewhere. It did take me quite a while to understand why that is so, but after that it did not cause me much of a headache anymore. """ -- --Guido van Rossum (python.org/~guido)

On 11/18/2014 05:50 AM, Guido van Rossum wrote:
I just remembered one use of the current behavior. Two years ago or so, I was suggesting on this list a possibility for early termination of comprehensions when a particular value is encountered. In other words, an equivalent to:

    l = []
    for x in seq:
        if x == y:
            break
        l.append(x)

At the time, somebody suggested (roughly):

    def stop():
        raise StopIteration

    l = list(x for x in seq if x != y or stop())

which, for the very reasons discussed in this thread, works only as a generator expression and not in comprehension form. I used this solution in some not particularly important piece of code, so I wouldn't despair if it wasn't compatible with the next release of the language. Also, I have a feeling that some of you may consider this sort of a hack in the first place. Just thought I'd mention it here for completeness. Wolfgang

On 11/18/2014 12:40 PM, Wolfgang Maier wrote:
I believe I thought then that one should write the explicit loop rather than overload the 'comprehension' concept.
If stop is defined in another file, such as 'utility', this is a bit nasty. A new maintainer comes along and changes that to a list comprehension, or perhaps decides a set rather than a list is needed, and changes it to a set comprehension instead of set() call and bingo!, a bug. Or someone imitates the pattern, but with [] instead of list.
which, for the very reasons discussed in this thread, works only as a generator expression and not in comprehension form.
With this example, where the StopIteration source could be much more obscure than next(), I now understand Guido's concern about hard-to-understand bugs. From a maintainability view, it should not matter if one calls a function on a naked comprehension (making it a genexp) or uses the corresponding comprehension form directly.
-- Terry Jan Reedy

On Wed, Nov 19, 2014 at 9:54 AM, Terry Reedy <tjreedy@udel.edu> wrote:
I'm not sure about that. Comprehensions can already be filtered; is it such a jump from there to a "filter" that aborts on a certain condition? It may not be language-supported, but I don't see that it's illogical; and any use of a loop that appends to a list is rightly considered code smell. ChrisA

On Wed, Nov 19, 2014 at 10:08:29AM +1100, Chris Angelico wrote:
It certainly isn't. It's an obvious extension to the concept: terminate the loop rather than filter it. At least two languages support early termination: http://clojuredocs.org/clojure_core/clojure.core/for http://docs.racket-lang.org/guide/for.html and it keeps getting asked for: http://www.reddit.com/r/Python/comments/ciec3/is_there_anything_like_a_list_... http://stackoverflow.com/questions/5505891/using-while-in-list-comprehension... http://stackoverflow.com/questions/16931214/short-circuiting-list-comprehens... https://www.daniweb.com/software-development/python/threads/293381/break-a-l... https://mail.python.org/pipermail/python-ideas/2014-February/026036.html https://mail.python.org/pipermail/python-ideas/2013-January/018969.html There's a rejected PEP: https://www.python.org/dev/peps/pep-3142/ and alternative solutions (write an explicit generator function, use itertools.takewhile). So there's obviously a need for this sort of thing, and (expr for x in iterable if cond() or stop()) seems to be a common solution. I'm not sure if that's a neat trick or a deplorable hack :-) but either way this PEP will break code using it.
and any use of a loop that appends to a list is rightly considered code smell.
I'm afraid I don't understand that comment. Why is appending to a list inside a loop a code smell? That's exactly what list comps do. -- Steven
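As a sketch of the itertools.takewhile alternative mentioned above (illustrative values):

    from itertools import takewhile

    seq = [1, 2, 3, 99, 4, 5]
    y = 99

    # Stop at the first occurrence of y, with no StopIteration trickery:
    print(list(takewhile(lambda x: x != y, seq)))   # [1, 2, 3]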

On Wed, Nov 19, 2014 at 12:15 PM, Steven D'Aprano <steve@pearwood.info> wrote:
That's precisely why. If I write code like this:

    l = []
    for i in something:
        l.append(func(i))

then I should rework it into a comprehension. Having a filter doesn't change that:

    l = []
    for i in something:
        if i:
            l.append(func(i))

That's still possible with a list comp, and should be rewritten as one. But having a break in there *does* change it, because there's no way in the language to do that. The question is: Is it better to abuse StopIteration or to turn the list comp back into an explicit loop? And if anyone chose the former, their code will break. ChrisA

On Tue, Nov 18, 2014 at 6:01 PM, Chris Angelico <rosuav@gmail.com> wrote:
Not everything you do with an explicit loop can be done with a comprehension, and that's by design. Comprehensions should be easier to reason about than code using for-loops. And generator expressions should work the same way, except for producing results in a lazy fashion. The StopIteration hack breaks this equivalence and hampers the ability to reason, since you can't tell whether a predicate might raise StopIteration. It was never my intention that generator expressions behaved this way -- it was an accidental feature that surprised me when it was first shown to me, and I've never gotten used to it. (And I don't care whether you say it is "obvious", call it "stop()", and only use it in an "idiomatic" fashion -- it's still a surprise for anyone who has to debug code involving it.) The only thing standing in the way of fixing this is the recognition that there may be a fair amount of code out there that depends on this hack, and which will have to be rewritten. -- --Guido van Rossum (python.org/~guido)
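(A minimal sketch of the accidental divergence described above, under the current, pre-PEP semantics:)

    def stop():
        raise StopIteration

    seq = [1, 2, 0, 3]

    # generator expression: list() absorbs the StopIteration -> [1, 2]
    print(list(x for x in seq if x or stop()))

    # list comprehension: the StopIteration escapes with a noisy traceback
    print([x for x in seq if x or stop()])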

On Wed, Nov 19, 2014 at 1:56 PM, Guido van Rossum <guido@python.org> wrote:
Has anyone come across any more non-trivial examples? We have contextlib (in the standard library) and contextlib2 (third-party), plus a number of StackOverflow posts and such. Are there any other known pieces of code that would be seriously hampered by this change? ChrisA

On Tue, Nov 18, 2014 at 7:37 PM, Chris Angelico <rosuav@gmail.com> wrote:
One possible way to find out would be to write a simple version of a patch (maybe one that doesn't use __future__ but just always converts StopIteration to RuntimeError when it is bubbling out of a generator frame) and run the stdlib tests, then see how many tests this breaks. (I understand if you don't want to write it. But maybe someone does. :-) -- --Guido van Rossum (python.org/~guido)

On Wed, Nov 19, 2014 at 3:22 PM, Guido van Rossum <guido@python.org> wrote:
I poked around a bit in the code and managed to come up with this. It doesn't chain the previous exception, so the traceback is a little scanty, but it does turn a StopIteration into a RuntimeError. (It might also leak the original StopIteration. I'm not sure.) Prior to this patch, I had 377 of 390 tests passing flawlessly and no failures (just skips and warnings); with this applied, six failures.

diff -r 23ab1197df0b Objects/genobject.c
--- a/Objects/genobject.c       Wed Nov 19 13:21:40 2014 +0200
+++ b/Objects/genobject.c       Thu Nov 20 13:43:44 2014 +1100
@@ -130,6 +130,14 @@
         }
         Py_CLEAR(result);
     }
+    else if (!result)
+    {
+        if (PyErr_ExceptionMatches(PyExc_StopIteration))
+        {
+            PyErr_SetString(PyExc_RuntimeError,
+                "generator raised StopIteration");
+        }
+    }

     if (!result || f->f_stacktop == NULL) {
         /* generator can't be rerun, so release the frame */

However, I'm not sure about setting the context. In errors.c is a function _PyErr_ChainExceptions which appears to do a similar job, so I imitated its code. Here's the result:

    else if (!result)
    {
        if (PyErr_ExceptionMatches(PyExc_StopIteration))
        {
            PyObject *exc, *val, *val2, *tb;
            PyErr_Fetch(&exc, &val, &tb);
            PyErr_NormalizeException(&exc, &val, &tb);
            Py_DECREF(exc);
            Py_XDECREF(tb);
            PyErr_SetString(PyExc_RuntimeError,
                "generator raised StopIteration");
            PyErr_Fetch(&exc, &val2, &tb);
            PyErr_NormalizeException(&exc, &val2, &tb);
            PyException_SetContext(val2, val);
            PyErr_Restore(exc, val2, tb);
        }
    }

The context is being set, but without a traceback.

    #############
    def good_gen():
        yield 1
        return 2

    def evil_gen():
        yield 1
        raise StopIteration(2)

    # In absence of PEP 479 changes, the above two should be
    # virtually indistinguishable.

    print("Starting.")
    good = tuple(good_gen())
    print("Good:", good, good == (1,))
    evil = tuple(evil_gen())
    print("Evil:", evil, evil == (1,))
    #############

    Starting.
    Good: (1,) True
    StopIteration: 2

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "../test_pep0479.py", line 14, in <module>
        evil = tuple(evil_gen())
    RuntimeError: generator raised StopIteration

What am I missing here? Do I need to force something to construct a full traceback before it can show the line number that actually raised StopIteration? ChrisA

On Thu, Nov 20, 2014 at 2:44 PM, Chris Angelico <rosuav@gmail.com> wrote:
With the attached demo patch, all tests pass except test_generators, which explicitly tests stuff about the correlation between return and StopIteration. There's the contextlib changes, a couple of places that were raising StopIteration and should be returning, and a couple that were letting StopIteration bubble and now need to catch it and return. I've deliberately not followed PEP 8 here, in the interests of minimizing diff size; in several cases, blocks of code ought to be indented a level, but I cheated and created a half-indentation to show how little actually changes. If anyone would care to try this on their own codebases, that'd be helpful. ChrisA

On 11/17/2014 08:50 PM, Guido van Rossum wrote:
One argument for making the change*: When we're writing __next__, or __getattr__, etc., it is obvious that we are playing with internals and have to be extra careful of what other exceptions might be raised in that code. Contrariwise, the only indication of something special about a generator is the presence of the yield keyword -- for ordinary use (such as in for loops) it doesn't matter whether the called function returns a list, tuple, iterator, generator, or whatever, as long as it can be iterated over, and so when writing a generator, or converting an iterable-returning function into a generator, there's nothing obvious saying, "Hey! Watch out for a StopIteration somewhere else in this block of code!" * I make no statement as to how strong this argument is, but there you have it. :) -- ~Ethan~

On 18 November 2014 14:50, Guido van Rossum <guido@python.org> wrote:
I think it's an improvement - I really like the fact that it brings generators into line with your reasoning in the with statement PEP that flow control constructs should be locally visible. At the moment, "raise StopIteration" in a generator context is effectively non-local flow control, as it means any function call (explicit or implicit) or yield point may gracefully stop generator execution, rather than only return statements. With the "specific instance" variant, you could at least check whether an escaping StopIteration is the StopIteration instance you threw in. You can construct scenarios where such a check will give a false positive, but they're getting seriously contrived at that point. That's obscure enough that I think it's on par with other behavioural tweaks we've included in the "Porting to Python X.Y" guides in the past.
I don't even remember how that came up now, but it was entirely hypothetical, and I think it can be ignored as a concern. As you say, being able to update to Python 3.5 without also being able to update to a new version of contextlib2 would just be weird. Even if such a strange scenario did somehow come up, it would still be possible to switch to a conditional import where they used the stdlib version if available, and only fell back to contextlib2 on earlier versions of Python.
One advantage of needing a __future__ import on the generator author side is that you always have the option of changing your mind, and *never* making the new behaviour the default. That wouldn't be a *good* outcome, but I don't think it would be intolerable. OTOH, I'm also not sure the status quo is sufficiently problematic to be worth changing. Yes, it's a little weird, but is it *that* much weirder than the unavoidable issues with exceptions thrown in __next__, __getitem__, __getattr__ and other special methods where a particular type of exception is handled directly by the interpreter? Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Thu, Nov 20, 2014 at 3:03 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
If you write __next__, you write in a "raise StopIteration" when it's done. If you write __getattr__, you write in "raise AttributeError" if the attribute shouldn't exist. Those are sufficiently explicit that it should be reasonably clear that the exception is the key. But when you write a generator, you don't explicitly raise:

    def gen():
        yield 1
        yield 2
        yield 3
        return 4

The distinction in __next__ is between returning something and raising something. The distinction in a generator is between "yield" and "return". Why should a generator author have to be concerned about one particular exception having magical meaning? Imagine this scenario:

    def producer():
        """Return user input, or raise KeyboardInterrupt"""
        return input("Enter the next string: ")

    def consumer():
        """Process the user's input"""
        while True:
            try:
                command = producer()
            except KeyboardInterrupt:
                break
            dispatch(command)

Okay, now let's make a mock producer:

    strings = ["do stuff", "do more stuff", "blah blah"]

    def mock_producer():
        if strings:
            return strings.pop(0)
        raise KeyboardInterrupt

That's how __next__ works, only with a different exception, and I think people would agree that this is NOT a good use of KeyboardInterrupt. If you put a few extra layers in between the producer and consumer, you'd be extremely surprised that an unexpected KeyboardInterrupt just quietly terminated a loop. Yet this is exactly what the generator-and-for-loop model creates: a situation in which StopIteration, despite not being seen at either end of the code, now has magical properties. Without the generator, *only* __next__ has this effect, and that's exactly where it's documented to be. Does that make for more justification? Unexpected exceptions bubbling up is better than unexpected exceptions quietly terminating loops? ChrisA

On 20 November 2014 02:24, Chris Angelico <rosuav@gmail.com> wrote:
Does that make for more justification? Unexpected exceptions bubbling up is better than unexpected exceptions quietly terminating loops?
The part I found most compelling was when you pointed out that in the special method implementations, the normal return path was always spelled with "return", while the "value missing" result was indicated with a special kind of exception (StopIteration, AttributeError, IndexError or KeyError), and then any other exception was considered unexpected. Generators add a third notion of being able to suspend execution via "yield", which then left them with two different ways of spelling termination inside the frame: "return" OR "raise StopIteration". The second spelling ("raise StopIteration") is then inherently surprising, as it's entirely redundant, *except* in that it allows you to effectively have a "hidden return" in a generator frame that can't be done anywhere else. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
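(To illustrate the redundant spelling: under current semantics both generators below behave identically; a minimal sketch:)

    def g1():
        yield 1
        return                   # the normal spelling of "I'm done"

    def g2():
        yield 1
        raise StopIteration      # the redundant "hidden return" spelling

    print(list(g1()))  # [1]
    print(list(g2()))  # [1] today; RuntimeError under the proposal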

On Thu, Nov 20, 2014 at 02:45:27AM +1000, Nick Coghlan wrote:
I'm not sure that many people outside of this and the python-dev mailing lists would find the use of "raise StopIteration" surprising. Rather, I expect that they will find the use of an explicit "return" inside a generator surprising. People are informally taught that generators use yield *instead of* return, so seeing both in the same function is a surprise. (Most generators quietly fall out the bottom with no explicit end.) I don't claim that doing so is Pythonic or even good practice, but I am sure that there are a lot of people who believe that raising StopIteration to exit a generator is (1) supported and (2) preferred. Examples of code in the wild using StopIteration to exit:

http://code.openhub.net/file?fid=ezlejSoT2q7PWrhgNkpdU55MWOA&cid=jVcYOxnQhvU&s=raise%20StopIteration&fp=301369&mp&projSelected=true#L0
http://code.openhub.net/file?fid=M0gWWCpn-avqHO_jnsYcG2T81lg&cid=VKn_M0_GgKM&s=raise%20StopIteration&fp=301283&mp&projSelected=true#L0
http://code.openhub.net/file?fid=pDrrTI8lyh0LO_6rTCk9npC96SE&cid=Y8jg8v1AyqU&s=raise%20StopIteration&fp=41191&mp&projSelected=true#L0
http://code.openhub.net/file?fid=PTjGrE_5rOhyZhL1CUrPBtRk7n8&cid=tWtPpAs4E1g&s=raise%20StopIteration&fp=210789&mp&projSelected=true#L0
http://code.openhub.net/file?fid=WzkucGktJhjsP8cj4BO6Wcnbx-0&cid=fsj7E8vdVMA&s=raise%20StopIteration&fp=401086&mp&projSelected=true#L0
http://stackoverflow.com/questions/6784934/python-yield-and-stopiteration-in...
http://stackoverflow.com/questions/14183803/in-pythons-generators-what-is-th...

That last example not only uses raise to exit the generator, but the author actually guesses that it is the more Pythonic way to do so. Here is a description of the generator protocol which could easily lead the reader to conclude that raising StopIteration is the correct way to exit a generator:

    To support this protocol, functions with yield statement are compiled
    specially as generators. They return a generator object when they are
    called. The returned object supports the iteration interface with an
    automatically created __next__() method to resume execution. Generator
    functions may have a return simply terminates the generation of values
    by raising a StopIteration exceptions after any normal function exit.

    http://www.bogotobogo.com/python/python_generators.php

At this point, I'm convinced that there is a good argument for a __future__ import changing this behaviour. But I suspect that making this the default behaviour in the future will break a lot of code. -- Steven

On Thu, Nov 20, 2014 at 11:25 AM, Steven D'Aprano <steve@pearwood.info> wrote:
Interesting. But "yield instead of return" doesn't automatically say "and then raise StopIteration to early-abort"; I'd say the informal description is fine, it just needs to be modified differently once people actually want an early abort. ("You can still use 'return' for its other purpose, terminating a function before reaching the end.")
Examples of code in the wild using StopIteration to exit:
Trivially unnecessary, and as soon as there's a bug report, the "What's New In 3.7" page will explain that it needs to be removed.
I have no idea what this one is doing, but it looks like it's half way to what's wanted here. Catch the exception and deal with it... this proposal just means the "deal with it" part needs to be reworded into a return statement. All it needs is for "What's New in 3.5" to recommend use of 'return' instead of 'raise StopIteration', and all these cases will be easily compatible with all [1] versions of Python.
http://stackoverflow.com/questions/6784934/python-yield-and-stopiteration-in...
The accepted answer correctly advises the function be written to simply return. This will work fine. The other answer has a local "raise StopIteration", which can be translated into a simple "return".
The question's author does, but the accepted answer recommends "return". This may result in the odd question here or there, but it's not a major problem. Any time a generator has "raise StopIteration" in its own body, it can simply become "return". That's easy. The issue comes up when it's not raising that itself, but is letting it bubble up - maybe from a next() call.

    def takewhiletrue(iter):
        while True: # coincidental with the function name
            #try:
            val = next(iter)
            #except StopIteration: return
            if not val: break
            yield val

This won't come up in a simple search for "raise StopIteration", and if you have something like this where the condition is almost always going to be reached eventually, you might not notice the problem for a long time. How would you know to add the commented-out lines? What kind of code search would you use to detect this?
Even that does recommend 'return'. If anyone reads that, writes "raise StopIteration", sees code bombing with RuntimeError, and then comes to python-list, we can explain that the recommended method is "return". I have no problem with this. There are plenty of much-worse practices that people pick up - mixing bytes and text, using backslashes in Windows path names without doubling them or using raw literals, etc, etc, etc. In lots of cases they'll seem to work ("C:\Program Files\New Stuff\Testing" will work, until you lower-case the name), but when they break, you just have to fix them. This wouldn't be the first Python minor version to tighten up requirements to remove bug magnets.
I suspect that a huge percentage of the code so broken can be trivially fixed just by search/replacing "raise StopIteration" with "return". There'll be only a very few where the StopIteration is raised from some other function and needs to be caught and dealt with - and fixing those is just as likely to reveal bugs needing fixing. ChrisA

On Thu, Nov 20, 2014 at 12:34:08PM +1100, Chris Angelico wrote:
On Thu, Nov 20, 2014 at 11:25 AM, Steven D'Aprano <steve@pearwood.info> wrote:
The point isn't that it is easy to fix. I'm sure that there will be cases of code that are not easy to fix. The point is that we're breaking working code and causing code churn. We're not fixing a bug. We're changing behaviour people rely on. That ought to make us more conservative about breaking their code.
This may result in the odd question here or there, but it's not a major problem.
And neither is the existing behaviour. We're weighing up whether the small benefit in fixing this wart is worth the pain. The PEP isn't approved yet, and right from the beginning Guido said that he feared that fixing this might be too disruptive. I'm trying to get a feel for how disruptive it will be. I did a quick and informal survey of the developers I work with. The dreaded lurgy has gone through our office, so most of them are away ill, but of those still here (all two of them) one of them couldn't remember whether you exit a generator with "yield nil" or "return nil" (he's a Ruby and Objective-C guy when we're not paying him to write Python) and the other one said that the whole problem is that generators exist in the first place, Python should get rid of them and allow people to define their own using macros (he likes to think of himself as a Lisp and Scheme guru :-) Make of that what you will. -- Steven

On Thu, Nov 20, 2014 at 4:30 PM, Steven D'Aprano <steve@pearwood.info> wrote:
There's a language that lets you define anything you like. It's called "file on disk". If you don't like how it runs, you just put a little shebang at the top and the whole rest of the file is interpreted differently... On one side of the balance is code breakage. On the other side is currently broken code where bugs will be found. Which is the stronger argument? I'm inclined toward the latter, but neither has a huge body of code to back it. ChrisA

On 11/19/2014 04:25 PM, Steven D'Aprano wrote:
We are not, however, responsible for third-party documentation.
Isn't that the case with every __future__ directive that becomes the standard? Folks have an entire minor release to make the adjustment. -- ~Ethan~

On Wed, Nov 19, 2014 at 08:19:41PM -0800, Ethan Furman wrote:
We are not, however, responsible for third-party documentation.
No, of course not, but we should be aware that:

* some people believe that raising StopIteration is an acceptable way to exit a generator; and

* doing so has worked fine since generators were introduced back in Python 2.2.

I wonder whether people who learned about generators back in the 2.2 days will have stronger opinions about raising StopIteration than more recent users? I remember learning that an explicit raise was the way to exit a generator, and sure enough the 2.2 What's New says this:

    Inside a generator function, the return statement can only be used
    without a value, and signals the end of the procession of values;
    afterwards the generator cannot return any further values. return with
    a value, such as return 5, is a syntax error inside a generator
    function. The end of the generator's results can also be indicated by
    raising StopIteration manually, or by just letting the flow of
    execution fall off the bottom of the function.

    https://docs.python.org/3/whatsnew/2.2.html#pep-255-simple-generators

That's not to say that we can't change the behaviour, but neither can we say it is undocumented or blame third parties.
Huh, you say that like it's a long time :-) -- Steven

On 11/19/2014 11:24 AM, Chris Angelico wrote:
Which, as I said a week ago, is why there is no need for "raise StopIteration" in a generator function. The doc clearly states the limited intended use of StopIteration.

    '''
    exception StopIteration
        Raised by built-in function next() and an iterator's __next__()
        method to signal that there are no further items produced by the
        iterator.
    '''

StopIteration is exposed so it can be raised in user coded __next__() and caught when using explicit next(). If it was only used for builtins and for loops, it would not need to be visible.
Why should a generator author have to be concerned about one particular exception having magical meaning?
I am not sure of your intent with this rhetorical (?) question.
The prompt should be "Enter the next string or hit ^C to quit: ".
It is avoidable because the return type of producer is limited to strings. Therefore, producer could (and perhaps should) itself catch KeyboardInterrupt and return None, which is intended for such use. Consumer would then be simplified by replacing 3 lines with "if command is None: break".
-- Terry Jan Reedy
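(A sketch of Terry's rework, reusing the producer/consumer names from Chris's example above:)

    def producer():
        """Return user input, or None on ^C"""
        try:
            return input("Enter the next string or hit ^C to quit: ")
        except KeyboardInterrupt:
            return None

    def consumer():
        """Process the user's input"""
        while True:
            command = producer()
            if command is None:
                break
            dispatch(command)   # dispatch as in the earlier example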

On Thu, Nov 20, 2014 at 9:46 AM, Terry Reedy <tjreedy@udel.edu> wrote:
Yes, rhetorical. Basically saying the same as you are: that StopIteration is a part of __next__, not generators.
Yeah, the point is about its interaction with the rest of the program, not the human.
Sure it does. But suppose it does some parsing on the string first, and that parsing might return literally any object. The structure of the program is the same, but now it really does need to signal "no more stuff" in some way other than a return value. Just trying to concoct a situation similar to generators/for loops, using a different exception. I'm fairly sure there's no way to make the above system seem truly plausible, because KeyboardInterrupt is a bad exception for the purpose; but it's still broadly similar, and I think the same applies: StopIteration should be *only* inside __next__() and next(). Since generators can distinguish yield from return, they don't need to distinguish return from raise. ChrisA

On Thu, Nov 20, 2014 at 9:57 AM, Chris Angelico <rosuav@gmail.com> wrote:
Since generators can distinguish yield from return, they don't need to distinguish return from raise.
Bad grammar, should edit before posting. Since generators can distinguish value from no value by using yield and return, they don't need to use yield and raise. ChrisA

On Thu, Nov 20, 2014 at 03:24:07AM +1100, Chris Angelico wrote:
That's not true in practice. See my reply to Nick, there is lots of code out there which uses StopIteration to exit generators. Some of that code isn't very good code -- I've seen "raise StopIteration" immediately before falling out the bottom of the generator -- but until now it has worked and the impression some people have gotten is that it is actually preferred.
Until 3.2, that was a syntax error. For the majority of people who are still using Python 2.7, it is *still* a syntax error. To write this in a backwards-compatible way, you have to exit the generator with:

    raise StopIteration(2)
I would put it another way: informally, the distinction between a generator and a function is that generators use yield where functions use return. Most people are happy with that informal definition; a full pedantic explanation of co-routines will just confuse them or bore them. The rule they will learn is:

* use return in functions
* use yield in generators

That makes generators that use both surprising. Since most generators either run forever or fall out the bottom when they are done, I expect that seeing a generator with a return in it is likely to surprise a lot of people. I've known that return works for many years, and I still do a double-take whenever I see it in a generator.
Why not? How else are you going to communicate something out of band to the consumer except via an exception? We can argue about whether KeyboardInterrupt is the right exception to use or not, but if you insist that this is a bad protocol then you're implicitly saying that the iterator protocol is also a bad protocol.
You might be, but since I've paid attention to the protocol rules, I won't be. Sorry to be harsh, but how clear do we have to be? StopIteration terminates iterators, and generators are iterators. That rule may or may not be inconvenient, it might be annoying (but sometimes useful), it might hide bugs, it might even be something that we can easily forget until reminded, but if it comes as a "surprise" that just means you don't know how the iterator protocol works. There are good reasons for changing this behaviour, but pandering to people who don't know how the iterator protocol works is not one of them.
That's exactly how the protocol works. Even if you write "return" in your generator, it still raises StopIteration.
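(A quick sketch of that equivalence, on Python 3.3+ where "return 2" is legal in a generator:)

    def g():
        yield 1
        return 2

    it = g()
    print(next(it))      # 1
    try:
        next(it)
    except StopIteration as e:
        print(e.value)   # 2 -- the return value rides on the exception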
Without the generator, *only* __next__ has this effect, and that's exactly where it's documented to be.
The documentation states that __next__ raises StopIteration, it doesn't say that *only* __next__ should raise StopIteration.

https://docs.python.org/3/library/stdtypes.html#iterator.__next__

I trust that we all expect to be able to factor out the raise into a helper function or method, yes? It truly would be surprising if this failed:

    class MyIterator:
        def __iter__(self):
            return self
        def __next__(self):
            return something()

    def something():
        # Toy helper function.
        if random.random() < 0.5:
            return "Spam!"
        raise StopIteration

Now let's write this as a generator:

    def gen():
        while True:
            yield something()

which is much nicer than:

    def gen():
        while True:
            try:
                yield something()
            except StopIteration:
                return  # converted by Python into raise StopIteration

-- Steven

On Thu, Nov 20, 2014 at 1:06 PM, Steven D'Aprano <steve@pearwood.info> wrote:
Yes, I thought it was rare. I stand corrected. Reword that to "you don't *need to* explicitly raise", since you can simply return, and it becomes true again, though.
In most cases you won't need to put a value on it, so bare "return" will work just fine. I just put a return value onto it so it wouldn't look trivially useless.
But it's just as surprising to put "raise StopIteration" into it. It's normal to put that into __next__, it's not normal to need it in a generator. Either way, it's something unusual; so let's go with the unusual "return" rather than the unusual "raise".
Well, that's exactly what I do mean. KeyboardInterrupt is not a good way for two parts of a program to communicate with each other, largely because it can be raised unexpectedly. Which is the point of this PEP: raising StopIteration unexpectedly should also result in a noisy traceback.
Sure. There was a suggestion that "return yield from something()" would work, though, which - I can't confirm that this works, but assuming it does - would be a lot tidier. But there's still a difference. Your first helper function was specifically a __next__ helper. It was tied intrinsically to the iterator protocol. If you want to call a __next__ helper (or actually call next(iter) on something) inside a generator, you'll have to - if this change goes through - cope with the fact that generator protocol says "return" where __next__ protocol says "raise StopIteration". If you want a generator helper, it'd look like this:

    def something():
        # Toy helper function.
        if random.random() < 0.5:
            yield "Spam!"

    def gen():
        yield from something()

Voila! Now it's a generator helper, following generator protocol. Every bit as tidy as the original. Let's write a __getitem__ helper:

    def something(x):
        # Toy helper function.
        if random.random() < 0.5:
            return "Spam!"
        raise KeyError(x)

    class X:
        def __getitem__(self, x):
            return something(x)

Same thing. As soon as you get into raising these kinds of exceptions, you're tying your helper to a specific protocol. All that's happening with PEP 479 is that generator and iterator protocol are being distinguished slightly. ChrisA

On 20.11.2014 03:24, Chris Angelico wrote:
Hmm, I'm not convinced by these toy examples, but I did inspect some of my own code for incompatibility with the proposed change. I found that there really is only one recurring pattern I use that I'd have to change, and that is how I've implemented several file parsers. I tend to write them like this:

    def parser (file_object):
        while True:
            title_line = next(file_object)  # will terminate after the last record
            try:
                # read and process the rest of the record here
            except StopIteration:
                # this record is incomplete
                raise OSError('Invalid file format')
            yield processed_record

So I'm catching StopIteration raised by the underlying IOWrapper only if it occurs in illegal places (with regard to the file format the parser expects), but not when it indicates the end of a correct file. I always thought of letting the Error bubble up as a way to keep the parser transparent. Now in this case, I think, I would have to change this to:

    def parser (io_object):
        while True:
            try:
                title_line = next(io_object)
            except StopIteration:
                return
            ...

which I could certainly do without too much effort, but could this be one of the more widespread sources of incompatibility that Steve imagines? Wolfgang

On Thu, Nov 20, 2014 at 2:39 PM, Wolfgang Maier < wolfgang.maier@biologie.uni-freiburg.de> wrote:
There's probably something important missing from your examples. The above while-loop is equivalent to

    for title_line in io_object:
        ...

If you're okay with getting RuntimeError instead of OSError for an undesirable StopIteration, you can just drop the except clause altogether. -- --Guido van Rossum (python.org/~guido)

On 21.11.2014 00:51, Guido van Rossum wrote:
My reason for not using a for loop here is that I'm trying to read from a file where several lines form a record, so I'm reading the title line of a record (and if there is no record in the file any more I want the parser generator to terminate/return). If a title line is read successfully then I'm reading the record's body lines inside a try/except, i.e. where it says "# read and process the rest of the record here" in my shortened code I am actually calling next several times again to retrieve the body lines (and while reading these lines an unexpected StopIteration in the IOWrapper is considered a file format error). I realize that I could also use a for loop and still call next(file_object) inside it, but I find this a potentially confusing pattern that I'm trying to avoid by using the while loop and all explicit next(). Compare:

    for title_line in file_object:
        record_body = next(file_object)
        # in reality record_body is generated using several next calls
        # depending on the content found in the record body while it's read
        yield (title_line, record_body)

vs

    while True:
        title_line = next(file_object)
        body = next(file_object)
        yield (title_line, body)

To me, the for loop version suggests that the content of file_object is read in line by line by the loop (even though the name title_line tries to hint at this being not true). Only when I inspect the loop body do I see that further items are retrieved with next() and, thus, skipped in the for iteration. The while loop, on the other hand, makes the number of iterations very clear by showing all of them in the loop body. Would you agree that this is justification enough for while instead of for, or is it only me who thinks that a for loop makes the code read awkwardly?
If you're okay with getting RuntimeError instead of OSError for an undesirable StopIteration, you can just drop the except clause altogether.
Right, I could do this if the PEP-described behavior was in effect today.

On Fri, Nov 21, 2014 at 9:19 PM, Wolfgang Maier <wolfgang.maier@biologie.uni-freiburg.de> wrote:
I agree. The last example in the PEP is a cut-down form of your parser, and I raise the exact same concern: https://www.python.org/dev/peps/pep-0479/#examples The use of the for loop strongly implies that the loop body will be executed once for each thing in the iterable, which isn't true if you next() it in the body. Legal? Sure. Confusing? Definitely. ChrisA

On 20.11.2014 03:06, Steven D'Aprano wrote:
I find this example a compelling argument against the PEP. Personally, I'm dealing a lot more often with refactoring a generator function into an iterator class than I'm rewriting generator expressions into comprehensions (at least the exotic kinds that would reveal their inequality). So for me at least, the burden of having to remember that I can let (and should let) StopIteration bubble up inside __next__, but not in generator functions, weighs in heavier than the equality argument and the protection against hard-to-diagnose (but rarely occurring) bugs in nested generator functions.

On Fri, Nov 21, 2014 at 9:58 AM, Wolfgang Maier <wolfgang.maier@biologie.uni-freiburg.de> wrote:
Compare my earlier response to Steven, though: it's not difficult to refactor a generator into a helper-generator, rather than refactor a generator into a helper-__next__. This proposal would force a decoupling of generator protocol from __next__ protocol. The ugliness in Steven's examples comes from trying to use a __next__ helper in a generator. It'd be just as ugly trying to refactor __getitem__ to make use of a __getattr__ helper - you'd have to catch AttributeError and turn it into KeyError at the boundary between the two protocols. ChrisA

Please let me know if I'm reading the PEP correctly. Does the proposal break all existing code in generators that uses next() to raise StopIteration or that raises StopIteration explicitly? For example, here is the pure python recipe for itertools.accumulate() shown in the docs at https://docs.python.org/3/library/itertools.html#itertool-functions :

    def accumulate(iterable, func=operator.add):
        'Return running totals'
        # accumulate([1,2,3,4,5]) --> 1 3 6 10 15
        # accumulate([1,2,3,4,5], operator.mul) --> 1 2 6 24 120
        it = iter(iterable)
        total = next(it)
        yield total
        for element in it:
            total = func(total, element)
            yield total

Or would it break the traditional examples of how to write something like izip() using a generator?

    def izip(iterable1, iterable2):
        it1 = iter(iterable1)
        it2 = iter(iterable2)
        while True:
            v1 = next(it1)
            v2 = next(it2)
            yield v1, v2

    assert list(izip('ab', 'cde')) == [('a', 'c'), ('b', 'd')]
    assert list(izip('abc', 'cd')) == [('a', 'c'), ('b', 'd')]

My initial reading of the PEP was a bit unsettling because the listed examples (such as unwrap() and parser()) were a series of cases where code that was currently working just fine for the last decade would break and need to be changed to less pleasant looking code. Also, the PEP motivation seemed somewhat weak. Instead of listing known bugs or real-world development difficulties, it seems to hinge almost entirely on some "being surprised" that list comprehensions and generator expressions aren't the same in every regard (they aren't). AFAICT, the suggestion is that an incorrect expectation of perfect symmetry warrants a number of what the author calls "consequences for existing code". It seems that if the real problem is one of false expectations or surprises, the direct solution would be to provide clearer examples of how things actually work and to disabuse the idea that list comprehensions and generator expressions are more interchangeable than they actually are.

Raymond

P.S. On a more general note, I think that our biggest problem in the Python world is getting people to switch to Python 3. If we really want that to happen, we should develop a strong aversion to proposals that further increase the semantic difference between Python 2 and Python 3.

On Fri, Nov 21, 2014 at 10:24 PM, Raymond Hettinger <raymond.hettinger@gmail.com> wrote:
The case where the iterable is empty would now raise, yes.
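(A sketch of that case, using the accumulate() recipe quoted above:)

    list(accumulate([]))
    # today: returns [] -- the StopIteration from "total = next(it)"
    #   silently terminates the generator
    # under the PEP: RuntimeError("generator raised StopIteration")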
Yes, this would be affected. This proposal causes a separation of generators and iterators, so it's no longer possible to pretend that they're the same thing.
The main point is one of exceptions being silently suppressed. Iterator protocol involves the StopIteration exception; generator protocol doesn't, yet currently a generator that raises StopIteration will quietly terminate. It's as if every generator is wrapped inside "try: ..... except StopIteration: pass". Would you accept any function being written with that kind of implicit suppression of any other exception?
The recommended form of the code will work exactly the same way in both versions: explicitly catching StopIteration and using it as a signal that the function should terminate. The only difference is the behaviour of the non-recommended practice of allowing an exception to bubble part-way and then be implicitly caught. ChrisA

On 21 November 2014 21:50, Chris Angelico <rosuav@gmail.com> wrote:
Raymond's point is that for a long time, the equivalence between "return" and "raise StopIteration" in a generator function has been explicit. The dissatisfaction with the "non-local flow control" aspects of the latter really only started to creep in around Python 2.5 (based on the explicit decision to avoid non-local flow control behaviour in the definition of the with statement in PEP 343), and this PEP is the first time this longstanding behaviour of generators has been seriously questioned at the python-dev level. Guido also didn't add himself as a co-author on the PEP, so it isn't clear on first reading that *he's* the one considering the change, rather than it being an independent suggestion on your part :) I suspect enough evidence of breakage is accumulating to tip the balance back to "not worth the hassle", but it would also be possible to just *add* the "from __future__ import generator_stop" feature, and postpone a decision on making that the only available behaviour. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Nov 21, 2014, at 10:55 PM, Nick Coghlan wrote:
I have no opinion on the actual PEP, but I'm not sure the above is a good resolution. future imports should be for things that have a clear path to default behavior in some future release. I don't think we should incur technical debt to future-ize a feature that won't eventually get adopted. Such a thing will just be another wart that will be difficult to remove for backward compatibility. Cheers, -Barry

On Fri, Nov 21, 2014 at 10:50:52PM +1100, Chris Angelico wrote:
On Fri, Nov 21, 2014 at 10:24 PM, Raymond Hettinger <raymond.hettinger@gmail.com> wrote:
But generators and iterators *are the same thing*. (Generator functions are not iterators, but generators themselves are.) Iterators don't have a specific type, but they obey the iterator protocol:

    py> def gen():
    ...     yield 1
    ...     yield 2
    ...
    py> it = gen()
    py> iter(it) is it
    True
    py> hasattr(it, '__next__')
    True

`it` is an iterator.
Yes. That's how the classic pre-iterator iteration protocol works:

    py> class K:
    ...     def __getitem__(self, i):
    ...         if i == 5: raise IndexError
    ...         return i
    ...
    py> x = K()
    py> list(x)
    [0, 1, 2, 3, 4]

Context managers support suppressing any exception which occurs:

    If the suite was exited due to an exception, and the return value from
    the __exit__() method was false, the exception is reraised. If the
    return value was true, the exception is suppressed, and execution
    continues with the statement following the with statement.

    https://docs.python.org/3/reference/compound_stmts.html#the-with-statement

So there's two examples, one of the oldest going back to Python 1 days, and one of the newest. There may be others. -- Steven

On Sat, Nov 22, 2014 at 3:30 AM, Steven D'Aprano <steve@pearwood.info> wrote:
I can write many other factory functions which return iterators. They are not, themselves, iterators, and therefore should not be expected to follow iterator protocol.

    def gen():
        return iter([1,2])
The above function works with those tests, too. Generator functions are functions that return iterators, and the __next__ method of the returned object is what follows iterator protocol.
That's following getitem protocol, and it's part of that protocol for the raising of IndexError to be the way of not returning any value. But what's more surprising is that raising StopIteration will also silently halt iteration, which I think is not good:
    list(K())
    [0, 1, 2, 3, 4]
Context managers get a chance to function like a try/except block. If one silently and unexpectedly suppresses an exception, it's going to be surprising; but more likely, it's as clear and explicit as an actual try/except block. This isn't "as soon as you use a 'with' block, any XyzError will jump to the end of the block and keep going". ChrisA

On Fri, Nov 21, 2014 at 08:52:59AM -0800, Ethan Furman wrote:
"Must not support send()" has never been part of the definition of iterators. The `Iterator` ABC also recognises generators as iterators: py> def gen(): ... yield 1 ... py> from collections import Iterator py> isinstance(gen(), Iterator) True and they are documented as iterators: Python’s generators provide a convenient way to implement the iterator protocol. If a container object’s __iter__() method is implemented as a generator, it will automatically return an iterator object (technically, a generator object) supplying the __iter__() and __next__() methods. https://docs.python.org/3/library/stdtypes.html#generator-types I don't understand where this idea that generators aren't iterators has come from, unless it is confusion between the generator *function* and the generator object itself. -- Steven

On Nov 21, 2014, at 8:52, Ethan Furman <ethan@stoneleaf.us> wrote:
Generators are a subtype of iterators. They support the iterator protocol completely, and in the same way as any other iterator. They also support extensions to that protocol--e.g., send(). But they also have a relationship to a generator function or generator expression, which you could call a "protocol" but if so it's not one expressible at the level of the language. I think that leads to a bit of confusion when speaking loosely. When someone says "the generator protocol vs. the iterator protocol" the "obviously correct" meaning is send and throw, but it's not what people always mean. Then again, the word "generator" itself leads to confusion when speaking loosely. Maybe it would be clearer if "generator" had no meaning; generator functions return generator iterators. But I don't think this confusion has caused serious problems over the decades, so I doubt the more minor confusion at issue here is likely to be serious.

On Fri, Nov 21, 2014 at 9:18 AM, Andrew Barnert < abarnert@yahoo.com.dmarc.invalid> wrote:
interesting -- I've always called those "generator comprehensions" -- but anyway, -- do they have a special relationship? I can put any iterable in a generator expression:

    gen_exp = (i for i in [3,4,5,6])

the result is a generator:

    In [5]: type(gen_exp)
    Out[5]: generator

so I guess you could call that a "special relationship" -- but it looks to me kind of like an alternate constructor. But in any case, you can use a generator created by a generator expression or a generator function the same way you can use an iterable or an iterator class.

Then again, the word "generator" itself leads to confusion when speaking loosely. Maybe it would be clearer if "generator" had no meaning; generator functions return generator iterators.
not sure how that would help -- a generator is a type, and it is created by either calling a generator function or a generator expression. if there is confusion, it's when folks call a generator function a "generator" Anyway, I just went back and read the PEP, and I'm still confused -- would the PEP make generators behave more like iterator classes, or less like them? -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

On Sat, Nov 22, 2014 at 4:51 AM, Chris Barker <chris.barker@noaa.gov> wrote:
Neutral. A generator function, an iterator class, etc, etc, etc, exists solely to construct an iterator. That iterator has a __next__ method, which either returns a value or raises StopIteration, or raises some other exception (which bubbles up). There are two easy ways to write iterators. One is to construct a class:

    class Iter:
        def __init__(self):
            self.x = 0
        def __iter__(self):
            return self
        def __next__(self):
            if self.x == 3:
                raise StopIteration
            self.x += 1
            return self.x

Another is to write a generator function:

    def gen():
        yield 1
        yield 2
        yield 3

Both Iter and gen are callables which return iterators. Both of them will produce three integers and then raise StopIteration. Both will, as is good form for iterators, continue to raise StopIteration thereafter. And neither Iter nor gen is, itself, an iterator. One is a class which constructs iterators. The other is a generator function, which also constructs iterators. That's all. In Iter.__next__, I wrote code which chose between "return" and "raise StopIteration" to define its result; in gen(), I wrote code which chose between "yield" and "return" (in this case, the implicit return at the end of the function) to define its result. The only change made by this proposal is that StopIteration becomes, in a generator, like any other unexpected exception. It creates a separation between "iterator protocol" (which is implemented by __next__) and "generator protocol" (which is written in the body of a function with 'yield' in it). ChrisA

As someone who has written maybe one generator expression in production code, I have little opinion on the PEP. But as someone that teaches Python, I have a comment on: On Fri, Nov 21, 2014 at 10:50:52PM +1100, Chris Angelico wrote:
As pointed out by Steven, they _are_ the same thing. When I teach iterators and generators, I get a bit tangled up explaining what the difference is, and why Python has both. This is what I say: Conceptually (outside of language constructs): A "generator" is something that, well, generates values on the fly, as requested, until there are no more to generate, and then terminates. An "iterator", on the other hand, is something that produces the values in a pre-existing sequence of values, until there are no more. In practice, Python uses the exact same protocol (the iterator protocol -- __iter__, __next__) for both, so that you can write, e.g., a for loop, and not have to know whether the underlying object you are looping through is iterating or generating... As you can write a "generator" in the sense above in a class that supports the iterator protocol (and can, in fact, write an "iterator" with a generator function), I say that generator functions really are only syntactic sugar -- they are short and sweet and do much of the bookkeeping for you. But given all that, keeping the protocols as similar as possible is a *good* thing, not a bad one -- they should behave as much as possible the same. If StopIteration bubbles up from inside an iterator, wouldn't that silently terminate as well? Honestly, I'm a bit lost -- but my point is this -- generators and iterators should behave as much the same as possible. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

On Sat, Nov 22, 2014 at 3:53 AM, Chris Barker <chris.barker@noaa.gov> wrote:
If you want to consider them that way, then sure - but part of the bookkeeping they do for you is the management of the StopIteration exception. That becomes purely an implementation detail. You don't actually use it when you write a generator function. ChrisA

On Fri, Nov 21, 2014 at 8:53 AM, Chris Barker <chris.barker@noaa.gov> wrote:
I'm sorry you see it that way; we must have done a terrible job explaining this in the past. :-( The behavior for the *consumer* of the iteration is unchanged (call next() until it raises StopIteration -- or let a for-loop take care of the details for you). The interface for the *producer* has never been all that similar: In a generator you *yield* subsequent values until you are done; but if you are not using a generator, you must define a __next__() method (next() in Python 2) that *returns* a single value each time, until it's done, and then it has to raise StopIteration. There is no need to raise StopIteration from a generator, you just return when you are done. Insisting that raising StopIteration in a generator makes it more similar to a __next__() method ignores the fact that producing values is done in completely different ways. So, again, the PEP does not change anything about iterators, and generators will continue to follow the iterator protocol. The change is only for generator authors (and, most importantly, for people using a certain hack in generator expressions). -- --Guido van Rossum (python.org/~guido)

On Fri, Nov 21, 2014 at 2:29 PM, Guido van Rossum <guido@python.org> wrote:
well, others have found examples in old docs that mingle StopIteration and generators...so I guess so, but I'm not sure I'm that misinformed. It still seems to me that there are two ways to write the same thing. The behavior for the *consumer* of the iteration is unchanged
got it -- the issue at hand is what happens to a StopIteration that is raised by something the generator calls. I think the point of this PEP is that the author of a generator function is thinking about using "yield" to provide the next value, and return (explicit or implicit) to stop the generation of objects. That return raises a StopIteration, but the author isn't thinking about it. So why would they want to think about having to trap StopIteration when calling other functions? While the author of an iterator class is thinking about the __next__ method and raising a StopIteration to terminate. So s/he would naturally think about trapping StopIteration when calling functions? I suppose that makes some sense, but to me it seems like a generator function is a different syntax for creating what is essentially the same thing -- why shouldn't it have the same behavior? And if you are writing a generator, presumably you know how it's going to get used -- i.e. by something that expects a StopIteration -- it's not like you're ignorant of the whole idea. Consider this far-fetched situation: Either an iterator class or a generator function could take a function object to call to do part of its work. If that function happened to raise a StopIteration -- now the user would have to know which type of object they were working with, so they would know how to handle the termination of the iter/gener-ation. OK -- far more far-fetched than the preceding example of confusion, but the point is this: AFAIU, the current distinction between generators and iterators is how they are written -- i.e. syntax, essentially. But this PEP would change the behavior of generators in some small way, creating a distinction that doesn't currently exist.

So, again, the PEP does not change anything about iterators, and generators will continue to follow the iterator protocol. The change is only for generator authors
I guess this is where I'm not sure -- it seems to me that the behavior of generators is being changed, not the syntax -- so while mostly of concern to generator authors, it is, in fact, a change in behavior that can be seen by the consumer of (maybe only an oddly designed) generator. In practice, that difference may only matter to folks using that particular hack in generator expressions, but it is indeed a change. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

On Sat, Nov 22, 2014 at 10:06 AM, Chris Barker <chris.barker@noaa.gov> wrote:
Let's suppose you use a Python-like macro language to generate Python code. In the macro language, you can write stuff like this:

    class X:
        iterator:
            return 5
            return 6
            return 7
            iterator_done

And it will get compiled into something like this:

    class X:
        def __init__(self):
            self._iter_state = 0
        def __iter__(self):
            return self
        def __next__(self):
            self._iter_state += 1
            if self._iter_state == 1: return 5
            if self._iter_state == 2: return 6
            if self._iter_state == 3: return 7
            raise StopIteration

This is a reasonably plausible macro language, right? It's basically still a class definition, but it lets you leave out a whole bunch of boilerplate. Now, the question is: As you write the simplified version, should you ever need to concern yourself with StopIteration? I posit no, you should not; it's not a part of the macro language at all. Of course, if this *were* how things were done, it would probably be implemented as a very thin wrapper, exposing all its details to your code; but there's no reason that it *needs* to be so thin. The language you're writing in doesn't need to have any concept of a StopIteration exception, because it doesn't need to use an exception to signal "no more results".
Not necessarily. Can we get someone here who knows asyncio and coroutines, and can comment on the use of such generators?
Either a __getattr__ or a __getitem__ could use a helper function to do part of its work, too, but either the helper needs to know which, or it needs some other way of signalling. They're different protocols, so they're handled differently. If Python wanted to conflate all of these, there could be a single "NoReturnValue" exception, used by every function which needs to be able to return absolutely any object and also to be able to signal "I don't have anything to return". But no, Python has separate exceptions for signalling "I don't have any such key", "I don't have any such attribute", and "I don't have any more things to iterate over". Generators don't need any of them, because - like my macro language above - they have two different keywords and two different byte-codes (yield vs return). In many cases, the helper function doesn't actually need the capability to return *absolutely any object*. In that case, the obvious solution would be to have it return None to say "nothing to return", and then the calling function can either translate that into the appropriate exception, or return rather than yielding, as appropriate. That would also make the helper more useful to other stand-alone functions. But even if your helper has to be able to return absolutely anything, you still have a few options:

1) Design the helper as part of __next__, and explicitly catch the exception.

    def nexthelper():
        if condition:
            return value
        raise StopIteration

    def __next__(self):
        return nexthelper()

    def gen():
        try:
            yield nexthelper()
        except StopIteration:
            pass

2) Write the helper as a generator, and explicitly next() it if you need that:

    def genhelper():
        if condition:
            yield value

    def __next__(self):
        return next(genhelper())

    def gen():
        yield from genhelper()

3) Return status and value. I don't like this, but it does work.

    def tuplehelper():
        if condition:
            return True, value
        return False, None

    def __next__(self):
        ok, val = tuplehelper()
        if ok:
            return val
        raise StopIteration

    def gen():
        ok, val = tuplehelper()
        if ok:
            yield val

All these methods work perfectly, because they have a clear boundary between protocols. If you want to write a __getitem__ that calls on the same helper, you can do that, and have __getitem__ itself raise appropriately if there's nothing to return.
Generators are currently a leaky abstraction for iterator classes. This PEP plugs a leak that's capable of masking bugs.
The only way a consumer will see a change of behaviour is if the generator author used this specific hack (in which case, instead of the generator quietly terminating, a RuntimeError will bubble up - hopefully all the way up until a human sees it). In terms of this PEP, that's a bug in the generator. Bug-free generators will not appear any different to the consumer. ChrisA

On Sat, Nov 22, 2014 at 10:40:37AM +1100, Chris Angelico wrote:
Let's suppose you use a Python-like macro language to generate Python code. In the macro language, you can write stuff like this:
Is there really any point in hypothesising imaginary macro languages when we have a concrete and existing language (Python itself) to look at? [snip made-up example]
Sure, why not? It is part of the concrete protocol: iterators raise StopIteration to halt. That's not a secret, and it is not an implementation detail, it is a concrete, public part of the API.
I posit no, you should not; it's not a part of the macro language at all.
This is why talking about imaginary macro languages is pointless. You say it is not part of the macro language. I say it is. Since the language doesn't actually exist, who is to say which is right? In real Python code, "raise StopIteration" does exist, and does work in generators. Sometimes the fact that it works is a nuisance, when you have an unexpected StopIteration. Sometimes the fact that it works is exactly what you want, when you have an expected StopIteration. You seem to think that allowing a generator function to delegate the decision to halt to a helper function is a Bad Thing. I say it is a Good Thing, even if it occasionally makes buggy code a bit harder to debug. [...]
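A minimal sketch of the delegation pattern being defended here (names are hypothetical):

    def check(value):
        if value < 0:
            raise StopIteration   # the helper decides iteration is over
        return value

    def gen(values):
        for v in values:
            yield check(v)        # pre-PEP: ends cleanly at the first negative

    list(gen([1, 2, -1, 3]))      # pre-PEP: [1, 2]; post-PEP: RuntimeError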
What about them? I don't understand your question. -- Steven

On Sat, Nov 22, 2014 at 9:48 PM, Steven D'Aprano <steve@pearwood.info> wrote:
A generator function is exactly the same thing: it's a way to create an iterator, but it's not a class with a __next__ function. I could write an iterator-creation function in many ways, none of which involve StopIteration:

    def gen():
        if condition:
            raise StopIteration   # Wrong
        return iter([1, 2, 3])
If you have a lengthy nested chain of coroutines, and one of them unexpectedly raises StopIteration, is it right for something to quietly terminate, or should the exception bubble up and be printed to console? ChrisA
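A minimal sketch of that failure mode, assuming a chain of "yield from" delegations:

    def innermost():
        raise StopIteration   # an unexpected StopIteration, deep down
        yield                 # (unreachable; makes this a generator)

    def middle():
        yield from innermost()

    def outer():
        yield from middle()

    list(outer())   # pre-PEP: quietly [] -- no traceback, no hint of the bug

Post-PEP, the stray StopIteration becomes a RuntimeError in innermost()'s frame, so the traceback points straight at the culprit.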

On Sat, Nov 22, 2014 at 2:57 AM, Chris Angelico <rosuav@gmail.com> wrote:
Couldn't you have a nested pile of iterator classes as well that would exhibit the exact same behavior? -Chris

On Tue, Nov 25, 2014 at 3:41 AM, Chris Barker <chris.barker@noaa.gov> wrote:
Potentially, but that would be a different thing. Also, I don't know of cases where a __next__ function chains to a next() call through arbitrary numbers of levels, but "yield from" gets a solid work-out in asyncio and related, so it's more likely to come up. But I don't personally use asyncio, so I'd like to hear from someone who does. ChrisA

I think this is a good point. Maybe a way to obtain equivalency to the generator functions in this case is to "break" this example for the iterator object as well, in that StopIteration has to be raised in the frame of the generator object; if it is raised in a different context, e.g., a function called by __next__, that StopIteration should also be converted to a RuntimeError, similar to what is proposed in the PEP for the generator functions. Maybe this is not what Chris intends to happen, but it would make things consistent. -Alexander
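For reference, a sketch of the current behaviour this amendment would change (names are made up):

    def fetch(data, i):
        if i >= len(data):
            raise StopIteration   # raised outside __next__'s own frame
        return data[i]

    class Iter:
        def __init__(self, data):
            self.data, self.i = data, 0
        def __iter__(self):
            return self
        def __next__(self):
            value = fetch(self.data, self.i)  # today this StopIteration
            self.i += 1                       # ends the iteration normally
            return value

    list(Iter([1, 2]))   # [1, 2]

Under the amendment, a StopIteration coming from fetch() rather than from __next__'s own frame would become a RuntimeError.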

On Fri, Nov 21, 2014 at 4:56 PM, Alexander Heger <python@2sn.net> wrote:
I"mn not sure which Chris you are refering to, but for my part, yes and no: Yes, that would keep iterator classes and generator functions consistent, which would be a good thing. No: I don't think we should do that -- StopIteration is part of the iterator protocol -- generators are another way to write something that complies with the iterator protocol -- generators should handle StopIteration the same way that iterator classes do. Yes, there are some cases that can be confusing and hard to debug -- but them's the breaks. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

On Tue, Nov 25, 2014 at 3:38 AM, Chris Barker <chris.barker@noaa.gov> wrote:
That's one of the perils of geeky communities - there'll often be multiple people named Chris. I have a brother named Michael who's a rail enthusiast, and that world has the same issue with his name - he was once in a car with three other people named Michael.
I've done my "explain it twice, then shut up" on this subject, so I'll just point you to the list archive, where it's been stated clearly that generators are like __iter__, not like __next__. Please, could you respond to previously-given explanations, rather than simply restating that generators should be like __next__? I'd like to move forward with that discussion, rather than reiterating the same points. ChrisA

On Mon, Nov 24, 2014 at 9:06 AM, Chris Angelico <rosuav@gmail.com> wrote:
I'm not sure if I've responded or not to previously given explanations -- but you're right, it's time for me to shut up having made my point, too. -Chris

On Tue, Nov 25, 2014 at 4:18 AM, Chris Barker <chris.barker@noaa.gov> wrote:
Well, there is probably more to be said about this - along the lines of *why* generators ought to be more like iterators. (They're iterables, not iterators.) It's just that we seem to be rehashing the same arguments - or maybe that's just my impression, as there's been discussion on three different channels (-ideas, -dev, and the issue tracker - mercifully very little on the latter). ChrisA

On Mon, Nov 24, 2014 at 10:02 AM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
I think Chris A was overzealous here. The word "generator" is ambiguous; it can refer to either a generator function (a function definition containing at least one "yield") or to the object you obtain by calling a generator function. The latter is definitely an iterator (it has a __next__ method). You can't really call a generator function an iterable (since calling iter() on it raises TypeError) but it's not an iterator either. For the rest see my explanation in response to Mark Shannon in python-dev: http://code.activestate.com/lists/python-dev/133428/ -- --Guido van Rossum (python.org/~guido)
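To illustrate the terminology, a quick interactive-style sketch:

    def gen():           # a generator function
        yield 1

    g = gen()            # a generator object
    iter(g) is g         # True: it is its own iterator
    next(g)              # 1
    iter(gen)            # TypeError: 'function' object is not iterable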

On Tue, Nov 25, 2014 at 5:02 AM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
My apologies. As Guido said, "generator" is ambiguous; though I was inaccurate as well. A generator *object* is, as you show above, an iterator; a generator *function* is not actually iterable, but it is an iterator factory. An iterator class is also an iterator factory. ChrisA

On Mon, Nov 24, 2014 at 5:48 PM, Chris Angelico <rosuav@gmail.com> wrote:
This is correct, and I don't think there is any ambiguity:
As explained in PEP 255, "a Python generator is a kind of Python iterator[1], but of an especially powerful kind." The other term introduced by PEP 255 is "generator function": "A function that contains a yield statement is called a generator function." In my view, PEP 479 naturally follows from careful reading of PEP 255. All one needs to understand is the difference between a function that returns an iterator and the iterator it returns.

yes, this was a reply to your post
Yes, that would keep iterator classes and generator functions consistent, which would be a good thing.
I think the main goal was to have a consistent interface that is easy to debug and deals with StopIteration bubbling up - hence a StopIteration originating from another scope should be converted to RuntimeError when crossing the iterator interface boundary.
You'd keep StopIteration in the protocol, but only allow it in the local scope. -Alexander

On 21.11.2014 12:24, Raymond Hettinger wrote:
Since I already learnt quite a lot from following this thread: I checked yesterday what the docs have to say about the pure-python equivalent of python3's zip() because I expected it to look like the above izip recipe (making it incompatible with the PEP behavior). However, I found that the given equivalent code is:

    def zip(*iterables):
        # zip('ABCD', 'xy') --> Ax By
        sentinel = object()
        iterators = [iter(it) for it in iterables]
        while iterators:
            result = []
            for it in iterators:
                elem = next(it, sentinel)
                if elem is sentinel:
                    return
                result.append(elem)
            yield tuple(result)

i.e., there is no unprotected next call in this example. What surprised me though is that the protection here is done via the default argument of next(), while more typically you'd use a try/except clause. So what's the difference between the two? Specifically, with a default value given, will next just catch StopIteration (which you could do less verbosely yourself), and/or is there some speed gain from the fact that the error has to be propagated up one level less? Is there a guideline when to use try/except vs. next with a default value? Thanks, Wolfgang
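For reference, the two spellings side by side; for a single call they behave equivalently:

    it = iter([])
    sentinel = object()

    # default-argument form -- no exception handling needed:
    elem = next(it, sentinel)
    if elem is sentinel:
        print('exhausted')

    # try/except form of the same check:
    try:
        elem = next(it)
    except StopIteration:
        print('exhausted')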

On 11/21/2014 03:24 AM, Raymond Hettinger wrote:
I believe the motivation is more along the lines of the difficulty and time wasted in debugging a malfunctioning program when a generator stops early because a StopIteration escaped instead of having some other exception raised. This would be along the same lines as not allowing sum to work with str -- a more valid case, IMO, because the sum restriction is performance based, while this change would actually prevent breakage... or more accurately, put the breakage at the cause and make it much easier to fix. -- ~Ethan~

On 15.11.2014 10:29, Chris Angelico wrote:
Now that this PEP is going to be accepted, I'm not sure how much sense it still makes to suggest an amendment to it, but anyway: As stated in the abstract, one of the goals of the PEP is to unify further the behaviour of comprehensions and generator expressions.

With the PEP in place, the following example (taken from Steven D'Aprano's post on python-list):

    iterable = [iter([])]
    list(next(x) for x in iterable)

would raise an error just like

    [next(x) for x in iterable]

already does today. However, the comprehension currently raises StopIteration, while the proposed error for the generator expression would be of a different class (supposedly RuntimeError) - so comprehensions and generator expressions would still behave a bit (though much less) differently.

In addition, the PEP leaves an iterator's __next__() method as the only reasonable place where user code should raise StopIteration. So I would like to argue that instead of just turning StopIteration into some other error when it's about to bubble out of a generator frame, it should be converted whenever it bubbles out of *anything except an iterator's __next__()*. This would include comprehensions, but also any other code.

(On the side, I guess the current form of the PEP does address hard-to-debug bugs caused by nested generators, but what about nested __next__ in iterators? Shouldn't it, using the same logic, also be an error if a next call inside a __next__ method raises an uncaught StopIteration?)

I think such general behavior would make it much clearer that StopIteration is considered special and reserved for the iterator protocol. Of course, it might mean more broken code if people use StopIteration or a subclass for error signaling outside generators/iterators, but this PEP will mean backwards incompatibility anyway, so why not go all the way and do it consistently.

I'm not sure I'd like the pretty general RuntimeError for this (even though Guido favors it for simplicity); instead one could call it UnhandledStopIteration? I imagine that a dedicated class would help in porting, for example, python2 code to python3 (which this PEP does not really simplify otherwise) since people/scripts could watch out for something specific. Thoughts? Wolfgang

On Tue, Nov 25, 2014 at 9:53 AM, Wolfgang Maier <wolfgang.maier@biologie.uni-freiburg.de> wrote:
There'd have to be a special case for next(), where StopIteration is part of the definition of the function. The question then becomes, what's the boundary where StopIteration is converted? The current proposal is quite simple. All the conversion happens in the one function that (re)starts a generator frame, gen_send_ex() in Objects/genobject.c. To do this for other functions, there'd need to be some way of saying which ones are allowed to raise StopIteration and which aren't. Additionally, the scope for problems is smaller. A StopIteration raised anywhere outside of a loop header won't cause silent behavioral change; with generators, anywhere in the body of the function will have that effect. So, while I do agree in principle that it'd be nice, I don't know that it's practical; however, it might be worth raising a dependent proposal to extend this exception conversion. ChrisA
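A sketch of that asymmetry (a hypothetical example):

    def boom():
        raise StopIteration

    # In ordinary code the exception propagates noisily:
    # boom()  -> traceback, as expected

    # Inside a generator body, pre-PEP, the same call silently ends the iteration:
    def g():
        boom()        # swallowed by the generator machinery
        yield 1

    list(g())         # pre-PEP: []; post-PEP: RuntimeError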

On 11/25/2014 12:03 AM, Chris Angelico wrote:
Well, I'm not familiar with every implementation detail of the interpreter, so I can't judge how difficult to implement certain things would be, but one solution that I could think of is: allow StopIteration to be raised anywhere, but let it bubble up only *one* frame. So if the next outer frame does not deal with it, the exception would be converted to UnhandledStopIteration (or something else) when it's about to bubble out of that outer frame.

The builtin next() would simply reset the frame count by catching and reraising StopIteration raised inside its argument (whether that's an iterator's __next__ or a generator; note that in this scenario using raise StopIteration instead of return inside a generator would remain possible).

Examples of what would happen:

- using next on a generator that raises StopIteration explicitly:
  => next catches the error and reraises StopIteration
- using next on a generator that returns:
  => next behaves like currently, raising StopIteration
- using next on the __next__ method of an iterator:
  => next catches the error and reraises StopIteration
- every direct call of an iterator's __next__ method:
  => has to be guarded by a try/except StopIteration

Likewise, in the first three cases, the calling frame, which resumes when next returns (and only this frame), is given a chance to handle the error. If that doesn't happen (i.e., the error would bubble out), it gets converted. So, different from the current PEP, where a StopIteration must be dealt with explicitly using try/except only inside generators but bubbles up everywhere else, here StopIteration would be special everywhere, i.e., it must be passed upwards explicitly through all frames or will get converted.

Back to Steven's generator expression vs comprehension example:

    iterable = [iter([])]
    list(next(x) for x in iterable)

would raise UnhandledStopIteration, since there is no way, inside the generator expression, to catch the StopIteration raised by next(x). ... and if that's all complete nonsense because of some technical detail I'm not aware of, then please excuse my ignorance. Wolfgang

On Tue, Nov 25, 2014 at 9:30 AM, Wolfgang Maier <wolfgang.maier@biologie.uni-freiburg.de> wrote:
I also have no idea if this is practical from an implementation perspective, but I like how it supports my goal of keeping the behavior of iterator classes and generators consistent. -Chris

On Tue, Nov 25, 2014 at 9:47 AM, Chris Barker <chris.barker@noaa.gov> wrote:
[...] I like how it supports my goal of keeping the behavior of iterator classes and generators consistent.
This is a great summary of the general confusion I am trying to clear up. The behavior of all types of iterators (including generators) from the *caller's* perspective is not in question and is not changing. It is very simple: you call next(it) (or it.__next__()), and it returns either the next value or raises StopIteration (and any other exception is, indeed, an exception). What changes is on the *implementor's* side: inside a generator, a yield produces a value for the caller; returning from a generator is translated into a StopIteration which will be interpreted by the caller as the end of the series. -- --Guido van Rossum (python.org/~guido)

On Tue, Nov 25, 2014 at 9:59 AM, Guido van Rossum <guido@python.org> wrote:
Once you start nesting these things, the distinction between "implementor" and "caller" gets mingled. And I think this is all about how nested generators behave, yes? If I am implementing an iterator of some sort (generator function or iterator class), and I call next() inside my code, then I am both an implementor and a caller. And if I'm also writing helper functions, then I need to know about how StopIteration will be handled, and it will be handled a bit differently by generators and iterator classes. But not a big deal, agreed; probably a much smaller deal than all the other stuff you'd better understand to write this kind of code anyway. -Chris

On Tue, Nov 25, 2014 at 12:31 PM, Chris Barker <chris.barker@noaa.gov> wrote:
Hm. An implementer of one protocol is likely the caller of many other protocols. It's not clear that calling something that implements the *same* protocol should deserve special status. For example I could be implementing an iterator processing the lines of a CSV file. Inside this iterator I may be using another iterator that loops over the fields of the current line. (This doesn't sound too far-fetched.) But if I run out of fields in a line, why should that translate into terminating the outer iterator? And the outer iterator may itself be called by another, more outer iterator that iterates over a list of files.
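A sketch of that scenario (the helper names are made up):

    def first_two_fields(lines):
        for line in lines:
            fields = iter(line.split(','))
            yield next(fields), next(fields)  # a short line raises here...

    # ...and pre-PEP that StopIteration silently ends the *outer* iteration:
    list(first_two_fields(['a,b', 'c']))  # pre-PEP: [('a', 'b')]; post-PEP: RuntimeError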
And I think this is all about how nested generators behave, yes?
The inner iterator doesn't have to be a generator (apart from send() and throw(), they have the same interface). And the point of the PEP is that an exhausted inner iterator shouldn't be taken to automatically terminate the outer one. (You could point out that I don't do anything about the similar problem when the outer iterator is implemented as a class with a __next__() method. If I could, I would -- but that case is different because there you *must* raise StopIteration to terminate the iteration, so it becomes more similar to an accidental KeyError being masked when it occurs inside a __getitem__() method.)
A helper function also defines an interface. If you are writing a helper function for a generator (and the helper function is participating in the same iteration as the outer generator, i.e. not in the CSV files / lines / fields example), the best way to do it is probably to write it as a helper generator, and use "yield from" in the outer generator.
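A sketch of that recommended shape (names are made up):

    def upto_blank(lines):
        # helper generator participating in the same iteration
        for line in lines:
            if not line.strip():
                return          # plain return: clean termination signal
            yield line

    def outer(lines):
        yield from upto_blank(lines)   # exhaustion communicated by returning

    list(outer(['a', 'b', '', 'c']))   # ['a', 'b']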
But not a big deal, agreed, probably a much smaller deal than all the other stuff you'd better understand to write this kind of code anyway.
Which I'm sorry to see is much less widely understood than I had assumed. -- --Guido van Rossum (python.org/~guido)

On Tue, Nov 25, 2014 at 1:05 PM, Guido van Rossum <guido@python.org> wrote:
(You could point out that I don't do anything about the similar problem when the outer iterator is implemented as a class with a __next__() method.
Indeed -- that is the key point here -- but you were pretty clear about how special casing StopIteration is a non-starter.
Well, I guess it's a good thing to make things easier/clearer where you can -- even if you can't do it everywhere. I suppose if you think of generator functions as an easier way to write an iterator (where it makes sense), then this is one more thing that makes it even easier / safer. It does even more of the book-keeping for you. So a consistent win-win. Thanks for the time taken clarifying your point. But not a big deal, agreed, probably a much smaller deal than all the other
stuff you'd better understand to write this kind of code anyway.
Which I'm sorry to see is much less widely understood than I had assumed.
Well, this PEP does make for one less detail you need to understand (or more to the point, keep in mind) when writing generator functions -- so that's a good thing. -Chris

On Tue, Nov 25, 2014, at 15:31, Chris Barker wrote:
For something more concrete, we can consider a naive implementation of iteration over adjacent pairs:

    def pairs(x):
        i = iter(x)
        while True:
            yield next(i), next(i)
To work under the new paradigm, you need to catch StopIteration explicitly:

    def pairs(x):
        i = iter(x)
        while True:
            try:
                a = next(i)
                b = next(i)
            except StopIteration:
                return
            yield a, b

On 25/11/14 21:56, random832@fastmail.us wrote:
<snip>
You're right that you need to catch the StopIteration, but it seems to me the natural way to write your second example is:

    def pairs(x):
        i = iter(x)
        while True:
            try:
                yield next(i), next(i)
            except StopIteration:
                return

Adding the extra variables a and b is unnecessary and distracts from the change actually required. Regards, Ian F

On Wed, Nov 26, 2014 at 8:56 AM, <random832@fastmail.us> wrote:
Okay, it's simple and naive. How about this version:

    def pairs(x):
        i = iter(x)
        for val in i:
            yield val, next(i)

Also simple, but subtly different from your version. What's the difference? Will it be obvious to everyone who reads it? ChrisA

On 11/25/2014 03:31 PM, Chris Angelico wrote:
I don't see the difference being subtle enough -- if an odd number of items is tossed in, that `next(i)` is still going to raise a StopIteration, which under PEP 479 will become a RuntimeError. Or did you mean that even-numbered iterators will work fine, but odd-numbered ones will still raise? Nice. :) -- ~Ethan~

On Wed, Nov 26, 2014 at 11:46 AM, Ethan Furman <ethan@stoneleaf.us> wrote:
Presumably the even case is the correct one. It's intended to work that way. If you give it an odd number of items, pre-479 they'll both silently terminate. (Post-479, the next(),next() one will always raise RuntimeError, which indicates clearly that it's not written appropriately, but that's not subtle.) ChrisA

On Tue, Nov 25, 2014 at 5:26 PM, <random832@fastmail.us> wrote:
That's not too pythonic, and trying to support non-pythonic code while evolving the language is a dead-end street. The web documents pythonic ways to obtain pairs from an iterable:

    def pairs(x):
        i = iter(x)
        return zip(i, i)

Or even:

    def pairs(x):
        return zip(*[iter(x)]*2)

The usual way of dealing with an odd number of elements is to use zip_longest. I don't remember seeing documented that raising StopIteration will cancel more than one iterator. If it's undocumented, then code that relies on the non-feature is broken. Cheers, -- Juancarlo *Añez*
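For the odd-length case, a minimal zip_longest-based sketch:

    from itertools import zip_longest

    def pairs(x):
        i = iter(x)
        # fill the final short pair instead of dropping it
        return zip_longest(i, i, fillvalue=None)

    list(pairs([1, 2, 3]))   # [(1, 2), (3, None)]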

On Wed, Nov 26, 2014 at 5:59 AM, <random832@fastmail.us> wrote:
Out of curiosity, what explicit uses of next are pythonic?
Ones immediately enclosed in try/except StopIteration, e.g.:

    try:
        x = next(it)
        print(x)
    except StopIteration:
        print('nothing')

You could rewrite this particular one as follows:

    for x in it:
        print(x)
        break
    else:
        print('nothing')

But if you have multiple next() calls you might be able to have a single try/except catching StopIteration from all of them, so the first pattern is more general. -- --Guido van Rossum (python.org/~guido)

On Wed, Nov 26, 2014 at 4:30 AM, Wolfgang Maier <wolfgang.maier@biologie.uni-freiburg.de> wrote:
I don't know much about the internal details of CPython either, but let's just ignore that for the moment and consider specs for the Python language. AFAIK, not one of the concerns raised (by PEP 479 or your proposal here) is CPython-specific.
Interesting. Makes a measure of sense, and doesn't have much magic to it.
Downside of this is that it's harder to consciously chain iterators, but maybe that's a cost that has to be paid. Suggestion for this: Have a new way of "raise-and-return". It's mostly like raise, except that (a) it can't be caught by a try/except block in the current function (because there's no point), and (b) it bypasses the "this exception must not pass unnoticed". It could then also be used for anything else that needs the "return any object, or signal lack of return value" option, covering AttributeError and so on. So it'd be something like this:

    class X:
        def __iter__(self):
            return self
        def __next__(self):
            if condition:
                return value
            signal StopIteration

The 'signal' statement would promptly terminate the function (not sure exactly how it'd interact with context managers and try/finally, but something would be worked out), and then raise StopIteration in the calling function. Any other StopIteration which passes out of a function would become a RuntimeError.

Magic required: Some way of knowing which exceptions should be covered by this ban on bubbling; also, preferably, some way to raise StopIteration in the calling function, without losing the end of the backtrace.

This could be a viable proposal. It'd be rather more complicated than PEP 479, though, and would require a minimum of five hundred bikeshedding posts before it comes to any kind of conclusion, but if you feel this issue is worth it, I'd certainly be an interested participant in the discussion. ChrisA

On Tue, Nov 25, 2014 at 9:48 AM, Chris Angelico <rosuav@gmail.com> wrote:
It's not viable. It will break more code than PEP 479, and it will incur a larger interpreter overhead (every time any exception bubbles out of any frame we'd have to check whether it is (derived from) StopIteration and replace it, rather than only when exiting a generator frame). (The check for StopIteration is relatively expensive -- it's easy to determine that an exception *is* StopIteration, but in order to determine that it *doesn't* derive from StopIteration you have to walk the inheritance tree.) Please stop panicking. -- --Guido van Rossum (python.org/~guido)
participants (23)

- Alexander Belopolsky
- Alexander Heger
- Andrew Barnert
- Barry Warsaw
- Chris Angelico
- Chris Barker
- Devin Jeanpierre
- Ethan Furman
- Georg Brandl
- Greg
- Greg Ewing
- Guido van Rossum
- Ian Foote
- Juancarlo Añez
- Nick Coghlan
- random832@fastmail.us
- Raymond Hettinger
- Rob Cliffe
- Serhiy Storchaka
- Steve Dower
- Steven D'Aprano
- Terry Reedy
- Wolfgang Maier