<div dir="ltr">For those who haven't followed along, here's the final text of PEP 479, with a brief Acceptance section added. The basic plan hasn't changed, but there's a lot more clarifying text and discussion of a few counter-proposals. Please send suggestions for editorial improvements to <a href="mailto:peps@python.org">peps@python.org</a>. The official reference version of the PEP is at <a href="https://www.python.org/dev/peps/pep-0479/">https://www.python.org/dev/peps/pep-0479/</a>; the repo is <a href="https://hg.python.org/peps/">https://hg.python.org/peps/</a> (please check out the repo and send diffs relative to the repo if you have edits).<br><br><br>PEP: 479<br>Title: Change StopIteration handling inside generators<br>Version: $Revision$<br>Last-Modified: $Date$<br>Author: Chris Angelico <<a href="mailto:rosuav@gmail.com">rosuav@gmail.com</a>>, Guido van Rossum <<a href="mailto:guido@python.org">guido@python.org</a>><br>Status: Accepted<br>Type: Standards Track<br>Content-Type: text/x-rst<br>Created: 15-Nov-2014<br>Python-Version: 3.5<br>Post-History: 15-Nov-2014, 19-Nov-2014, 5-Dec-2014<br><br><br>Abstract<br>========<br><br>This PEP proposes a change to generators: when ``StopIteration`` is<br>raised inside a generator, it is replaced with ``RuntimeError``.<br>(More precisely, this happens when the exception is about to bubble<br>out of the generator's stack frame.) Because the change is backwards<br>incompatible, the feature is initially introduced using a<br>``__future__`` statement.<br><br><br>Acceptance<br>==========<br><br>This PEP was accepted by the BDFL on November 22. Because of the<br>exceptionally short period from first draft to acceptance, the main<br>objections brought up after acceptance were carefully considered and<br>have been reflected in the "Alternate proposals" section below.<br>However, none of the discussion changed the BDFL's mind and the PEP's<br>acceptance is now final. 
(Suggestions for clarifying edits are still<br>welcome -- unlike IETF RFCs, the text of a PEP is not cast in stone<br>after its acceptance, although the core design/plan/specification<br>should not change after acceptance.)<br><br><br>Rationale<br>=========<br><br>The interaction of generators and ``StopIteration`` is currently<br>somewhat surprising, and can conceal obscure bugs. An unexpected<br>exception should not result in subtly altered behaviour, but should<br>cause a noisy and easily-debugged traceback. Currently,<br>``StopIteration`` can be absorbed by the generator construct.<br><br>The main goal of the proposal is to ease debugging in the situation<br>where an unguarded ``next()`` call (perhaps several stack frames deep)<br>raises ``StopIteration`` and causes the iteration controlled by the<br>generator to terminate silently. (When another exception is raised, a<br>traceback is printed pinpointing the cause of the problem.)<br><br>This is particularly pernicious in combination with the ``yield from``<br>construct of PEP 380 [1]_, as it breaks the abstraction that a<br>subgenerator may be factored out of a generator. That PEP notes this<br>limitation, but notes that "use cases for these [are] rare to non-<br>existent". 
Unfortunately, while intentional use is rare, it is easy to<br>stumble on these cases by accident::<br><br>    import contextlib<br><br>    @contextlib.contextmanager<br>    def transaction():<br>        print('begin')<br>        try:<br>            yield from do_it()<br>        except:<br>            print('rollback')<br>            raise<br>        else:<br>            print('commit')<br><br>    def do_it():<br>        print('Refactored preparations')<br>        yield # Body of with-statement is executed here<br>        print('Refactored finalization')<br><br>    def gene():<br>        for i in range(2):<br>            with transaction():<br>                yield i<br>                # return<br>                raise StopIteration  # This is wrong<br>            print('Should not be reached')<br><br>    for i in gene():<br>        print('main: i =', i)<br><br>Here factoring out ``do_it`` into a subgenerator has introduced a<br>subtle bug: if the wrapped block raises ``StopIteration``, under the<br>current behavior this exception will be swallowed by the context<br>manager; and, worse, the finalization is silently skipped! Similarly<br>problematic behavior occurs when an ``asyncio`` coroutine raises<br>``StopIteration``, causing it to terminate silently.<br><br>Additionally, the proposal reduces the difference between list<br>comprehensions and generator expressions, preventing surprises such as<br>the one that started this discussion [2]_. Henceforth, the following<br>statements will produce the same result if either produces a result at<br>all::<br><br>    a = list(F(x) for x in xs if P(x))<br>    a = [F(x) for x in xs if P(x)]<br><br>With the current state of affairs, it is possible to write a function<br>``F(x)`` or a predicate ``P(x)`` that causes the first form to produce<br>a (truncated) result, while the second form raises an exception<br>(namely, ``StopIteration``). 
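<br><br>A minimal sketch of such an ``F(x)`` follows. The names ``F``, ``xs`` and ``describe`` are hypothetical and chosen only for illustration; they do not come from the discussion. On an interpreter with the current behaviour the generator expression form is expected to give the truncated list ``[0, 10]``, whereas an interpreter with the new semantics enabled reports ``RuntimeError``; the list comprehension raises ``StopIteration`` in either case.<br><br>

```python
def F(x):
    # Hypothetical transform that leaks StopIteration for one input.
    if x == 2:
        raise StopIteration
    return x * 10


xs = [0, 1, 2, 3]


def describe(thunk):
    """Return thunk()'s result, or the name of the exception it raised."""
    try:
        return thunk()
    except (StopIteration, RuntimeError) as exc:
        return type(exc).__name__


# Generator expression: the stray StopIteration passes through a generator frame.
genexp_outcome = describe(lambda: list(F(x) for x in xs))

# List comprehension: no generator frame is involved, so the exception escapes as-is.
listcomp_outcome = describe(lambda: [F(x) for x in xs])

print(genexp_outcome, listcomp_outcome)
```

<br><br>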
With the proposed change, both forms<br>will raise an exception at this point (albeit ``RuntimeError`` in the<br>first case and ``StopIteration`` in the second).<br><br>Finally, the proposal also clears up the confusion about how to<br>terminate a generator: the proper way is ``return``, not<br>``raise StopIteration``.<br><br>As an added bonus, the above changes bring generator functions much<br>more in line with regular functions. If you wish to take a piece of<br>code presented as a generator and turn it into something else, you<br>can usually do this fairly simply, by replacing every ``yield`` with<br>a call to ``print()`` or ``list.append()``; however, if there are any<br>bare ``next()`` calls in the code, you have to be aware of them. If<br>the code was originally written without relying on ``StopIteration``<br>terminating the function, the transformation would be that much<br>easier.<br><br><br>Background information<br>======================<br><br>When a generator frame is (re)started as a result of a ``__next__()``<br>(or ``send()`` or ``throw()``) call, one of three outcomes can occur:<br><br>* A yield point is reached, and the yielded value is returned.<br>* The frame is returned from; ``StopIteration`` is raised.<br>* An exception is raised, which bubbles out.<br><br>In the latter two cases the frame is abandoned (and the generator<br>object's ``gi_frame`` attribute is set to None).<br><br><br>Proposal<br>========<br><br>If a ``StopIteration`` is about to bubble out of a generator frame, it<br>is replaced with ``RuntimeError``, which causes the ``next()`` call<br>(which invoked the generator) to fail, passing that exception out.<br>From then on it's just like any old exception. [4]_<br><br>This affects the third outcome listed above, without altering any<br>other effects. 
Furthermore, it only affects this outcome when the<br>exception raised is ``StopIteration`` (or a subclass thereof).<br><br>Note that the proposed replacement happens at the point where the<br>exception is about to bubble out of the frame, i.e. after any<br>``except`` or ``finally`` blocks that could affect it have been<br>exited. The ``StopIteration`` raised by returning from the frame is<br>not affected (the point being that ``StopIteration`` means that the<br>generator terminated "normally", i.e. it did not raise an exception).<br><br>A subtle issue is what will happen if the caller, having caught the<br>``RuntimeError``, calls the generator object's ``__next__()`` method<br>again. The answer is that from this point on it will raise<br>``StopIteration`` -- the behavior is the same as when any other<br>exception was raised by the generator.<br><br>Another logical consequence of the proposal: if someone uses<br>``g.throw(StopIteration)`` to throw a ``StopIteration`` exception into<br>a generator, and the generator doesn't catch it (which it could do<br>using a ``try/except`` around the ``yield``), it will be transformed<br>into ``RuntimeError``.<br><br>During the transition phase, the new feature must be enabled<br>per-module using::<br><br>    from __future__ import generator_stop<br><br>Any generator function constructed under the influence of this<br>directive will have the ``REPLACE_STOPITERATION`` flag set on its code<br>object, and generators with the flag set will behave according to this<br>proposal. Once the feature becomes standard, the flag may be dropped;<br>code should not inspect generators for it.<br><br><br>Consequences for existing code<br>==============================<br><br>This change will affect existing code that depends on<br>``StopIteration`` bubbling up. 
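<br><br>As a hedged sketch of what such code looks like, and of the rules in the Proposal section, consider the following. The generators ``pairs`` and ``idle`` and the helper ``run_demo`` are hypothetical names invented for this illustration. The sketch assumes the new semantics are active, i.e. either an interpreter where they are the default, or a module that starts with the ``__future__`` import shown above.<br><br>

```python
def pairs(iterable):
    # Hypothetical generator that relies on an unguarded next() call.
    it = iter(iterable)
    while True:
        a = next(it)  # a StopIteration raised here escapes the frame
        b = next(it)
        yield a, b


def idle():
    # Hypothetical generator that never catches what is thrown into it.
    yield


def run_demo():
    """Collect what a caller observes; assumes the new semantics are active."""
    seen = []
    gen = pairs([1, 2, 3])
    seen.append(next(gen))        # the first pair is produced normally
    try:
        next(gen)                 # the inner next(it) runs dry
    except RuntimeError as exc:
        # The original StopIteration is chained to the replacement.
        seen.append(type(exc.__cause__).__name__)
    try:
        next(gen)                 # the generator is already finished
    except StopIteration:
        seen.append("exhausted")
    g = idle()
    next(g)                       # advance to the yield
    try:
        g.throw(StopIteration)    # not caught inside idle()
    except RuntimeError:
        seen.append("throw converted")
    return seen


observed = run_demo()
print(observed)
```

Under the old behaviour the second ``next(gen)`` would raise ``StopIteration`` itself, so ``run_demo`` as written would not catch it; that silent termination is what ``pairs`` was relying on.<br><br>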
The pure Python reference<br>implementation of ``groupby`` [3]_ currently has comments "Exit on<br>``StopIteration``" where it is expected that the exception will<br>propagate and then be handled. This will be unusual, but not unknown,<br>and such constructs will fail. Other examples abound, e.g. [6]_, [7]_.<br><br>(Nick Coghlan comments: """If you wanted to factor out a helper<br>function that terminated the generator you'd have to do "return<br>yield from helper()" rather than just "helper()".""")<br><br>There are also examples of generator expressions floating around that<br>rely on a ``StopIteration`` raised by the expression, the target or the<br>predicate (rather than by the ``__next__()`` call implied in the ``for``<br>loop proper).<br><br>Writing backwards and forwards compatible code<br>----------------------------------------------<br><br>With the exception of hacks that raise ``StopIteration`` to exit a<br>generator expression, it is easy to write code that works equally well<br>under older Python versions as under the new semantics.<br><br>This is done by enclosing those places in the generator body where a<br>``StopIteration`` is expected (e.g. bare ``next()`` calls or in some<br>cases helper functions that are expected to raise ``StopIteration``)<br>in a ``try/except`` construct that returns when ``StopIteration`` is<br>raised. The ``try/except`` construct should appear directly in the<br>generator function; doing this in a helper function that is not itself<br>a generator does not work. If ``raise StopIteration`` occurs directly<br>in a generator, simply replace it with ``return``.<br><br><br>Examples of breakage<br>--------------------<br><br>Generators which explicitly raise ``StopIteration`` can generally be<br>changed to simply return instead. 
This will be compatible with all<br>existing Python versions, and will not be affected by ``__future__``.<br>Here are some illustrations from the standard library.<br><br>Lib/ipaddress.py::<br><br>    if other == self:<br>        raise StopIteration<br><br>Becomes::<br><br>    if other == self:<br>        return<br><br>In some cases, this can be combined with ``yield from`` to simplify<br>the code, such as Lib/difflib.py::<br><br>    if context is None:<br>        while True:<br>            yield next(line_pair_iterator)<br><br>Becomes::<br><br>    if context is None:<br>        yield from line_pair_iterator<br>        return<br><br>(The ``return`` is necessary for a strictly-equivalent translation,<br>though in this particular file, there is no further code, and the<br>``return`` can be omitted.) For compatibility with pre-3.3 versions<br>of Python, this could be written with an explicit ``for`` loop::<br><br>    if context is None:<br>        for line in line_pair_iterator:<br>            yield line<br>        return<br><br>More complicated iteration patterns will need explicit ``try/except``<br>constructs. For example, a hypothetical parser like this::<br><br>    def parser(f):<br>        while True:<br>            data = next(f)<br>            while True:<br>                line = next(f)<br>                if line == "- end -": break<br>                data += line<br>            yield data<br><br>would need to be rewritten as::<br><br>    def parser(f):<br>        while True:<br>            try:<br>                data = next(f)<br>                while True:<br>                    line = next(f)<br>                    if line == "- end -": break<br>                    data += line<br>                yield data<br>            except StopIteration:<br>                return<br><br>or possibly::<br><br>    def parser(f):<br>        for data in f:<br>            while True:<br>                line = next(f)<br>                if line == "- end -": break<br>                data += line<br>            yield data<br><br>The latter form obscures the iteration by purporting to iterate over<br>the file with a ``for`` loop, but then also fetches more data from<br>the same iterator during the loop body. 
It does, however, clearly<br>differentiate between a "normal" termination (``StopIteration``<br>instead of the initial line) and an "abnormal" termination (failing<br>to find the end marker in the inner loop, which will now raise<br>``RuntimeError``).<br><br>This effect of ``StopIteration`` has been used to cut a generator<br>expression short, creating a form of ``takewhile``::<br><br>    def stop():<br>        raise StopIteration<br>    print(list(x for x in range(10) if x < 5 or stop()))<br>    # prints [0, 1, 2, 3, 4]<br><br>Under the current proposal, this form of non-local flow control is<br>not supported, and would have to be rewritten in statement form::<br><br>    def gen():<br>        for x in range(10):<br>            if x >= 5: return<br>            yield x<br>    print(list(gen()))<br>    # prints [0, 1, 2, 3, 4]<br><br>While this is a small loss of functionality, it is functionality that<br>often comes at the cost of readability, and just as ``lambda`` has<br>restrictions compared to ``def``, so does a generator expression have<br>restrictions compared to a generator function. In many cases, the<br>transformation to full generator function will be trivially easy, and<br>may improve structural clarity.<br><br><br>Explanation of generators, iterators, and StopIteration<br>=======================================================<br><br>Under this proposal, generators and iterators would be distinct, but<br>related, concepts. Like the mixing of text and bytes in Python 2,<br>the mixing of generators and iterators has resulted in certain<br>perceived conveniences, but proper separation will make bugs more<br>visible.<br><br>An iterator is an object with a ``__next__`` method. Like many other<br>special methods, it may either return a value, or raise a specific<br>exception - in this case, ``StopIteration`` - to signal that it has<br>no value to return. In this, it is similar to ``__getattr__`` (can<br>raise ``AttributeError``), ``__getitem__`` (can raise ``KeyError``),<br>and so on. 
A helper function for an iterator can be written to<br>follow the same protocol; for example::<br><br>    def helper(x, y):<br>        if x > y: return 1 / (x - y)<br>        raise StopIteration<br><br>    def __next__(self):<br>        if self.a: return helper(self.b, self.c)<br>        return helper(self.d, self.e)<br><br>Both forms of signalling are carried through: a returned value is<br>returned, an exception bubbles up. The helper is written to match<br>the protocol of the calling function.<br><br>A generator function is one which contains a ``yield`` expression.<br>Each time it is (re)started, it may either yield a value, or return<br>(including "falling off the end"). A helper function for a generator<br>can also be written, but it must also follow generator protocol::<br><br>    def helper(x, y):<br>        if x > y: yield 1 / (x - y)<br><br>    def gen(self):<br>        if self.a: return (yield from helper(self.b, self.c))<br>        return (yield from helper(self.d, self.e))<br><br>In both cases, any unexpected exception will bubble up. Due to the<br>nature of generators and iterators, an unexpected ``StopIteration``<br>inside a generator will be converted into ``RuntimeError``, but<br>beyond that, all exceptions will propagate normally.<br><br><br>Transition plan<br>===============<br><br>- Python 3.5: Enable new semantics under ``__future__`` import; silent<br>  deprecation warning if ``StopIteration`` bubbles out of a generator<br>  not under ``__future__`` import.<br><br>- Python 3.6: Non-silent deprecation warning.<br><br>- Python 3.7: Enable new semantics everywhere.<br><br><br>Alternate proposals<br>===================<br><br>Raising something other than RuntimeError<br>-----------------------------------------<br><br>Rather than the generic ``RuntimeError``, it might make sense to raise<br>a new exception type ``UnexpectedStopIteration``. 
This has the<br>downside of implicitly encouraging that it be caught; the correct<br>action is to catch the original ``StopIteration``, not the chained<br>exception.<br><br><br>Supplying a specific exception to raise on return<br>-------------------------------------------------<br><br>Nick Coghlan suggested a means of providing a specific<br>``StopIteration`` instance to the generator; if any other instance of<br>``StopIteration`` is raised, it is an error, but if that particular<br>one is raised, the generator has properly completed. This subproposal<br>has been withdrawn in favour of better options, but is retained for<br>reference.<br><br><br>Making return-triggered StopIterations obvious<br>----------------------------------------------<br><br>For certain situations, a simpler and fully backward-compatible<br>solution may be sufficient: when a generator returns, instead of<br>raising ``StopIteration``, it raises a specific subclass of<br>``StopIteration`` (``GeneratorReturn``) which can then be detected.<br>If it is not that subclass, it is an escaping exception rather than a<br>return statement.<br><br>The inspiration for this alternative proposal was Nick's observation<br>[8]_ that if an ``asyncio`` coroutine [9]_ accidentally raises<br>``StopIteration``, it currently terminates silently, which may present<br>a hard-to-debug mystery to the developer. 
The main proposal turns<br>such accidents into clearly distinguishable ``RuntimeError`` exceptions,<br>but if that is rejected, this alternate proposal would enable<br>``asyncio`` to distinguish between a ``return`` statement and an<br>accidentally-raised ``StopIteration`` exception.<br><br>Of the three outcomes listed above, two change:<br><br>* If a yield point is reached, the value, obviously, would still be<br> returned.<br>* If the frame is returned from, ``GeneratorReturn`` (rather than<br> ``StopIteration``) is raised.<br>* If an instance of ``GeneratorReturn`` would be raised, instead an<br> instance of ``StopIteration`` would be raised. Any other exception<br> bubbles up normally.<br><br>In the third case, the ``StopIteration`` would have the ``value`` of<br>the original ``GeneratorReturn``, and would reference the original<br>exception in its ``__cause__``. If uncaught, this would clearly show<br>the chaining of exceptions.<br><br>This alternative does *not* affect the discrepancy between generator<br>expressions and list comprehensions, but allows generator-aware code<br>(such as the ``contextlib`` and ``asyncio`` modules) to reliably<br>differentiate between the second and third outcomes listed above.<br><br>However, once code exists that depends on this distinction between<br>``GeneratorReturn`` and ``StopIteration``, a generator that invokes<br>another generator and relies on the latter's ``StopIteration`` to<br>bubble out would still be potentially wrong, depending on the use made<br>of the distinction between the two exception types.<br><br><br>Converting the exception inside next()<br>--------------------------------------<br><br>Mark Shannon suggested [12]_ that the problem could be solved in<br>``next()`` rather than at the boundary of generator functions. 
By<br>having ``next()`` catch ``StopIteration`` and raise instead<br>``ValueError``, all unexpected ``StopIteration`` bubbling would be<br>prevented; however, the backward-incompatibility concerns are far<br>more serious than for the current proposal, as every ``next()`` call<br>now needs to be rewritten to guard against ``ValueError`` instead of<br>``StopIteration`` - not to mention that there is no way to write one<br>block of code which reliably works on multiple versions of Python.<br>(Using a dedicated exception type, perhaps subclassing ``ValueError``,<br>would help this; however, all code would still need to be rewritten.)<br><br><br>Sub-proposal: decorator to explicitly request current behaviour<br>---------------------------------------------------------------<br><br>Nick Coghlan suggested [13]_ that the situations where the current<br>behaviour is desired could be supported by means of a decorator::<br><br>    from itertools import allow_implicit_stop<br><br>    @allow_implicit_stop<br>    def my_generator():<br>        ...<br>        yield next(it)<br>        ...<br><br>Which would be semantically equivalent to::<br><br>    def my_generator():<br>        try:<br>            ...<br>            yield next(it)<br>            ...<br>        except StopIteration:<br>            return<br><br>but be faster, as it could be implemented by simply permitting the<br>``StopIteration`` to bubble up directly.<br><br>Single-source Python 2/3 code would also benefit in a 3.7+ world,<br>since libraries like six and python-future could just define their own<br>version of "allow_implicit_stop" that referred to the new builtin in<br>3.5+, and was implemented as an identity function in other versions.<br><br>However, due to the implementation complexities required, the ongoing<br>compatibility issues created, the subtlety of the decorator's effect,<br>and the fact that it would encourage the "quick-fix" solution of just<br>slapping the decorator onto all generators instead of properly fixing<br>the code in question, this sub-proposal has been rejected. 
[14]_<br><br><br>Criticism<br>=========<br><br>Unofficial and apocryphal statistics suggest that this is seldom, if<br>ever, a problem. [5]_ Code does exist which relies on the current<br>behaviour (e.g. [3]_, [6]_, [7]_), and there is the concern that this<br>would be unnecessary code churn to achieve little or no gain.<br><br>Steven D'Aprano started an informal survey on comp.lang.python [10]_;<br>at the time of writing only two responses have been received: one was<br>in favor of changing list comprehensions to match generator<br>expressions (!), the other was in favor of this PEP's main proposal.<br><br>The existing model has been compared to the perfectly-acceptable<br>issues inherent to every other case where an exception has special<br>meaning. For instance, an unexpected ``KeyError`` inside a<br>``__getitem__`` method will be interpreted as failure, rather than<br>permitted to bubble up. However, there is a difference. Special<br>methods use ``return`` to indicate normality, and ``raise`` to signal<br>abnormality; generators ``yield`` to indicate data, and ``return`` to<br>signal the abnormal state. This makes explicitly raising<br>``StopIteration`` entirely redundant, and potentially surprising. If<br>other special methods had dedicated keywords to distinguish between<br>their return paths, they too could turn unexpected exceptions into<br>``RuntimeError``; the fact that they cannot should not preclude<br>generators from doing so.<br><br><br>References<br>==========<br><br>.. [1] PEP 380 - Syntax for Delegating to a Subgenerator<br> (<a href="https://www.python.org/dev/peps/pep-0380">https://www.python.org/dev/peps/pep-0380</a>)<br><br>.. [2] Initial mailing list comment<br> (<a href="https://mail.python.org/pipermail/python-ideas/2014-November/029906.html">https://mail.python.org/pipermail/python-ideas/2014-November/029906.html</a>)<br><br>.. 
[3] Pure Python implementation of groupby<br> (<a href="https://docs.python.org/3/library/itertools.html#itertools.groupby">https://docs.python.org/3/library/itertools.html#itertools.groupby</a>)<br><br>.. [4] Proposal by GvR<br> (<a href="https://mail.python.org/pipermail/python-ideas/2014-November/029953.html">https://mail.python.org/pipermail/python-ideas/2014-November/029953.html</a>)<br><br>.. [5] Response by Steven D'Aprano<br> (<a href="https://mail.python.org/pipermail/python-ideas/2014-November/029994.html">https://mail.python.org/pipermail/python-ideas/2014-November/029994.html</a>)<br><br>.. [6] Split a sequence or generator using a predicate<br> (<a href="http://code.activestate.com/recipes/578416-split-a-sequence-or-generator-using-a-predicate/">http://code.activestate.com/recipes/578416-split-a-sequence-or-generator-using-a-predicate/</a>)<br><br>.. [7] wrap unbounded generator to restrict its output<br> (<a href="http://code.activestate.com/recipes/66427-wrap-unbounded-generator-to-restrict-its-output/">http://code.activestate.com/recipes/66427-wrap-unbounded-generator-to-restrict-its-output/</a>)<br><br>.. [8] Post from Nick Coghlan mentioning asyncio<br> (<a href="https://mail.python.org/pipermail/python-ideas/2014-November/029961.html">https://mail.python.org/pipermail/python-ideas/2014-November/029961.html</a>)<br><br>.. [9] Coroutines in asyncio<br> (<a href="https://docs.python.org/3/library/asyncio-task.html#coroutines">https://docs.python.org/3/library/asyncio-task.html#coroutines</a>)<br><br>.. [10] Thread on comp.lang.python started by Steven D'Aprano<br> (<a href="https://mail.python.org/pipermail/python-list/2014-November/680757.html">https://mail.python.org/pipermail/python-list/2014-November/680757.html</a>)<br><br>.. [11] Tracker issue with Proof-of-Concept patch<br> (<a href="http://bugs.python.org/issue22906">http://bugs.python.org/issue22906</a>)<br><br>.. 
[12] Post from Mark Shannon with alternate proposal<br> (<a href="https://mail.python.org/pipermail/python-dev/2014-November/137129.html">https://mail.python.org/pipermail/python-dev/2014-November/137129.html</a>)<br><br>.. [13] Idea from Nick Coghlan<br> (<a href="https://mail.python.org/pipermail/python-dev/2014-November/137201.html">https://mail.python.org/pipermail/python-dev/2014-November/137201.html</a>)<br><br>.. [14] Rejection by GvR<br> (<a href="https://mail.python.org/pipermail/python-dev/2014-November/137243.html">https://mail.python.org/pipermail/python-dev/2014-November/137243.html</a>)<br><br>Copyright<br>=========<br><br>This document has been placed in the public domain.<br><br><br><br>..<br> Local Variables:<br> mode: indented-text<br> indent-tabs-mode: nil<br> sentence-end-double-space: t<br> fill-column: 70<br> coding: utf-8<br> End:<br><br><br clear="all"><div><div><br>-- <br><div class="gmail_signature">--Guido van Rossum (<a href="http://python.org/~guido">python.org/~guido</a>)</div>
</div></div></div>