On 15 November 2014 19:29, Chris Angelico <rosuav@gmail.com> wrote:
PEP: 479
Title: Change StopIteration handling inside generators
Version: $Revision$
Last-Modified: $Date$
Author: Chris Angelico <rosuav@gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 15-Nov-2014
Python-Version: 3.5
Post-History: 15-Nov-2014


Abstract
========

This PEP proposes a semantic change to ``StopIteration`` when raised
inside a generator, unifying the behaviour of list comprehensions and
generator expressions somewhat.


Rationale
=========

The interaction of generators and ``StopIteration`` is currently
somewhat surprising, and can conceal obscure bugs.  An unexpected
exception should not result in subtly altered behaviour, but should
cause a noisy and easily-debugged traceback.  Currently,
``StopIteration`` can be absorbed by the generator construct.

Thanks for the write-up!


Proposal
========

If a ``StopIteration`` is about to bubble out of a generator frame, it
is replaced with some other exception (maybe ``RuntimeError``, maybe a
new custom ``Exception`` subclass, but *not* deriving from
``StopIteration``) which causes the ``next()`` call (which invoked the
generator) to fail, passing that exception out.  From then on it's
just like any old exception. [3]_

[snip]
 
Alternate proposals
===================

Supplying a specific exception to raise on return
-------------------------------------------------

Nick Coghlan suggested a means of providing a specific
``StopIteration`` instance to the generator; if any other instance of
``StopIteration`` is raised, it is an error, but if that particular
one is raised, the generator has properly completed.

I think you can skip mentioning this particular idea in the PEP - I didn't like it even when I posted it, and both of Guido's ideas are much better :)
 
Making return-triggered StopIterations obvious
----------------------------------------------

For certain situations, a simpler and fully backward-compatible
solution may be sufficient: when a generator returns, instead of
raising ``StopIteration``, it raises a specific subclass of
``StopIteration`` which can then be detected.  If it is not that
subclass, it is an escaping exception rather than a return statement.

There's an additional subtlety with this idea: if we add a new GeneratorReturn exception as a subclass of StopIteration, then generator iterators would likely also have to change to replace GeneratorReturn with a regular StopIteration (chaining appropriately via __cause__, and copying the return value across).

From the point of view of calling "next()" directly (rather than implicitly), this particular change makes it straightforward to distinguish between "the generator I called just finished" and "something inside the generator threw StopIteration". Due to the subclassing, implicit next() invocations (e.g. in for loops, comprehensions, and container constructors) won't notice any difference.
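
As a rough sketch of how that would look to callers: no GeneratorReturn exists in the language today, so the class below and the `tagging_iterator` wrapper are hypothetical stand-ins. The wrapper simply tags every StopIteration coming out of the generator, so it only illustrates how explicit and implicit next() callers would see the subclass, not how the interpreter itself would tell the two cases apart.

```python
class GeneratorReturn(StopIteration):
    """Hypothetical subclass meaning "the generator returned normally"."""


class tagging_iterator:
    """Hypothetical wrapper standing in for the proposed generator behaviour."""

    def __init__(self, gen):
        self._gen = gen

    def __iter__(self):
        return self

    def __next__(self):
        try:
            return next(self._gen)
        except StopIteration as exc:
            # Re-raise as the subclass, copying the return value across.
            raise GeneratorReturn(exc.value) from None


def countdown():
    yield 2
    yield 1
    return "done"


# Implicit next() callers only look for StopIteration, so nothing changes.
values = list(tagging_iterator(countdown()))

# An explicit caller can catch the subclass specifically.
it = tagging_iterator(countdown())
next(it)
next(it)
try:
    next(it)
except GeneratorReturn as exc:
    result = exc.value
```

Because GeneratorReturn derives from StopIteration, existing `except StopIteration` handlers keep working unchanged.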

With such a change, we would actually likely modify the following code in contextlib._GeneratorContextManager.__exit__:

            try:
                self.gen.throw(exc_type, value, traceback)
                raise RuntimeError("generator didn't stop after throw()")
            except StopIteration as exc:
                # Generator suppressed the exception
                # unless it's a StopIteration instance we threw in
                return exc is not value
            except:
                if sys.exc_info()[1] is not value:
                    raise

To be slightly more self-explanatory:

            try:
                self.gen.throw(exc_type, value, traceback)
                raise RuntimeError("generator didn't stop after throw()")
            except GeneratorReturn:
                # Generator suppressed the exception
                return True
            except:
                if sys.exc_info()[1] is not value:
                    raise

The current proposal in the PEP actually doesn't let us simplify this contextlib code, but rather means we would have to make it more complicated to impedance match generator semantics with the context management protocol. To handle that change, we'd have to make the code something like the following (for clarity, I've assumed a new RuntimeError subclass, rather than RuntimeError itself):

            try:
                self.gen.throw(exc_type, value, traceback)
                raise RuntimeError("generator didn't stop after throw()")
            except StopIteration as exc:
                # Could become "return True" once the __future__ becomes the default
                return exc is not value
            except UnexpectedStopIteration as exc:
                if exc.__cause__ is not value:
                    raise
            except:
                if sys.exc_info()[1] is not value:
                    raise

I definitely see value in adding a GeneratorReturn subclass to be able to tell the "returned" vs "raised StopIteration" cases apart from outside the generator (the current dance in contextlib only works because we have existing knowledge of the exact exception that was thrown in). I'm substantially less convinced of the benefit of changing generators to no longer suppress StopIteration. Yes, it's currently a rather odd corner case, but changing it *will* break code (at the very least, for anyone using an old version of contextlib2, or otherwise relying on their own copy of contextlib rather than the standard library one).
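
For reference, here is a minimal sketch of the externally visible behaviour that the contextlib code has to preserve either way. The names `passthrough`, `swallow` and `run` are made up for this illustration, and which internal branch of `__exit__` ends up handling each case depends on the interpreter's generator semantics.

```python
from contextlib import contextmanager


@contextmanager
def passthrough():
    # Does not catch anything thrown in, so the StopIteration raised in
    # the with body must come back out of the with statement.
    yield


@contextmanager
def swallow():
    try:
        yield
    except StopIteration:
        # Catching the exception and returning normally suppresses it.
        pass


def run(cm):
    """Report whether a StopIteration raised in the with body escapes."""
    try:
        with cm():
            raise StopIteration("raised in the with body")
    except StopIteration:
        return "propagated"
    return "suppressed"


outcomes = (run(passthrough), run(swallow))
```

In both cases the generator machinery produces a StopIteration at some point, which is why the context manager needs some way to tell "the generator returned" apart from "the exception we threw in came back out".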

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan@gmail.com   |   Brisbane, Australia