On 16 November 2014 01:56, Chris Angelico wrote:
On Sun, Nov 16, 2014 at 2:21 AM, Nick Coghlan wrote:
On 16 November 2014 00:37, Chris Angelico wrote:
On Sun, Nov 16, 2014 at 1:13 AM, Nick Coghlan wrote:
For certain situations, a simpler and fully backward-compatible solution may be sufficient: when a generator returns, instead of raising ``StopIteration``, it raises a specific subclass of ``StopIteration`` which can then be detected. If it is not that subclass, it is an escaping exception rather than a return statement.
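The subclass idea can be sketched in a few lines. Note that ``GeneratorReturn`` here is hypothetical, not an existing exception; the names are illustrative only:

```python
# Hypothetical sketch of the proposal: GeneratorReturn is NOT a real
# builtin. StopIteration(value) already stores its argument as .value.
class GeneratorReturn(StopIteration):
    """What the interpreter would raise when a generator returns."""

def classify(exc):
    """Tell a generator's return apart from an escaping StopIteration."""
    if isinstance(exc, GeneratorReturn):
        return "return statement"
    if isinstance(exc, StopIteration):
        return "escaping exception"
    return "other"

# Because GeneratorReturn subclasses StopIteration, unmodified code
# that catches StopIteration keeps working (full backward compatibility).
try:
    raise GeneratorReturn(42)
except StopIteration as exc:
    assert exc.value == 42
```

Code that cares about the distinction checks for the subclass first; everything else keeps catching plain ``StopIteration`` as before.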
There's an additional subtlety with this idea: if we add a new GeneratorReturn exception as a subclass of StopIteration, then generator iterators would likely also have to change to replace GeneratorReturn with a regular StopIteration (chaining appropriately via __cause__, and copying the return value across).
Would they have to do so automatically, meaning this is no simpler than the current proposal? Or would the handling always have to be written explicitly?
When GeneratorReturn escaped a generator frame, the interpreter would automatically convert it into an ordinary StopIteration instance.
Okay, let me see if I have this straight. When a 'return' statement (including an implicit one at end-of-function) is encountered in any function which contains a 'yield' statement, it is implemented as "raise GeneratorReturn(value)" rather than as "raise StopIteration(value)" which is the current behaviour. However, if any GeneratorReturn would be raised in any way other than the 'return' statement, it would magically become a StopIteration instead. Is that correct?
That's not quite how generators work. While the "returning from a generator is equivalent to raise StopIteration" model is close enough that it's functionally equivalent to the actual behaviour in most cases (the main difference being how try/except blocks and context managers inside the generator react), this particular PEP is a situation where it's important to have a clear picture of the underlying details.

When you have a generator iterator (the thing you get back when calling a generator function), there are two key components:

* the generator iterator object itself
* the generator frame where the code is running

When you call next(gi), you're invoking the __next__ method on the *generator iterator*. It's that method which restarts evaluation of the generator frame at the point where it last left off, and interprets any results. Now, there are three things that can happen as a result of that frame evaluation:

1. It hits a yield point. In that case, gi.__next__ returns the yielded value.
2. It can return from the frame. In that case, gi.__next__ creates a *new* StopIteration instance (with an appropriate return value set) and raises it.
3. It can throw an exception. In that case, gi.__next__ just allows it to propagate out (including if it's StopIteration).

The following example illustrates the difference between cases 2 and 3 (in both cases, there's a StopIteration that terminates the hidden loop inside the list() call; the difference is in where that StopIteration is raised):
>>> def genreturn():
...     yield
...     try:
...         return
...     except:
...         print("No exception")
...         raise
...
>>> list(genreturn())
[None]
>>> def genraise():
...     yield
...     try:
...         raise StopIteration
...     except:
...         print("Exception!")
...         raise
...
>>> list(genraise())
Exception!
[None]
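Related to case 2 above: the return value the generator frame produces is already observable from the caller's side, since gi.__next__ attaches it to the StopIteration instance it raises. A small illustration:

```python
# How outcome 2 surfaces to callers today: a generator's return value
# rides on the StopIteration instance's 'value' attribute.
def gen():
    yield 1
    return "done"  # becomes StopIteration("done") at the boundary

g = gen()
assert next(g) == 1      # outcome 1: the yielded value
try:
    next(g)
except StopIteration as exc:
    assert exc.value == "done"  # outcome 2: the return value
```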
(The possible outcomes of gi.send() and gi.throw() are the same as those of next(gi). gi.throw() has the novel variant where the exception thrown in may propagate back out.)

The two change proposals being discussed are as follows:

* Current PEP (backwards incompatible): change outcome 3 to convert StopIteration to RuntimeError (or a new exception type). Nothing else changes.
* Alternative (backwards compatible): change outcome 2 to raise GeneratorReturn instead of StopIteration, and outcome 3 to convert GeneratorReturn to StopIteration.

The alternative *doesn't* do anything about the odd discrepancy between comprehensions and generator expressions that started the previous thread. It just adds a new capability where code that knows it's specifically dealing with a generator (like contextlib or asyncio) can more easily tell the difference between outcomes 2 and 3.
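For readers on modern Python: the first option is essentially the behaviour PEP 479 went on to standardise, so on Python 3.7 and later (where it is the default) the conversion of outcome 3 is directly observable. A StopIteration escaping the generator frame is replaced by a RuntimeError, with the original chained via __cause__:

```python
# On Python 3.7+ (PEP 479 is the default), outcome 3 no longer lets
# StopIteration escape: the interpreter converts it to RuntimeError.
def genraise():
    yield
    raise StopIteration  # previously would have ended iteration silently

gi = genraise()
next(gi)  # runs to the yield point (outcome 1)
try:
    next(gi)
except RuntimeError as exc:
    # The original StopIteration is preserved via exception chaining.
    assert isinstance(exc.__cause__, StopIteration)
```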
This does sound simpler. All the magic is in the boundary of the generator itself, nothing more. If a __next__ method raises either StopIteration or GeneratorReturn, or if any other function raises them, there's no special handling.
All the magic is actually at the generator boundary regardless. The key differences between the two proposals are the decision to keep StopIteration as a common parent exception, and allow it to continue propagating out of generator frames unmodified.
Question: How does it "become" StopIteration? Is a new instance of StopIteration formed which copies in the other's ``value``? Is the type of this exception magically altered? Or is it a brand new exception with the __cause__ or __context__ set to carry the original?
I'd suggest using the exception chaining machinery and creating a new exception with __cause__ and the generator return value set appropriately.
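That conversion could look roughly like the following. This is a sketch only; GeneratorReturn is hypothetical, and in the actual proposal the interpreter would perform this step at the generator boundary:

```python
class GeneratorReturn(StopIteration):
    """Hypothetical exception a generator's 'return' would raise."""

def convert_to_stopiteration(exc):
    # Build a new plain StopIteration, copying the return value across
    # and chaining the original via __cause__, as 'raise ... from exc'
    # would do.
    replacement = StopIteration(exc.value)
    replacement.__cause__ = exc
    return replacement

original = GeneratorReturn("result")
converted = convert_to_stopiteration(original)
assert converted.value == "result"
assert converted.__cause__ is original
```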
(I'm not familiar with contextlib2 or what it offers.)
contextlib2 ~= 3.3-era contextlib that runs as far back as 2.6 (I initially created it as a proving ground for the idea that eventually became contextlib.ExitStack).
Thanks, I figured it'd be like that. Since contextlib exists in 2.7, is contextlib2 meant to be legacy support only?
contextlib has actually been around since 2.5, but some features (most notably ExitStack) weren't added until much later. Like unittest2, contextlib2 allows access to newer stdlib features on older versions (I haven't used it as a testing ground for new ideas since ExitStack).

Cheers, Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia