[Python-ideas] Change how Generator Expressions handle StopIteration

Nick Coghlan ncoghlan at gmail.com
Tue Nov 11 13:21:12 CET 2014


On 7 November 2014 07:45, Antoine Pitrou <solipsis at pitrou.net> wrote:

> On Thu, 6 Nov 2014 10:54:51 -0800
> Guido van Rossum <guido at python.org> wrote:
> >
> > If I had had the right foresight, I would have made it an error to
> > terminate a generator with a StopIteration, probably by raising another
> > exception chained to the StopIteration (so the traceback shows the place
> > where the StopIteration escaped).
> >
> > The question at hand is if we can fix this post-hoc, using clever tricks
> > and (of course) a deprecation period.
>
> Is there any point in fixing it? Who relies on such borderline cases?
>

It's not about people relying on the current behaviour (it can't be, since
we're talking about *changing* that behaviour), it's about "Errors should
never pass silently". That is, the problematic cases that (at least
arguably) may be worth fixing are those where:

1. StopIteration escapes from an expression (Error!)
2. Instead of causing a traceback, it terminates a containing generator
(Passing silently!)

As asyncio coroutines become more popular, I predict some serious head
scratching from StopIteration escaping an asynchronous operation and
getting thrown into a coroutine, which then terminates with a "return None"
rather than propagating the exception as you might otherwise expect.

The problem with this particular style of bug is that the only trace it
leaves is a generator iterator that terminates earlier than expected -
there's no traceback, log message, or any other indication of where
something strange may be happening.

Consider the following, from the original post in the thread:

    def izip(*args):
        iters = [iter(obj) for obj in args]
        while True:
            yield tuple([next(it) for it in iters])

The current behaviour of that construct is that, as soon as one of the
iterators is empty:

1. next(it) throws StopIteration
2. the list comprehension unwinds the frame, and allows the exception to
propagate
3. the generator iterator unwinds the frame, and allows the exception to
propagate
4. the code invoking the iterator sees StopIteration and assumes iteration
is complete
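
Steps 1 and 2 can be checked in isolation, outside any generator. This is
only a minimal sketch; the helper name first_items is made up for
illustration and does not appear in the original post:

```python
def first_items(iterables):
    """Return the first item of each iterable (hypothetical helper)."""
    iters = [iter(obj) for obj in iterables]
    # next() raises StopIteration for an exhausted iterator, and the
    # list comprehension does nothing to stop it propagating.
    return [next(it) for it in iters]


ok = first_items([[1, 2], "ab"])

try:
    first_items([[1, 2], []])  # the second iterable is empty
except StopIteration:
    escaped = True
else:
    escaped = False
```

Called from ordinary code like this, the exception surfaces where it can be
seen; the silent part comes only from steps 3 and 4.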

If you switch to the generator expression version instead, the flow control
becomes:

1. next(it) throws StopIteration
2. the generator expression unwinds the frame, and allows the exception to
propagate
3. the iteration inside the tuple constructor sees StopIteration and halts
4. the generator iterator never terminates: it yields the truncated tuple
and goes around the "while True" loop again
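
The silent halt in step 3 can be reproduced with any iterator whose __next__
lets StopIteration leak out. A minimal sketch, using map(next, ...) as a
stand-in for the generator expression (an assumption made for illustration:
map is a plain iterator, so the result doesn't depend on how a particular
Python version treats StopIteration raised inside a generator frame):

```python
iters = [iter([1, 2]), iter([]), iter([3, 4])]

# map calls next() on each iterator in turn. The second call raises
# StopIteration, which leaks out of map's own __next__, so tuple()
# treats it as the normal end of its input.
row = tuple(map(next, iters))

# The third iterator was never touched.
untouched = next(iters[2])
```

No exception reaches the caller; the only symptom is a tuple that is shorter
than expected.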

In that code, "next(it)" is a flow control operation akin to break (it
terminates the nearest enclosing generator iterator, just as break
terminates the nearest enclosing loop), but it's incredibly unclear that
this is the case. There's no local indication that it may raise
StopIteration; you need to "just know" that raising StopIteration is a
possibility.

Guido's suggestion is to consider looking for a viable way to break the
equivalence between "return" and "raise StopIteration" in generator
iterators - that way, the only way for the above code to work would be
through a more explicit version that clearly tracks the flow control.

Option 1 would be to assume we use a new exception, and are OK with folks
catching it explicitly:

    from __future__ import explicit_generator_return
    def izip(*args):
        iters = [iter(obj) for obj in args]
        while True:
            try:
                t = tuple(next(it) for it in iters)
            except UncaughtStopIteration:
                return # One of the iterators has been exhausted
            yield t

Option 2 would be to assume the new exception is something generic like
RuntimeError, requiring the inner loop to be converted to statement form:


    def izip(*args):
        iters = [iter(obj) for obj in args]
        while True:
            entry = []
            for it in iters:
                try:
                    item = next(it)
                except StopIteration:
                    return # One of the iterators has been exhausted
                entry.append(item)
            yield tuple(entry)

With option 2, you can also still rely on the fact that list comprehensions
don't create a generator frame:

    def izip(*args):
        iters = [iter(obj) for obj in args]
        while True:
            try:
                entry = [next(it) for it in iters]
            except StopIteration:
                return # One of the iterators has been exhausted
            yield tuple(entry)

The upside of the option 2 spellings is that they'll work on all currently
supported versions of Python, while the downside is the extra object
construction they have to do if you want to yield something other than a
list (here, building a list only to convert it to a tuple).
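
As a quick sanity check, both option 2 spellings can be compared against the
builtin zip. This is a sketch rather than a test suite; the names
izip_statement and izip_listcomp and the sample data are made up here so
that the two versions can sit side by side:

```python
def izip_statement(*args):
    # Option 2, statement form: each next() call is guarded explicitly.
    iters = [iter(obj) for obj in args]
    while True:
        entry = []
        for it in iters:
            try:
                item = next(it)
            except StopIteration:
                return  # One of the iterators has been exhausted
            entry.append(item)
        yield tuple(entry)


def izip_listcomp(*args):
    # Option 2, list comprehension form: one guard around the whole row.
    iters = [iter(obj) for obj in args]
    while True:
        try:
            entry = [next(it) for it in iters]
        except StopIteration:
            return  # One of the iterators has been exhausted
        yield tuple(entry)


data = ([1, 2, 3], "ab", range(10))  # arbitrary sample inputs
expected = list(zip(*data))
from_statement = list(izip_statement(*data))
from_listcomp = list(izip_listcomp(*data))
```

Neither function guards against being called with no arguments, where both
would yield empty tuples forever; that is left out to stay close to the code
above.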

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia