[Python-Dev] Please reconsider PEP 479.

Chris Angelico rosuav at gmail.com
Wed Nov 26 12:59:25 CET 2014


On Wed, Nov 26, 2014 at 10:24 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> The other key aspect is that it changes the answer to the question
> "How do I gracefully terminate a generator function?". The existing
> behaviour has an "or" in the answer: "return from the generator frame,
> OR raise StopIteration from the generator frame". That then leads to
> the follow on question: "When should I use one over the other?".
>
> The "from __future__ import generator_stop" answer drops the "or", so
> it's just: "return from the generator frame".

If I understand you correctly, you see this as a benefit?

> The key downside is that it means relatively idiomatic code like:
>
>     def my_generator():
>         ...
>         yield next(it)
>         ...
>
> Now needs to be written out explicitly as:
>
>     def my_generator():
>         ...
>         try:
>             yield next(it)
>         except StopIteration:
>             return
>         ...
>
> That's not especially easy to read, and it's also going to be very
> slow when working with generator based producer/consumer pipelines.

I'm not sure how often the ease-of-reading concern will come up, but I
can at least benchmark the performance of it. I have two
virtually-identical builds of CPython 3.5, one with and one without
the POC patch. There's no guarantee that this will properly match the
performance of the final product, as there'll likely be some
additional checks (especially when there's a __future__ directive to
concern ourselves with), but it's a start.

yield from: https://github.com/Rosuav/GenStopIter/blob/485d1/perftest.py
explicit loop: https://github.com/Rosuav/GenStopIter/blob/c071d/perftest.py

The numbers are pretty noisy, but I'm seeing about a 5% slowdown in
the 'yield from' version, with a recursion depth of 100 generators.
(Obviously the percentage slowdown is smaller at lower depths, as other
factors have more impact.) Rewriting the loop to use an explicit try/except
and while loop roughly doubles the time taken (so clearly 'yield from'
is implemented very efficiently), and also destroys any meaning in the
numbers - the two interpreters come out effectively equal. My
preliminary conclusion is that there is some impact, but it's unlikely
to be significant in the real world. Do you have a more real-world
code example to try?
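
As a rough illustration of the two styles being compared (this is NOT
the linked perftest.py; the function names, depth and item counts are
made up for the sketch), a chain of nested generators might be written
either way like this:

```python
import timeit

def chain_yield_from(depth, n):
    """Delegate through `depth` nested generators using 'yield from'."""
    if depth == 0:
        yield from range(n)
    else:
        yield from chain_yield_from(depth - 1, n)

def chain_explicit(depth, n):
    """Same pipeline, but each level drives the inner iterator by hand."""
    it = iter(range(n)) if depth == 0 else chain_explicit(depth - 1, n)
    while True:
        try:
            value = next(it)
        except StopIteration:
            # Explicitly end this generator when the inner one runs out.
            return
        yield value

if __name__ == "__main__":
    # Hypothetical parameters: 100 nested generators, 1000 items each.
    for fn in (chain_yield_from, chain_explicit):
        elapsed = timeit.timeit(lambda: sum(fn(100, 1000)), number=5)
        print(fn.__name__, round(elapsed, 3))
```

Absolute timings from something like this will vary a lot between
machines and builds, so only the ratio between the two functions on
the same interpreter is likely to mean anything.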

> After thinking about that concern for a while, I'd like to suggest the
> idea of having a new builtin "allow_implicit_stop" decorator that
> swaps out a GENERATOR code object that has the new "EXPLICIT_STOP"
> flag set for one with it cleared (attempting to apply
> "allow_implicit_stop" to a normal function would be an error).
>
> Then the updated version of the above example would become:
>
>     @allow_implicit_stop
>     def my_generator():
>         ...
>         yield next(it)
>         ...
>
> Which would be semantically equivalent to:
>
>     def my_generator():
>         try:
>             ...
>             yield next(it)
>             ...
>         except StopIteration:
>             return
>
> but *much* faster (especially if used in a producer/consumer pipeline)
> since it would allow a single StopIteration instance to propagate
> through the entire pipeline, rather than creating and destroying new
> ones at each stage.

If the issue is performance, I would prefer to see something done with
a peephole optimizer instead: if there's a "try... except
StopIteration: return" construct, it's optimized away down to a magic
flag. That way, the code is guaranteed to be correct in all cases,
with no hidden behavioral changes, and this is _just_ a performance
optimization. I'd still rather see the exception-catching scope
narrowed as much as possible, though, which means not having something
that's semantically equivalent to wrapping the whole generator in
"try... except StopIteration: pass".
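
To make the scope concern concrete, here is a small sketch. It assumes
an interpreter where the generator_stop semantics are active, and the
helper and generator names are hypothetical. The narrow version only
catches StopIteration from the one next() call that is expected to
raise it; the wide version also swallows one leaking out of a buggy
helper.

```python
def buggy_transform(row):
    # Hypothetical helper with a bug: leaks StopIteration on an empty row.
    return next(iter(row))

def narrow(rows):
    rows = iter(rows)
    while True:
        try:
            row = next(rows)  # the only call allowed to end the generator
        except StopIteration:
            return
        # A stray StopIteration raised here is not caught; with
        # generator_stop semantics it surfaces as RuntimeError.
        yield buggy_transform(row)

def wide(rows):
    rows = iter(rows)
    try:
        while True:
            yield buggy_transform(next(rows))
    except StopIteration:
        # Also catches the helper's bug, so output is silently truncated.
        return
```

With input like [[1], [], [3]], wide() quietly stops after the first
item, while narrow() fails loudly at the empty row.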

> P.S. While I'm less convinced this part is a good idea, if
> "allow_implicit_stop" accepted both generator functions *and*
> generator objects, then folks could even still explicitly opt in to
> the "or stop()" trick - and anyone reading the code would have a name
> to look up to see what was going on.

That would look something like this, then:

use_an_iterable(allow_implicit_stop(x+1 for x in itr if x<10 or stop()))

I'm definitely not convinced that this would improve matters. However,
I don't currently have any recommendation for an "or stop()"
replacement, other than "refactor it into an explicitly-written
generator function and call it, then you can use return statements".
Suggestions welcomed.
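
For reference, the refactoring mentioned above could look roughly like
this for the x+1 example. The function names and the limit parameter
are invented for illustration; itertools.takewhile is shown as one
possible alternative that keeps the generator expression.

```python
import itertools

def increment_until(itr, limit):
    """Yield x+1 for each x in itr, stopping at the first x >= limit."""
    for x in itr:
        if x >= limit:
            return  # a plain return replaces the "or stop()" trick
        yield x + 1

def increment_until_takewhile(itr, limit):
    """Same result, using takewhile to cut the input off early."""
    return (x + 1 for x in itertools.takewhile(lambda x: x < limit, itr))
```

Either of these could then be passed along as
use_an_iterable(increment_until(itr, 10)).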

ChrisA

