[Python-ideas] Yield-From: GeneratorExit?

Jacob Holm jh at improva.dk
Mon Mar 23 14:09:07 CET 2009


Bruce Frederiksen wrote:
> Greg Ewing wrote:
> [...]
>> If the subiterator happens to be another generator,
>> dropping the last reference to it will cause it to
>> be closed, [...] 
> NO, NO, NO. Unless you are prepared to say that programs written to 
> this spec are *not* expected to run on any other version of Python 
> other than CPython. CPython is the *only* version with a reference 
> counting collector. And encouraging Python programmers to rely on this 
> invites trouble when they try to port to any other version of Python. 
> I know. I've been there, and have the T-shirt. And it's not pretty. 
> The errors that you get when your finally clauses and context managers 
> aren't run can be quite mysterious. And God help that person if they 
> haven't slept with PEP 342 under their pillow!
Ok, got it. Relying on refcounting is bad.

> [...]
>> It would also avoid the problem of a partially exhausted
>> iterator that's still in use by something else getting
>> prematurely finalized, which is another thing that's been
>> bothering me. 
> This is a valid point. But consider:
>
> 1. The delegating generator has no way to stop the subgenerator 
> prematurely when it uses the yield from. So the yield from can only be 
> stopped prematurely by the delegating generator's caller. And then the 
> subgenerator would have to be communicated between the caller and the 
> delegating generator somehow (e.g., passed in as a parameter) so that 
> the caller could continue to use it. (And the subgenerator has to be a 
> generator, not a plain iterator). 
"...subgenerator has to be a generator" is not entirely true. For 
example, if the subiterator doesn't have send, sending a non-None value 
to the delegating generator will raise an AttributeError at the yield 
from. If it doesn't have throw, you can even throw a StopIteration with 
a value into the delegating generator and get that value as the result 
of the yield-from expression, which might be useful in a twisted sort of 
way. In both cases, the subiterator will only be closed if the 
yield-from expression actually closes it. So it is definitely possible 
for a non-generator to be prematurely finalized.
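The first case can be demonstrated concretely. This sketch assumes Python 3.3+ yield-from semantics; the names (NoSend, outer) are hypothetical, chosen just for illustration:

```python
def sub():
    yield 1
    yield 2

class NoSend:
    # Hypothetical wrapper: a plain iterator that deliberately
    # defines no send() method
    def __init__(self, it):
        self.it = iter(it)
    def __iter__(self):
        return self
    def __next__(self):
        return next(self.it)

def outer():
    yield from NoSend(sub())

g = outer()
assert next(g) == 1
try:
    g.send("x")   # subiterator lacks send -> AttributeError at the yield from
except AttributeError:
    pass          # raised as described above
```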

> Though possible, this kind of a use case would be used very rarely 
> compared to the use case of the yield from being the final place the 
> subgenerator is used.
That I agree with.

>
> 2. If finalization of the subgenerator needs to be prevented, it can 
> be wrapped in a plain iterator wrapper that doesn't define throw or 
> close.
>
> class no_finalize:
>     def __init__(self, gen):
>         self.gen = gen
>     def __iter__(self):
>         return self
>     def __next__(self):
>         return next(self.gen)
>     def send(self, x):
>         return self.gen.send(x)
>
> g = subgen(...)
> yield from no_finalize(g)
> ... use g
Well, if the subiterator is itself a generator that uses yield-from, the 
need to wrap it would destroy any possible speed benefit of using 
yield-from. So if there *is* a valid use case for yielding from a shared 
generator, wrapping is not really a solution unless speed doesn't matter.
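To make the trade-off concrete, here is the quoted wrapper in runnable form (Python 3 syntax; subgen and delegate are stand-in names). Because the wrapper defines neither throw nor close, closing the delegating generator stops at the wrapper and the shared generator stays usable:

```python
class no_finalize:
    # The wrapper quoted above: forwards iteration and send,
    # but defines neither throw nor close
    def __init__(self, gen):
        self.gen = gen
    def __iter__(self):
        return self
    def __next__(self):
        return next(self.gen)
    def send(self, x):
        return self.gen.send(x)

def subgen():
    yield from range(5)

def delegate(g):
    yield from no_finalize(g)

g = subgen()
d = delegate(g)
assert next(d) == 0
assert next(d) == 1
d.close()            # GeneratorExit stops at the wrapper; g is untouched
assert next(g) == 2  # the shared generator is still usable
```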

>
> As I see it, you are faced with two options:
>
> 1. Define "yield from" in a way that it will work the same in all 
> implementations of Python and will work for the 98% use case without 
> any extra boilerplate code, and only require extra boilerplate (as 
> above) for the 2% use case. or
I can live with that. This essentially means using the expansion in the 
PEP (with "except Exception, _e" replaced by "except BaseException, _e", 
to get the inlining property we all want). The decision to use explicit 
close will make what could have been a 2% use case much less attractive.

Note that with explicit close, my argument for special-casing 
GeneratorExit by adding "except GeneratorExit: raise" weakens. The 
GeneratorExit will be delegated to the deepest generator/iterator with a 
throw method. As long as the iterators don't swallow the exception, they 
will be closed from the finally clause in the expansion. If one of them 
*does* swallow the exception, the outermost generator will raise a 
RuntimeError. The only difference that special-casing GeneratorExit 
would make is that 1) if the final iterator is not a generator, it won't 
see a GeneratorExit, and 2) if one of the iterators swallows the 
exception, the rest would still be closed and you might get a better 
traceback for the RuntimeError.
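For reference, here is roughly what the hand-inlined expansion looks like, already using "except BaseException" as proposed. This is a sketch in the spirit of the draft under discussion (the shape that later settled in PEP 380; the draft's details differed slightly), shown inside a delegating generator named expanded:

```python
def sub():
    # A subgenerator with a return value, PEP 380 style
    yield 1
    yield 2
    return "done"

def expanded():
    # Hand-expanded equivalent of:  _r = yield from sub()
    _i = iter(sub())
    try:
        _y = next(_i)
    except StopIteration as _e:
        _r = _e.value
    else:
        while 1:
            try:
                _s = yield _y
            except GeneratorExit as _e:
                # Close the subiterator if it can be closed, then re-raise
                try:
                    _m = _i.close
                except AttributeError:
                    pass
                else:
                    _m()
                raise _e
            except BaseException as _e:
                # Delegate thrown-in exceptions to the subiterator's throw()
                try:
                    _m = _i.throw
                except AttributeError:
                    raise _e
                else:
                    try:
                        _y = _m(_e)
                    except StopIteration as _e2:
                        _r = _e2.value
                        break
            else:
                try:
                    if _s is None:
                        _y = next(_i)
                    else:
                        _y = _i.send(_s)
                except StopIteration as _e:
                    _r = _e.value
                    break
    return _r  # becomes the value of the yield-from expression
```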

>
> 2. Define "yield from" in a way that will have quite different 
> behavior (for reasons very obscure to most programmers) on the 
> different implementations of Python (due to the different 
> implementation of garbage collectors), require boilerplate code to be 
> portable for the 98% use case (e.g., adding a "with closing(subgen())" 
> around the yield from); but not require any boilerplate code for 
> portability in the 2% use case.
>
> The only argument I can think in favor of option 2, is that's what the 
> "for" statement ended up with. But that was only because changing the 
> "for" statement to option 1 would break the legacy 2% use cases...
There is also the question of speed, as mentioned above, but that 
argument is not all that strong...

>
> IMHO option 1 is the better choice.
If relying on refcounting is as bad as you say, then I agree.

- Jacob


