Bruce Frederiksen wrote:
> Greg Ewing wrote:
>> If the subiterator happens to be another generator, dropping the last reference to it will cause it to be closed, [...]
> NO, NO, NO. Unless you are prepared to say that programs written to this spec are *not* expected to run on any version of Python other than CPython. CPython is the *only* version with a reference-counting collector. And encouraging Python programmers to rely on this invites trouble when they try to port to any other version of Python. I know. I've been there, and have the T-shirt. And it's not pretty. The errors that you get when your finally clauses and context managers aren't run can be quite mysterious. And God help that person if they haven't slept with PEP 342 under their pillow!

Ok, got it. Relying on refcounting is bad.
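To make the trap concrete, here is a minimal sketch (the reader generator and "log.txt" are made-up names); the finally clause runs promptly only because CPython's refcounting closes the abandoned generator at once:

    def reader(path):
        f = open(path)
        try:
            for line in f:
                yield line
        finally:
            f.close()    # needs to run, or the file handle leaks

    r = reader("log.txt")
    first = next(r)
    del r   # CPython: refcount hits zero, the generator is closed and
            # the finally runs immediately. On Jython/IronPython/PyPy
            # it runs whenever the collector gets around to it, if ever.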
[...]
>> It would also avoid the problem of a partially exhausted iterator that's still in use by something else getting prematurely finalized, which is another thing that's been bothering me.
> This is a valid point. But consider:
> 1. The delegating generator has no way to stop the subgenerator prematurely when it uses the yield from. So the yield from can only be stopped prematurely by the delegating generator's caller. And then the subgenerator would have to be communicated between the caller and the delegating generator somehow (e.g., passed in as a parameter) so that the caller could continue to use it. (And the subgenerator has to be a generator, not a plain iterator.)

"...subgenerator has to be a generator" is not entirely true. For example, if the subiterator doesn't have send, you can send a non-None value to the delegating generator and that will raise an AttributeError at the yield from. If it doesn't have throw, you can even throw a StopIteration with a value to get that value as the result of the yield-from expression, which might be useful in a twisted sort of way. In both cases, the subiterator will only be closed if the yield-from expression actually closes it. So it is definitely possible to get a non-generator prematurely finalized.
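For instance, here is the send case as it behaves under the semantics Python 3 ended up with (deleg is a made-up name; the subiterator is a plain list iterator with no send method):

    def deleg(it):
        yield from it    # 'it' need only support __next__

    d = deleg(iter([1, 2, 3]))
    next(d)        # fine: plain iteration needs only __next__
    d.send("x")    # AttributeError: 'list_iterator' object has no
                   # attribute 'send', raised at the yield from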
> Though possible, this kind of use case would come up very rarely compared to the use case where the yield from is the final place the subgenerator is used.

That I agree with.
> 2. If finalization of the subgenerator needs to be prevented, it can be wrapped in a plain iterator wrapper that doesn't define throw or close:
> class no_finalize:
>     def __init__(self, gen):
>         self.gen = gen
>     def __iter__(self):
>         return self
>     def __next__(self):
>         return next(self.gen)
>     def send(self, x):
>         return self.gen.send(x)
> g = subgen(...)
> yield from no_finalize(g)
> ... use g
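As a quick check of what the wrapper buys (sub and deleg are made-up names; this is the behavior under the semantics PEP 380 eventually settled on): because no_finalize exposes neither throw nor close, closing the delegating generator cannot reach g.

    def sub():
        yield 1
        yield 2
        yield 3

    def deleg(g):
        yield from no_finalize(g)

    g = sub()
    d = deleg(g)
    print(next(d))   # 1
    d.close()        # GeneratorExit stops at the wrapper; g is untouched
    print(next(g))   # 2 -- g is still alive for the caller to use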
Well, if the subiterator is a generator that itself uses yield-from, the need to wrap it would destroy all possible speed benefits of using yield-from: every next() and send() would now have to go through the wrapper's Python-level methods, so the implementation could no longer short-circuit the delegation chain. So if there *is* a valid use case for yielding from a shared generator, this is not really a solution unless you don't care about speed.
> As I see it, you are faced with two options:
> 1. Define "yield from" in such a way that it will work the same in all implementations of Python and will work for the 98% use case without any extra boilerplate code, and only require extra boilerplate (as above) for the 2% use case.
> or
I can live with that. This essentially means using the expansion in the PEP (with "except Exception, _e" replaced by "except BaseException, _e", to get the inlining property we all want). The decision to use explicit close will make what could have been a 2% use case much less attractive.

Note that with explicit close, my argument for special-casing GeneratorExit by adding "except GeneratorExit: raise" weakens. The GeneratorExit will be delegated to the deepest generator/iterator with a throw method. As long as the iterators don't swallow the exception, they will be closed from the finally clause in the expansion. If one of them *does* swallow the exception, the outermost generator will raise a RuntimeError. The only difference that special-casing GeneratorExit would make is that 1) if the final iterator is not a generator, it won't see a GeneratorExit, and 2) if one of the iterators swallows the exception, the rest would still be closed and you might get a better traceback for the RuntimeError.
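For reference, the expansion under discussion has roughly this shape. This is a paraphrase from memory in Python 3 syntax, with the BaseException substitution applied; the variable names are illustrative, not the PEP's exact text:

    _i = iter(EXPR)
    try:
        _y = next(_i)
        while 1:
            try:
                _s = yield _y
            except BaseException as _e:     # was: except Exception, _e
                _m = getattr(_i, 'throw', None)
                if _m is None:
                    raise                   # subiterator can't handle it
                _y = _m(_e)                 # delegates GeneratorExit too
            else:
                if _s is None:
                    _y = next(_i)
                else:
                    _y = _i.send(_s)
    except StopIteration as _e:
        _r = _e.value
    finally:
        _m = getattr(_i, 'close', None)     # the finally clause that
        if _m is not None:                  # closes the subiterator,
            _m()                            # as described above
    RESULT = _r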
> 2. Define "yield from" in a way that will have quite different behavior (for reasons very obscure to most programmers) on the different implementations of Python (due to their different garbage-collector implementations); require boilerplate code to be portable for the 98% use case (e.g., adding a "with closing(subgen())" around the yield from); but not require any boilerplate code for portability in the 2% use case.
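The boilerplate in question would look something like this (subgen is the placeholder name from the option above):

    from contextlib import closing

    def deleg():
        # Portable explicit finalization: close the subgenerator on the
        # way out, whether we finish, get closed, or see an exception.
        with closing(subgen()) as g:
            yield from g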
> The only argument I can think of in favor of option 2 is that it's what the "for" statement ended up with. But that was only because changing the "for" statement to option 1 would break the legacy 2% use cases...
There is also the question of speed, as mentioned above, but that argument is not all that strong...
> IMHO option 1 is the better choice.
If relying on refcounting is as bad as you say, then I agree.

- Jacob