
This is something we discussed with Guido, and also Moshe Zadka, at EuroPython. Guido thought it seemed reasonable enough, if the details can be nailed down. I have written it down so the idea doesn't get lost; for the moment it is more a matter of whether it can get a number, and then it can go dormant for a while.

- * -

PEP: XXX
Title: Resource-Release Support for Generators
Version: $Revision$
Last-Modified: $Date$
Author: Samuele Pedroni <pedronis@python.org>
Status: Draft
Type: Standards Track
Content-Type: text/plain
Created: 25-Aug-2003
Python-Version: 2.4
Post-History:

Abstract

    Generators allow for natural coding and abstraction of traversal
    over data. Currently, if external resources needing proper timely
    release are involved, generators are unfortunately not adequate.
    The typical idiom for timely release is not supported: a yield
    statement is not allowed in the try clause of a try-finally
    statement inside a generator, because the finally clause execution
    can be neither guaranteed nor enforced.

    This PEP proposes that generators support a close method and
    destruction semantics, such that the restriction can be lifted,
    expanding the applicability of generators.

Rationale

    Python generators allow for natural coding of many data traversal
    scenarios. Their instantiation produces iterators, i.e.
    first-class objects abstracting traversal (with all the advantages
    of first-classness). In this respect they match in power and
    offer some advantages over the approach using iterator methods
    taking a (smalltalkish) block. On the other hand, given current
    limitations (no yield allowed in a try clause of a try-finally
    inside a generator), the latter approach seems better suited to
    encapsulating not only traversal but also exception handling and
    proper resource acquisition and release.
    Let's consider an example (for simplicity, files in read-mode are
    used):

        def all_lines(index_path):
            for path in file(index_path,"r"):
                for line in file(path.strip(),"r"):
                    yield line

    This is short and to the point, but the try-finally for timely
    closing of the files cannot be added. (While instead of a path, a
    file, whose closing would then be the responsibility of the
    caller, could be passed in as argument, the same is not applicable
    to the files opened depending on the contents of the index.)

    If we want timely release, we have to sacrifice the simplicity and
    directness of the generator-only approach, e.g.:

        class AllLines:

            def __init__(self,index_path):
                self.index_path = index_path
                self.index = None
                self.document = None

            def __iter__(self):
                self.index = file(self.index_path,"r")
                for path in self.index:
                    self.document = file(path.strip(),"r")
                    for line in self.document:
                        yield line
                    self.document.close()
                    self.document = None

            def close(self):
                if self.index:
                    self.index.close()
                if self.document:
                    self.document.close()

    to be used as:

        all_lines = AllLines("index.txt")
        try:
            for line in all_lines:
                ...
        finally:
            all_lines.close()

    The more convoluted solution implementing timely release seems to
    offer a precious hint. What we have done is encapsulate our
    traversal in an object (iterator) with a close method.

    This PEP proposes that generators should grow such a close method,
    with such semantics that the example could be rewritten as:

        def all_lines(index_path):
            index = file(index_path,"r")
            try:
                for path in index:
                    document = file(path.strip(),"r")
                    try:
                        for line in document:
                            yield line
                    finally:
                        document.close()
            finally:
                index.close()

        all = all_lines("index.txt")
        try:
            for line in all:
                ...
        finally:
            all.close()

    PEP 255 [1] disallows yield inside a try clause of a try-finally
    statement, because the execution of the finally clause cannot be
    guaranteed as required by try-finally semantics.
    The semantics of the proposed close method should be such that,
    while the finally clause execution still cannot be guaranteed, it
    can be enforced when required. The semantics of generator
    destruction, on the other hand, should be extended in order to
    implement a best-effort policy for the general case. This strikes
    me as a reasonable compromise, the resulting global behavior being
    similar to that of files and closing.

Possible Semantics

    A close() method should be implemented for generator objects.

    1) If a generator is already terminated, close should be a no-op.

    Otherwise: (two alternative solutions)

    (Return Semantics) The generator should be resumed, and generator
    execution should continue as if the instruction at the re-entry
    point were a return. Consequently, finally clauses surrounding the
    re-entry point would be executed, in the case of a then-allowed
    try-yield-finally pattern.

    Issues: is it important to be able to distinguish forced
    termination by close, normal termination, and exception
    propagation from the generator or generator-called code? In the
    normal case it seems not: finally clauses should be there to work
    the same in all these cases. Still, this semantics could make such
    a distinction hard.

    Except-clauses, as with a normal return, are not executed. Such
    clauses in legacy generators expect to be executed for exceptions
    raised by the generator or by code called from it. Not executing
    them in the close case seems correct.

    OR

    (Exception Semantics) The generator should be resumed, and
    execution should continue as if a special-purpose exception
    (e.g. CloseGenerator) had been raised at the re-entry point. The
    close implementation should consume this exception and not
    propagate it further.

    Issues: should StopIteration be reused for this purpose? Probably
    not. We would like close to be a harmless operation for legacy
    generators, which could contain code catching StopIteration to
    deal with other generators/iterators.
    In general, with exception semantics, it is unclear what to do if
    the generator does not terminate or we do not receive the special
    exception propagated back. Other different exceptions should
    probably be propagated, but consider this possible legacy
    generator code:

        try:
            ...
            yield ...
            ...
        except: # or except Exception:, etc
            raise Exception("boom")

    If close is invoked with the generator suspended after the yield,
    the except clause would catch our special-purpose exception, so we
    would get a different exception propagated back. In this case it
    ought reasonably to be consumed and ignored, but in general it
    should be propagated, and separating these scenarios seems hard.

    The exception approach has the advantage of letting the generator
    distinguish between termination cases and have more control. On
    the other hand, clear-cut semantics seem harder to define.

    2) Generator destruction should invoke close method behavior.

Remarks

    If this proposal is accepted, it should become common practice to
    document whether a generator acquires resources, so that its close
    method ought to be called. If a generator is no longer used,
    calling close should be harmless.

    On the other hand, in the typical scenario the code that
    instantiated the generator should call close if required by it;
    generic code dealing with iterators/generators instantiated
    elsewhere should typically not be littered with close calls.

    The rare case of code that has acquired ownership of, and needs to
    deal properly with, all of iterators, generators and generators
    acquiring resources that need timely release is easily solved:

        if hasattr(iterator,'close'):
            iterator.close()

Open Issues

    Definitive semantics ought to be chosen, and implementation issues
    should be explored.

Alternative Ideas

    The idea that the yield placement limitation should be removed and
    that generator destruction should trigger execution of finally
    clauses has been proposed more than once.
    Alone it cannot guarantee that timely release of resources
    acquired by a generator can be enforced.

    PEP 288 [2] proposes a more general solution, allowing custom
    exception passing to generators.

References

    [1] PEP 255 Simple Generators
        http://www.python.org/peps/pep-0255.html

    [2] PEP 288 Generators Attributes and Exceptions
        http://www.python.org/peps/pep-0288.html

Copyright

    This document has been placed in the public domain.

Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
End:
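
For readers who want to try the proposed pattern, the following is a minimal sketch in present-day Python 3, where generators do have a close() method and yield is permitted inside try/finally. It is an illustration under assumptions, not the PEP's implementation: the Resource class, the open_resource helper and the in-memory store are hypothetical stand-ins for real files, used so the example runs without touching the disk.

```python
class Resource:
    """Hypothetical stand-in for a file: iterable, and records its close()."""

    def __init__(self, name, lines, log):
        self.name = name
        self.lines = lines
        self.log = log

    def __iter__(self):
        return iter(self.lines)

    def close(self):
        self.log.append("closed " + self.name)


def all_lines(open_resource, index_name):
    # Same shape as the rewritten all_lines in the draft, with the
    # opener passed in so that no real files are needed.
    index = open_resource(index_name)
    try:
        for path in index:
            document = open_resource(path.strip())
            try:
                for line in document:
                    yield line
            finally:
                document.close()
    finally:
        index.close()


log = []
store = {"index": ["a\n", "b\n"], "a": ["a1", "a2"], "b": ["b1"]}


def open_resource(name):
    return Resource(name, store[name], log)


gen = all_lines(open_resource, "index")
first = next(gen)   # generator is now suspended inside both try blocks
gen.close()         # runs the inner finally, then the outer one
gen.close()         # closing an already terminated generator does nothing
```

After the first close() both finally clauses have run, innermost first, which matches the "enforced when required" behaviour described in the draft.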

[snip]
PEP 288 [2] proposes a more general solution, allowing custom exception passing to generators.
Is there any reason to prefer gen.close() over the more general solution gen.throw(Close) which results in nearly identical code and yet allows other exception types to be handled as well?

Note, the general purpose solution is a natural extension of the existing syntax and is easily implemented without messing with 'try/finally'.

Pretty much all that was holding up the general solution was that I had not convinced Guido that the clean-up problem exists in practice. It looks like you've surmounted that obstacle for me.

Raymond Hettinger

After re-skimming PEP 288 I'm still not convinced that a more general problem exists for which .close() isn't sufficient. The one motivating example there (writing a log file) seems forced and can be done in other ways.
Note, the general purpose solution is a natural extension of the existing syntax and is easily implemented without messing with 'try/finally'.
I don't understand this remark. AFAICT PEP 288 doesn't propose new syntax, only a new method and its semantics. And I don't see Samuele's solution as "messing with try/finally".
But you still haven't convinced me of the need for the more generalized PEP 288 mechanism.

I do think that the possibility of implementing PEP 288 in the future suggests that Samuele's .close() should be implemented in terms of a special exception, not in terms of a 'return'. The spec needs to define clearly what should happen if the generator catches and ignores the exception, e.g.:

    def forever():
        while True:
            try:
                yield None
            except:
                pass

    f = forever()
    f.next()
    f.close()

Clearly at this point the generator reaches the yield again. What should happen then? Should it suspend so that a subsequent call to f.next() can receive another value? Or should reaching yield after the generator is closed raise another exception?

I'm leaning towards the latter, despite the fact that it will cause an infinite loop in this case -- that's no different from when you have a print statement instead of a yield statement.

(Mentally substituting a print for a yield is often a useful way to think about a generator. The reverse can also be useful to consider when converting a non-generator to a generator: if it prints a sequence of values, it can also yield the same sequence.)

Another comment on Samuele's PEP: It is sort of sad that the *user* of a generator has to know that the generator's close() must be called. Normally, the beauty of using a try/finally for cleanup is that your callers don't need to know about it. But I see no way around this.

And this is still an argument that pleads against the whole thing, either PEP 288 or Samuele's smaller variant: the usual near-guarantee that code in a finally clause will be executed no matter what (barring fatal errors, os._exit() or os.execv()) does not apply.

And this was the original argument against allowing yield inside try/finally.

But the need for cleanup is also clear, so I like Samuele's KISS compromise.

--Guido van Rossum (home page: http://www.python.org/~guido/)
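
The question raised above (what should close() do when the generator swallows the exception and yields again) can be tried out in present-day Python 3, where close() raises GeneratorExit inside the generator and reports a RuntimeError if the generator yields another value instead of finishing. The sketch below is a variant of forever(), not the original: the name stubborn and the max_swallowed parameter are hypothetical, added so that the generator gives up after swallowing one exception and the example can clean up after itself.

```python
def stubborn(max_swallowed=1):
    """Like forever(), but only swallows the first max_swallowed exceptions."""
    swallowed = 0
    while True:
        try:
            yield None
        except BaseException:   # plays the role of the bare 'except:' above
            if swallowed >= max_swallowed:
                raise           # let the close request through this time
            swallowed += 1


f = stubborn()
next(f)

try:
    f.close()                   # swallowed; the generator yields again
    first_close = "returned normally"
except RuntimeError:
    first_close = "RuntimeError"

f.close()                       # now the generator lets the exception out
second_close = "returned normally"
```

In this run the first close() is refused by the generator and surfaces as an error in the caller, while the second one terminates the generator quietly.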

At 13:50 26.08.2003 -0700, Guido van Rossum wrote:
Also

    try:
        ...
        yield ...
    except: # or except Exception
        raise a new different exception

is not a clear-cut situation. close should probably propagate exceptions different from CloseGenerator, but here the right thing to do is fuzzy. It is also another case of the problems of too-broad except clauses; in some sense the exception generated by close is like an async exception such as KeyboardInterrupt.
yes, we cannot totally win.

The problem is that the language definition doesn't define *when* GC happens, if at all. E.g. in Jython this depends on the Java GC. And even in CPython, it's easily conceivable that an accidental cycle in the user data structures prevents collection. So there is a need for explicit control in some cases.
Back to lurking,
Good idea. :-) --Guido van Rossum (home page: http://www.python.org/~guido/)

Hello, (My reply didn't seem to have reached python-dev... second try.) On Tue, Aug 26, 2003 at 01:50:52PM -0700, Guido van Rossum wrote:
What about letting the 'for' handle this? It's the most common way generators are used. When a 'for' loop on a generator-iterator finishes it would call the close() method of the iterator, which on generators would (say) simulate a return from the latest 'yield'.

Well, close() might not be such a good name because it would probably break existing code (e.g. closing files unexpectedly), but __exit__() might do. In other words we could import some of the proposed functionality of the 'with' keyword (PEP 310) into 'for'. I think it makes sense because 'for' is already defined in terms of implicit calls to next(), the only method of iterators; so if a second method is added, 'for' can be taught about it too.

Armin

At 14:46 13.09.2003 +0100, Armin Rigo wrote:
I expect generators to grow an __exit__ too if they grow a close and PEP 310 is accepted; this idiom will then be common:

    with g = gen():
        for v in g:
            ...

but it will be common for files and file-like objects too. It can be reasonable to discuss whether we then want a special syntax that conflates 'with' and 'for' behavior. But we can experiment a while before seeing whether writing the idiom is unbearably tedious ;).

OTOH conflating 'with' and 'for' just for generators seems a rather ad-hoc breaking of the orthogonality of the two; you could no longer write code like this:

    g = gen()
    for v in g:
        ... do something up to a point ...
        ...
    for v in g:
        ...

Now this is rare, but still, breaking orthogonality of primitives is something I would think twice about.

regards.
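
The 'with g = gen():' spelling above follows the PEP 310 proposal and is not valid in today's Python. As a rough modern approximation, assuming Python 3, the standard library's contextlib.closing wraps any object that has a close() method. The sketch below also keeps the two-loop pattern mentioned above working, because 'for' itself never closes the generator; the names gen, events, first and second are illustrative only.

```python
from contextlib import closing

events = []


def gen():
    try:
        for i in range(5):
            yield i
    finally:
        events.append("cleanup")


first, second = [], []

with closing(gen()) as g:
    for v in g:
        first.append(v)
        if v == 1:
            break           # leaves the loop; the generator stays suspended
    for v in g:
        second.append(v)    # resumes where the first loop stopped
        if v == 3:
            break
    events_inside = list(events)    # snapshot taken before the block exits

# Leaving the 'with' block calls g.close(), which runs the finally clause.
```

Keeping the release in the 'with' block rather than in 'for' is what preserves the orthogonality argued for in the message.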

Hello Samuele, On Sat, Sep 13, 2003 at 04:18:51PM +0200, Samuele Pedroni wrote:
I had thought about this. This occurs when you 'break' out of the first loop. I would say that NOT calling the __exit__() method in this specific case seems quite intuitive, the 'break' meaning 'just exit from the loop now without any further processing, skipping the 'else' part if present'. Armin

At 11:48 AM 9/15/03 +0100, Armin Rigo wrote:
Hmmm... You realize this is also going to break the similarity between:

    i = iter(something)
    while 1:
        try:
            j=i.next()
        except StopIteration:
            break

and

    for j in iter(something):
        pass

The former is already complex enough as a mental model. I think it gets altogether too complex if one also has to consider the enter/leave issue.

This strikes me as an area like try/except vs. try/finally: it really should be a separate block, just for the sake of explicitness. As much as it would be cool to have the automatic release, I think I'd rather use:

    with i = iter(something):
        for j in i:
            ...

And make the resource management very visible.
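
The equivalence referred to above can be checked directly. This is a small sketch in Python 3 spelling (next(i) instead of i.next()); the function names manual_loop and for_loop are made up for the illustration.

```python
def manual_loop(iterable):
    """The explicit while/try form of iteration."""
    collected = []
    i = iter(iterable)
    while True:
        try:
            j = next(i)
        except StopIteration:
            break
        collected.append(j)
    return collected


def for_loop(iterable):
    """The same traversal written as a plain 'for' statement."""
    collected = []
    for j in iterable:
        collected.append(j)
    return collected


def squares(n):
    for k in range(n):
        yield k * k


from_manual = manual_loop(squares(4))
from_for = for_loop(squares(4))
```

Both functions call only iter() and next() on their argument, which is the mental model the message argues should not be complicated by implicit enter/leave calls.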

Hello Phillip, On Mon, Sep 15, 2003 at 11:26:27AM -0400, Phillip J. Eby wrote:
Makes sense. Then let's make sure, if both 'with' and 'yield within try:finally' are accepted, that they are compatible, e.g. by having an __exit__() method on generators (and not just a close()).

Armin

this was written before reading Guido's last comments. At 16:17 26.08.2003 -0400, Raymond Hettinger wrote:
1) I think we want to enable try-finally, because it is the typical way to spell resource release:

    f = file(...,"r")
    try:
        ...
    except GeneratorClose:
        f.close()
        return
    else:
        f.close()

or

    f = file(...,"r")
    fastexit = 0
    try:
        ...
    except GeneratorClose:
        fastexit = 1
    f.close()
    if fastexit:
        return

2) Even if we had gen.throw(...), I think it would be better to have an explicit gen.close(); it expresses intention better IMO and feels like file.close() etc.

3) For the purpose of close, it seems that forced-return, as opposed to throwing an exception on the generator side, can have more clearly definable semantics, although it has some limitations too.

So the question is whether there are use cases for the more general gen.throw(...) that differ from the purpose of gen.close(), and, if we have it, whether we can layer a gen.close() with the proper semantics on top of it, i.e. we should then think clearly about the issues of exception semantics for gen.close().

gen.throw is a bigger gun, but they don't kill one another.

regards.
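
The layering asked about in the last paragraphs can be sketched in present-day Python 3, where generators have both throw() and close(). The helper below is a hypothetical illustration, not the interpreter's implementation: it uses the built-in GeneratorExit in place of the GeneratorClose name used in this message, and the names close_via_throw, reader and log are made up for the example.

```python
def close_via_throw(gen):
    """Hypothetical close() built on the more general throw()."""
    try:
        gen.throw(GeneratorExit)
    except (GeneratorExit, StopIteration):
        return      # the generator finished, or had already finished
    # throw() returned a value, so the generator yielded again.
    raise RuntimeError("generator ignored the close request")


log = []


def reader():
    try:
        yield 1
        yield 2
    finally:
        log.append("released")


g = reader()
first_value = next(g)
close_via_throw(g)      # the finally clause runs here
close_via_throw(g)      # already finished: nothing more happens
```

With try/finally available inside the generator, neither of the 'except GeneratorClose' workarounds shown above is needed on the generator side.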

participants (7): Armin Rigo, Guido van Rossum, Neil Schemenauer, Phillip J. Eby, Raymond Hettinger, Ronald Oussoren, Samuele Pedroni