[Python-Dev] pre-PEP: Resource-Release Support for Generators

Samuele Pedroni pedronis at bluewin.ch
Tue Aug 26 16:36:46 EDT 2003


This is something we discussed with Guido, and also with Moshe Zadka at EuroPython.

Guido thought it seemed reasonable enough, if the details can be nailed down.

I have written it down so the idea doesn't get lost; for the moment it is more
a matter of whether it can get a number, and then it can go dormant for a while.

- * -

PEP: XXX
Title: Resource-Release Support for Generators
Version: $Revision$
Last-Modified: $Date$
Author: Samuele Pedroni <pedronis at python.org>
Status: Draft
Type: Standards Track
Content-Type: text/plain
Created: 25-Aug-2003
Python-Version: 2.4
Post-History:


Abstract

     Generators allow for natural coding and abstraction of traversal
     over data.  Unfortunately, when external resources needing proper
     timely release are involved, generators are currently not adequate.
     The typical idiom for timely release is not supported: a yield
     statement is not allowed in the try clause of a try-finally
     statement inside a generator.  Execution of the finally clause can
     be neither guaranteed nor enforced.

     This PEP proposes that generators support a close method and
     destruction semantics, such that the restriction can be lifted,
     expanding the applicability of generators.


Rationale

     Python generators allow for natural coding of many data traversal
     scenarios.  Their instantiation produces iterators, i.e. first-
     class objects abstracting traversal (with all the advantages of
     first-classness).  In this respect they match in power and offer
     some advantages over the approach using iterator methods taking a
     (Smalltalk-ish) block.  On the other hand, given the current
     limitation (no yield allowed in the try clause of a try-finally
     inside a generator), the latter approach seems better suited to
     encapsulating not only traversal but also exception handling and
     proper resource acquisition and release.

     Let's consider an example (for simplicity, files in read-mode are
     used):

         def all_lines(index_path):
             for path in file(index_path,"r"):
                 for line in file(path.strip(),"r"):
                     yield line

     This is short and to the point, but the try-finally needed for
     timely closing of the files cannot be added.  (An already opened
     file, whose closing would then be the caller's responsibility,
     could be passed in instead of the index path, but the same is not
     possible for the files opened based on the contents of the index.)

     If we want timely release, we have to sacrifice the simplicity
     and directness of the generator-only approach, e.g.:

         class AllLines:
             def __init__(self,index_path):
                 self.index_path = index_path
                 self.index = None
                 self.document = None

             def __iter__(self):
                 self.index = file(self.index_path,"r")
                 for path in self.index:
                     self.document = file(path.strip(),"r")
                     for line in self.document:
                         yield line
                     self.document.close()
                     self.document = None

             def close(self):
                 if self.index:
                     self.index.close()
                 if self.document:
                     self.document.close()

     to be used as:

         all_lines = AllLines("index.txt")
         try:
             for line in all_lines:
                 ...
         finally:
             all_lines.close()

     The more convoluted solution that implements timely release seems
     to offer a precious hint: what we have done is encapsulate our
     traversal in an object (an iterator) with a close method.

     This PEP proposes that generators should grow such a close method
     with such semantics that the example could be rewritten as:

         def all_lines(index_path):
             index = file(index_path,"r")
             try:
                 for path in index:
                     document = file(path.strip(),"r")
                     try:
                         for line in document:
                             yield line
                     finally:
                         document.close()
             finally:
                 index.close()


         all = all_lines("index.txt")
         try:
             for line in all:
                 ...
         finally:
             all.close()

     PEP 255 [1] disallows yield inside the try clause of a try-finally
     statement, because the execution of the finally clause cannot be
     guaranteed as required by try-finally semantics.  The semantics of
     the proposed close method should be such that, while execution of
     the finally clause still cannot be guaranteed, it can be enforced
     when required.  The semantics of generator destruction, on the
     other hand, should be extended in order to implement a best-effort
     policy for the general case.  This seems a reasonable compromise,
     the resulting global behavior being similar to that of files and
     closing.
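
     For comparison, files already exhibit both levels of behavior:
     release can be enforced by the caller with try-finally, or left
     to best-effort release on destruction.  A minimal illustration
     (the file name is an arbitrary example):

         f = file("data.txt","r")
         try:
             for line in f:
                 pass              # use the file
         finally:
             f.close()             # timely release, enforced by the caller

         f = file("data.txt","r")
         del f                     # best-effort: the file is closed when
                                   # the object is destroyed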


Possible Semantics

     A close() method should be implemented for generator objects.

     1) If a generator is already terminated, close should be a no-op.

     Otherwise: (two alternative solutions)

     (Return Semantics) The generator should be resumed, and generator
     execution should continue as if the instruction at the re-entry
     point were a return.  Consequently, finally clauses surrounding the
     re-entry point would be executed, in the case of a then-allowed
     try-yield-finally pattern.
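
     As an illustration, assuming the restriction were lifted and
     generators grew the proposed close method (a sketch of the
     intended behavior, not of anything currently supported; the log
     list merely stands in for resource acquisition and release):

         log = []

         def produce():
             log.append("acquire")      # stands in for acquiring a resource
             try:
                 yield 1
                 yield 2
             finally:
                 log.append("release")  # should run on forced termination

         gen = produce()
         gen.next()    # the generator is now suspended inside the try clause
         gen.close()   # Return Semantics: as if a return were executed at
                       # the yield, so the finally clause appends "release"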

     Issues: is it important to be able to distinguish forced
     termination by close, normal termination, and exception propagation
     from the generator or generator-called code?  In the normal case it
     seems not: finally clauses should be there to work the same in all
     these cases.  Still, this semantics could make such a distinction
     hard.

     Except clauses, as with a normal return, are not executed.  Such
     clauses in legacy generators expect to be executed for exceptions
     raised by the generator or by code called from it, so not executing
     them in the close case seems correct.

     OR (Exception Semantics) The generator should be resumed and
     execution should continue as if a special-purpose exception
     (e.g. CloseGenerator) had been raised at the re-entry point.  The
     close implementation should consume this exception and not
     propagate it further.

     Issues: should StopIteration be reused for this purpose? Probably
     not.  We would like close to be a harmless operation for legacy
     generators, which could contain code catching StopIteration to deal
     with other generators/iterators.

     In general, with exception semantics, it is unclear what to do
     if the generator does not terminate or we do not receive the
     special exception propagated back.  Other exceptions should
     probably be propagated, but consider this possible legacy
     generator code:

         try:
             ...
             yield ...
             ...
         except: # or except Exception:, etc
             raise Exception("boom")

     If close is invoked with the generator suspended after the
     yield, the except clause would catch our special-purpose
     exception, so a different exception would be propagated back.
     In this case it ought reasonably to be consumed and ignored,
     whereas in general it should be propagated; separating these
     scenarios seems hard.
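
     Concretely (again a sketch: CloseGenerator is the hypothetical
     special-purpose exception, and the bare except is the legacy
     pattern quoted above):

         def legacy():
             try:
                 setup = 1               # stands in for code before the yield
                 yield setup
                 more_work = 2           # stands in for code after the yield
             except:                     # meant for errors in the body
                 raise Exception("boom")

         gen = legacy()
         gen.next()    # the generator is suspended at the yield
         gen.close()   # Exception Semantics: CloseGenerator is raised at
                       # the yield, the bare except catches it, and
                       # Exception("boom") is what comes back to close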

     The exception approach has the advantage of letting the generator
     distinguish between termination cases and giving it more control.
     On the other hand, clear-cut semantics seem harder to define.

     2) Generator destruction should invoke close method behavior.
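
     A best-effort policy of this kind can already be approximated
     today for iterators that provide a close method, by wrapping them
     in a holder whose destructor calls close (a hypothetical helper,
     sketched only to illustrate the intended destruction behavior):

         class ClosingWrapper:
             "Calls the wrapped iterator's close method on destruction."
             def __init__(self, iterator):
                 self.iterator = iterator
             def __iter__(self):
                 return iter(self.iterator)
             def close(self):
                 close = getattr(self.iterator, 'close', None)
                 if close is not None:
                     close()
             def __del__(self):
                 self.close()    # best-effort release when the wrapper
                                 # is garbage collected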


Remarks

     If this proposal is accepted, it should become common practice
     to document whether a generator acquires resources, so that
     callers know whether its close method ought to be called.
     Calling close on a generator that is no longer used should be
     harmless.

     On the other hand, in the typical scenario the code that
     instantiated the generator should call close if the generator
     requires it; generic code dealing with iterators/generators
     instantiated elsewhere should typically not be littered with
     close calls.

     The rare case of code that has acquired ownership of, and needs
     to deal properly with, all of iterators, generators and
     generators acquiring resources that need timely release, is
     easily solved:

         if hasattr(iterator,'close'):
             iterator.close()
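
     A small helper can keep such call sites tidy (a hypothetical
     convenience function, sketched here; any name would do):

         def close_if_possible(iterator):
             close = getattr(iterator, 'close', None)
             if close is not None:
                 close()

         all = all_lines("index.txt")
         try:
             for line in all:
                 pass                  # process the line
         finally:
             close_if_possible(all)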


Open Issues

     Definitive semantics ought to be chosen, and implementation
     issues should be explored.


Alternative Ideas

     The idea that the yield placement limitation should be removed
     and that generator destruction should trigger execution of finally
     clauses has been proposed more than once.  On its own, however, it
     cannot guarantee that timely release of resources acquired by a
     generator can be enforced.

     PEP 288 [2] proposes a more general solution, allowing custom
     exception passing to generators.


References

     [1] PEP 255 Simple Generators
         http://www.python.org/peps/pep-0255.html

     [2] PEP 288 Generators Attributes and Exceptions
         http://www.python.org/peps/pep-0288.html


Copyright

     This document has been placed in the public domain.




Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
End:



