[Python-Dev] Garbage collecting closures

Guido van Rossum guido@python.org
Mon, 14 Apr 2003 20:55:21 -0400


> > Finalizers seem useful in general, but I would still worry about any
> > specific program that managed critical resources using finalizers.
> 
> What *are* they useful for, then? Or are they only useful "in
> general", and never in any particular case? :-)

Finalizers are a necessary evil.  For example, when I create a Python
file type that encapsulates an external resource like a file
descriptor as returned by os.open(), together with a buffer, I really
want to be able to specify a finalizer that flushes the write buffer
and closes the file descriptor.  But I also really want the
application not to rely on that finalizer!

Note that as a library developer, I can write the file type careful to
avoid being part of any cycles, so the restriction on finalizers that
are part of cycles doesn't bother me too much: I'm doing all I can,
and if a file is nevertheless kept alive by a cycle in the
application's code, the application has to deal with this (same as
with a file type implemented in C, for which the restriction on
finalizers in cycles doesn't hold).

Why do I, as library developer, want the finalizer?  Because I don't
want to rely on the application to keep track of when a file must be
closed.

But then why do I (still as library developer) recommend that the
application closes files explicitly?  Because there's no guarantee
*when* finalizers are run, and it's easy for the application to create
a cycle unknowingly (as we've seen in Paul's case).

Basically, the dual requirement is there to avoid the application and
the library to pointing fingers at each other when there's a problem
with leaking file descriptors.

This makes me think that Python should run the garbage collector
before exiting, so that finalizers on objects that were previously
kept alive by cycles are called (even if finalizers on objects that
are *part* of a cycle still won't be called).

I also think that if a strongly connected component (a stronger
concept than cycle) has exactly one object with a finalizer in it,
that finalizer should be called, and then the object should somehow be
marked as having been finalized (maybe a separate GC queue could be
used for this) in case it is resurrected by its finalizer.

--Guido van Rossum (home page: http://www.python.org/~guido/)