[Python-Dev] finalization again

Tim Peters tim_one@email.msn.com
Sat, 11 Mar 2000 15:10:23 -0500


[Barry A. Warsaw, jamming after hours]
> ...
> What if you timestamp instances when you create them?  Then when you
> have trash cycles with finalizers, you sort them and finalize in
> chronological order.

Well, I strongly agree that would be better than finalizing them in
increasing order of storage address <wink>.

> ...
> - FIFO order /seems/ more natural to me than FILO,

Forget cycles for a moment, and consider just programs that manipulate
*immutable* containers (the simplest kind to think about):  at the time you
create an immutable container, everything *contained* must already be in
existence, so every pointer goes from a newer object (container) to an older
one (containee).  This is the "deep" reason why, e.g., you can't build a
cycle out of pure tuples in Python (if every pointer goes new->old, you
can't get a loop, else each node in the loop would be (transitively) older
than itself!).
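
A minimal sketch of the new->old argument.  The `Frozen` class and the
global creation counter are hypothetical, made up for illustration; nothing
in the thread proposes them.  The point is only that when children are fixed
at construction time, every reference runs from a later stamp to an earlier
one, so no loop can form:

```python
import itertools

_clock = itertools.count()  # hypothetical source of creation-time stamps


class Frozen:
    """Immutable-style node: its children are fixed when it is constructed."""

    __slots__ = ("stamp", "children")

    def __init__(self, *children):
        # The children had to exist already in order to be passed in,
        # so each of them carries an earlier stamp than this node.
        object.__setattr__(self, "children", tuple(children))
        object.__setattr__(self, "stamp", next(_clock))

    def __setattr__(self, name, value):
        raise AttributeError("Frozen nodes cannot be mutated")


def points_only_to_older(node):
    """True if every reference reachable from node goes new->old."""
    return all(
        child.stamp < node.stamp and points_only_to_older(child)
        for child in node.children
    )


a = Frozen()
b = Frozen(a)
c = Frozen(a, b)
```

Here `points_only_to_older(c)` holds, and there is no way to make `a` refer
to `c` after the fact, which is what a cycle would need.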

Then, since a finalizer can see objects pointed *to*, a finalizer can see
only older objects.  Since it's desirable that a finalizer see only wholly
intact (unfinalized) objects, it is in fact the oldest object ("first in")
that needs to be cleaned up last ("last out").  So, under the assumption of
immutability, FILO is sufficient, but FIFO is dangerous.  So your muse
inflamed you with an interesting tune, but you fingered the riff backwards
<wink>.
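
The FILO-vs-FIFO claim can be checked with a toy simulation.  This is a
sketch under assumptions: the `Node` class, the explicit `stamp` argument and
`run_finalizers` are hypothetical names, and "finalizing" is modeled as just
setting a flag.

```python
class Node:
    """Toy object with a creation stamp and references to older objects."""

    def __init__(self, name, stamp, *children):
        self.name = name
        self.stamp = stamp
        self.children = children
        self.finalized = False


def run_finalizers(nodes, newest_first):
    """Finalize every node in stamp order.

    Returns the names of nodes whose "finalizer" saw an already
    finalized child.
    """
    for node in nodes:
        node.finalized = False
    saw_dead = []
    for node in sorted(nodes, key=lambda n: n.stamp, reverse=newest_first):
        if any(child.finalized for child in node.children):
            saw_dead.append(node.name)
        node.finalized = True
    return saw_dead


# Built bottom-up, as an immutable structure must be.
leaf = Node("leaf", 0)
mid = Node("mid", 1, leaf)
root = Node("root", 2, mid, leaf)
nodes = [leaf, mid, root]

filo = run_finalizers(nodes, newest_first=True)   # []
fifo = run_finalizers(nodes, newest_first=False)  # ['mid', 'root']
```

Newest-first never shows a finalizer a dead object; oldest-first does so for
every container.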

One problem is that it all goes out the window as soon as mutation is
allowed.  It's *still* desirable that a finalizer see only unfinalized
objects, but in the presence of mutation that no longer bears any
relationship to relative creation time.
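
To illustrate how mutation breaks the link with creation time, here is a
small sketch.  `Box` and `newest_first_violations` are hypothetical names,
and finalization is again modeled as a flag.  One `append` is enough to make
an older object point at a newer one, at which point newest-first order shows
the older object's finalizer a dead referent:

```python
import itertools

_clock = itertools.count()  # hypothetical creation-time stamps


class Box:
    """Mutable container, stamped when created."""

    def __init__(self, name):
        self.name = name
        self.stamp = next(_clock)
        self.items = []
        self.finalized = False


def newest_first_violations(boxes):
    """Finalize newest-first; return names whose finalizer saw a dead item."""
    for box in boxes:
        box.finalized = False
    saw_dead = []
    for box in sorted(boxes, key=lambda b: b.stamp, reverse=True):
        if any(item.finalized for item in box.items):
            saw_dead.append(box.name)
        box.finalized = True
    return saw_dead


old = Box("old")
new = Box("new")

old.items.append(new)  # mutation: an old->new pointer
after_mutation = newest_first_violations([old, new])  # ['old']

new.items.append(old)  # and now a genuine cycle
after_cycle = newest_first_violations([old, new])     # ['old']
```

Once the cycle exists, no ordering at all can let both finalizers see only
unfinalized objects.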

Another problem is in Guido's directory example, which we can twist to view
as an "immutable container" problem that builds its image of the directory
bottom-up, and where a finalizer on each node tries to remove the file (or
delete the directory, whichever the node represents).  In this case the
physical remove/delete/unlink operations have to follow a *postorder*
traversal of the container tree, so that "finalizer sees only unfinalized
objects" is the opposite of what the app needs!

The lesson to take from that is that the implementation can't possibly guess
what ordering an app may need in a fancy finalizer.  At best it can promise
to follow a "natural" ordering based on the points-to relationship, and
while "finalizer sees only unfinalized objects" is at least clear, it's
quite possibly unhelpful.  In Guido's particular case it *can* be exploited,
though, by adding a postorder remove/delete/unlink method to nodes and
explicitly calling it from __del__:  "the rules" guarantee that the root of
the tree will get finalized first, and the code can rely on that in its own
explicit postorder traversal.
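
A sketch of that workaround, with assumptions labeled: the `Node` class, its
`remove` method and the `removed` flag are hypothetical, the tree is built in
a temporary directory, and the demo relies on CPython's reference counting to
run the root's __del__ as soon as the last reference goes away.

```python
import os
import tempfile


class Node:
    """Hypothetical in-memory image of one file or directory."""

    def __init__(self, path, children=()):
        self.path = path
        self.children = tuple(children)  # built bottom-up: children are older
        self.removed = False

    def remove(self):
        """Postorder: remove everything below this node, then the node."""
        if self.removed:
            return  # already handled by an ancestor's traversal
        for child in self.children:
            child.remove()
        if os.path.isdir(self.path):
            os.rmdir(self.path)
        elif os.path.exists(self.path):
            os.remove(self.path)
        self.removed = True

    def __del__(self):
        # The root is finalized first, so its traversal does all the work;
        # by the time the children's __del__ runs, remove() is a no-op.
        self.remove()


# Build a real tree on disk, then its image, bottom-up.
top = tempfile.mkdtemp()
sub = os.path.join(top, "sub")
os.mkdir(sub)
file_path = os.path.join(sub, "data.txt")
with open(file_path, "w") as handle:
    handle.write("x")

file_node = Node(file_path)
sub_node = Node(sub, [file_node])
root = Node(top, [sub_node])

del file_node, sub_node  # still alive: the root holds them
del root                 # root's __del__ runs the postorder removal
```

The `removed` flag is what makes it harmless for the children's own __del__
to fire afterwards.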

>   but then I rarely create cyclic objects, and almost never use __del__,
>   so this whole argument has been somewhat academic to me :).

Well, not a one of us creates cycles often in CPython today, simply because
we don't want to track down leaks <0.5 wink>.  It seems that nobody here
uses __del__ much, either; indeed, my primary use of __del__ is simply to
call an explicit break_cycles() function from the header node of a graph!
The need for that goes away as soon as Python reclaims cycles by itself, and
I may never use __del__ at all then in the vast bulk of my code.
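
For readers who haven't seen the idiom, a rough sketch of a header node that
breaks cycles from __del__.  The `Graph` and `GraphNode` classes and their
methods are hypothetical, not code from the thread, and the check at the end
assumes CPython's reference counting.  The trick is that nodes may point at
each other but never back at the header, so the header itself is not in a
cycle and its __del__ runs promptly:

```python
import gc
import weakref


class GraphNode:
    def __init__(self, name):
        self.name = name
        self.neighbors = []  # may form cycles with other nodes


class Graph:
    """Header object: owns the nodes, and is never referenced by them."""

    def __init__(self):
        self.nodes = []

    def add_node(self, name):
        node = GraphNode(name)
        self.nodes.append(node)
        return node

    def connect(self, a, b):
        a.neighbors.append(b)

    def break_cycles(self):
        # Drop every node->node reference so refcounts can reach zero.
        for node in self.nodes:
            node.neighbors.clear()
        self.nodes.clear()

    def __del__(self):
        self.break_cycles()


def nodes_freed_without_collector():
    """True if a cyclic graph is reclaimed with the cycle collector off."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        g = Graph()
        a, b = g.add_node("a"), g.add_node("b")
        g.connect(a, b)
        g.connect(b, a)  # a <-> b is a cycle
        probe = weakref.ref(a)
        del a, b, g      # header's __del__ breaks the cycle
        return probe() is None
    finally:
        if was_enabled:
            gc.enable()
```

Calling `nodes_freed_without_collector()` shows the nodes going away on
reference counts alone, which is exactly the chore a cycle-reclaiming Python
would make unnecessary.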

It's because we've seen no evidence here (and I've seen none elsewhere
either) that *anyone* is keen on mixing cycles with finalizers
that I've been so persistent in saying "screw it -- let it leak, but let the
user get at it if they insist on doing it".  Seems we're trying to provide
slick support for something nobody wants to do.  If it happens by accident
anyway, well, people sometimes divide by 0 by accident too <0.0 wink>:  give
them a way to know about it, but don't move heaven & earth trying to treat
it like a normal case.

if-it-were-easy-to-implement-i-wouldn't-care-ly y'rs  - tim