[Python-Dev] weakref gc semantics

Tim Peters tim.peters at gmail.com
Wed Nov 3 05:40:05 CET 2004


This example bothers me a little, under current Python CVS:

"""
import gc, weakref

def C_gone(ignored):
    print "An object of type C went away."

class C:
    def __init__(self):
        self.wr = weakref.ref(self, C_gone)

print "creating non-cyclic C instance"
c = C()

print "creating cyclic C instance and destroying old one"
c = C()  # triggers the print
c.loop = c

print "getting rid of the cyclic one"
c = None  # doesn't trigger the print
gc.collect()  # ditto
 """

Output:

    creating non-cyclic C instance
    creating cyclic C instance and destroying old one
    An object of type C went away.
    getting rid of the cyclic one

An instance c of C gets a strong reference (self.wr) to a weak
reference W to itself.  At the end, c is in a cycle, but W isn't.  W
ends up reachable only *from* stuff in a cycle (namely c).

gc suppresses the callback then because W is unreachable, not really
because W is in a cycle.  If self.wr had a __del__ method instead, it
would have been executed.  So it bothers me a little that the callback
isn't:  a reachable callback can have visible effects, and there
really isn't ambiguity about which of c and W "dies first" (c does) in
the example.

The question is whether gc should really be looking at whether the
weakref *callback* is reachable, regardless of whether the weakref
itself is reachable (if the weakref is reachable, its callback is too,
and 2.4b2 invokes it -- the only case in question is the one in the
example, where the weakref and the weakref's referent are unreachable
but the weakref's callback is reachable).

Pythons 2.2.3 and 2.3.4 produce the same output, so no way do I want
to change Python 2.4 at this late stage.  It's a question for 2.5.  It
gets more complicated than the above, of course.  For example, we can
end up with weakrefs *in* cycles, with reachable callbacks, and then
there's no good way to decide which callback to invoke first.  That's
the same ordering problem we have when objects with __del__ methods
are in trash cycles.  It's not intractable, though, because gc invokes
weakref callbacks before tp_clear has been invoked on anything, and
reachable weakref callbacks can't resurrect any trash (anything they
can access is, by definition of "reachable", not trash).

A hidden agenda is that I like weakref callbacks a lot more than
__del__ methods, and would like to see them strong enough so that
Python 3 could leave __del__ out.  Planting a weakref to self *in*
self (as in the example at the top) is one way to approach
critical-resource cleanup in a __del__-free world (a weakref subclass
could remember the critical resource when an instance was constructed,
and the callback could access the critical resource to clean it up --
but, as the example shows, that doesn't work now if the containing
instance ends up in cyclic trash; but there are other ways that do
work).


More information about the Python-Dev mailing list