[Python-Dev] Making weakref callbacks safe in cyclic gc

Mon Nov 17 11:06:57 EST 2003

[Tim, on <http://www.python.org/sf/843455>]
> ...
> It's exactly the scheme I described, and the coding went smoothly
> because it was something that could be (and was) fully thought-out in
> advance.  That doesn't rule out conceptual or coding errors, though.

As I noted on the patch in the wee hours, "conceptual errors" wins.  I
out-thought a wrong thing, but one that happened to be good enough to fix
all the new test cases:  it doesn't really matter which objects are
reachable from the objects whose deaths trigger callbacks, what really
matters is which objects are reachable from the callbacks themselves.  The
test cases were so incestuous (objects all pointing to each other) that
those turned out to be the same sets, but that's not a necessary outcome --
although it appears to be a likely outcome.

Here's one that's surprising after the patch:

"""
import weakref, gc

class C:
    def cb(self, ignore):
        print self.__dict__

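c1, c2 = C(), C()

c2.me = c2                      # self-cycle: only cyclic gc can reclaim c2
c2.c1 = c1                      # c1 hangs off the cycle, it isn't in one
c2.wr = weakref.ref(c1, c2.cb)  # the callback is a bound method holding c2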

del c1, c2
print 'about to collect'
gc.collect()
print 'collected'
"""

The callback triggers on the death of c1 then, but c1 isn't in a cycle at
all (it's hanging *off* a cycle), and c2 isn't reachable from c1.  But c2 is
reachable from the callback.

c2 is in a self-cycle via c2.me, and in another via c2.wr (which indirectly
points back to c2 via the weakref's bound method object c2.cb).

After the patch, c1 ends up in the set of objects with an associated weakref
callback, but c2 isn't reachable from that set so tp_clear is called on c2.
That destroys c2's __dict__ before the callback can get invoked, so when c1
dies the callback sees a tp_clear'ed c2:

    about to collect
    {}
    collected

I know it's hard for people to get excited about an empty dict <wink>.  But
that's not the point:  the point is that if it's possible to expose an
object that's been tp_clear'ed to Python code, then *anything* can happen.
For example, this minor variation segfaults after the patch, right after
printing "about to collect":

"""
import weakref, gc

class C(object):
    def cb(self, ignore):
        print self.__dict__

class D:
    pass

c1, c2 = D(), C()

c2.me = c2
c2.c1 = c1
c2.wr = weakref.ref(c1, c2.cb)

del c1, c2, C, D
print 'about to collect'
gc.collect()
print 'collected'
"""

In the first example, class C was reachable from c1, and that protected C
from getting tp_clear'ed at all, which was something the patch was trying to
accomplish.  Giving c1 a different class took away C's tp_clear immunity,
yet C is still reachable from the callback.  Boom.

So what's reachable from a callback?  If the callback is not *itself* part
of the garbage getting collected, then it acts like an external root, and so
nothing reachable from the callback is part of the garbage getting collected
either.  gc has no worries then.
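
A minimal sketch of that first case, in Python 3 syntax rather than the
Python 2 of the examples above.  The names (live_callback_demo, Node, keeper,
seen) are invented for the illustration, and it assumes CPython's collector:
the weakref is held by a live local variable, so the callback and everything
it can reach stay outside the garbage.

```python
import gc
import weakref


def live_callback_demo():
    """Weakref held by a live variable: the callback acts like an external root."""
    seen = []

    class Node:
        pass

    keeper = Node()              # only reachable through the callback's closure
    keeper.payload = "intact"

    def cb(ref):
        # Runs when the target dies; keeper was never part of the garbage.
        seen.append(keeper.payload)

    target = Node()
    target.me = target           # self-cycle, so only cyclic gc can reclaim it
    wr = weakref.ref(target, cb) # wr stays alive in this frame
    del target
    gc.collect()
    return seen, wr() is None


if __name__ == "__main__":
    print(live_callback_demo())
```

Under those assumptions the callback should see keeper fully intact, since
nothing it can reach was ever a candidate for tp_clear.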

If the callback itself is part of the garbage getting collected, then the
weakref holding the callback must also be part of the garbage getting
collected (else the weakref holding the callback would act as an external
root, preventing the callback from being part of the garbage being collected
too).

My thought then was that a simpler scheme could simply call tp_clear on the
trash weakrefs first.  Calling tp_clear on a weakref just throws away the
associated callbacks (if any) unexecuted, and if they don't get run then we
have no reason to care what's reachable from them anymore.
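
A sketch of what "thrown away unexecuted" looks like from Python code, again
in Python 3 syntax.  It reuses the shape of the first example above and
assumes a CPython whose collector skips callbacks belonging to weakrefs that
are themselves trash; that is my understanding of current CPython behavior,
not something established in this message.  trash_weakref_demo and calls are
invented names.

```python
import gc
import weakref


def trash_weakref_demo():
    """Weakref stored inside the garbage: its callback is dropped, not run."""
    calls = []

    class C:
        def cb(self, ignore):
            # Would record what the callback sees, if it were ever invoked.
            calls.append(dict(self.__dict__))

    c1, c2 = C(), C()
    c2.me = c2                      # self-cycle
    c2.c1 = c1                      # c1 hangs off the cycle
    c2.wr = weakref.ref(c1, c2.cb)  # the weakref itself lives in the trash

    del c1, c2
    gc.collect()
    return calls


if __name__ == "__main__":
    print(trash_weakref_demo())
```

If the assumption holds, calls comes back empty: the callback never runs, so
it never gets a chance to look at a tp_clear'ed c2.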

The fly in that ointment appears to be that a callback can itself be the
target of a weakref, so that when the callback is thrown away, it can
trigger calling another callback.  At that point I fell asleep muttering
unspeakable oaths.
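
A small sketch of a callback that is itself the target of a weakref, in
Python 3 syntax.  It doesn't involve cyclic gc at all: it leans on CPython's
reference counting to show that discarding one callback unexecuted can cause
a second callback to run.  All the names here (chained_callback_demo,
first_cb, second_cb, watcher) are invented for the illustration.

```python
import weakref


def chained_callback_demo():
    """Throwing away a callback can itself trigger another weakref callback."""
    log = []

    class Target:
        pass

    def first_cb(ref):
        log.append("first")

    def second_cb(ref):
        log.append("second")

    target = Target()
    wr = weakref.ref(target, first_cb)          # first_cb is wr's callback
    watcher = weakref.ref(first_cb, second_cb)  # and also a weakref target

    del first_cb  # now only wr keeps first_cb alive
    del wr        # wr goes away while target lives: first_cb is discarded
                  # unexecuted, and its death is what watcher is waiting for
    return log, watcher() is None


if __name__ == "__main__":
    print(chained_callback_demo())
```

Here first_cb never runs, but second_cb does, which is the kind of knock-on
effect that makes "just throw the callbacks away" less simple than it sounds.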