[Python-Dev] Making weakref callbacks safe in cyclic gc

Mon Nov 17 14:20:30 EST 2003

[Tim]
>> If the callback itself is part of the garbage getting collected,
>> then the weakref holding the callback must also be part of the
>> garbage getting collected (else the weakref holding the callback
>> would act as an external root, preventing the callback from being
>> part of the garbage being collected too).
>>
>> My thought then was that a simpler scheme could simply call tp_clear
>> on the trash weakrefs first.  Calling tp_clear on a weakref just
>> throws away the associated callbacks (if any) unexecuted, and if
>> they don't get run then we have no reason to care what's reachable
>> from them anymore.

[Neil Schemenauer]
> This I don't get.  Don't people want the callbacks to be called?

The one person I know who cares about this a lot is Jim, and he was happy to
have his callbacks raise mystery exceptions, just not segfaults <wink>.  But
if he doesn't care whether his callbacks "do something" in this context, he
can't care whether they don't get run at all in this context either.

When a weakref goes away, its callback (if any) goes away too, unexecuted,
cyclic gc or not.  If the weakref is part of cyclic trash, then clearing it
up first is defensible -- that may have happened in 2.3.2 already, as the
order in which gc invokes tp_clear is mostly accidental.  If I can force the
order in such a way as to reliably prevent disasters, that's a good
tradeoff.  If users want a guarantee that a weakref callback gets invoked,
then they have to ensure that the weakref itself outlives the object whose
death triggers that weakref's callback.  They have to do that today too,
with or without cyclic gc:

>>> def cb(ignore): return 1/0
...
>>> import weakref
>>> class C: pass
...
>>> c = C()
>>> wr = weakref.ref(c, cb)
>>> del wr
>>> del c
>>>

Once the weakref is cleared, the callback is history.  When a weakref is
part of a trash cycle, may as well clear it first.
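
For contrast, here is a small script version of the session above that tries
both orders of death.  It's only a sketch:  the helper name run() and its flag
are made up for illustration, and the "fires immediately" behavior assumes
CPython's reference counting.

```python
import gc
import weakref

class C:
    pass

def run(weakref_dies_first):
    """Return how many times the callback fired for one object."""
    fired = []
    c = C()
    wr = weakref.ref(c, lambda ref: fired.append(ref))
    if weakref_dies_first:
        del wr      # the callback is discarded along with the weakref
        del c
    else:
        del c       # the weakref is still alive, so the callback runs
    gc.collect()    # not needed under CPython refcounting; harmless elsewhere
    return len(fired)
```

Calling run(True) reports no callback invocations, while run(False) reports
one, which is the "weakref must outlive the object" rule in miniature.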

> I don't see how a weakref callback is different than a __del__
> method.  While the object is not always reachable from the callback
> it could be (e.g. the callback could be a method).  The fact that
> callbacks are one shot doesn't seem to help either since the
> callback can create a new callback.

It's the one-shot business that (I think) makes them easier to live with,
together with the fact that a callback vanishes if the weakref holding it
goes away.  A __del__ method never goes away.  While a callback *can* install
new callbacks, all over the place, I don't expect that real code does that.
For code that doesn't, gc can make good progress.
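
To make the "callback can create a new callback" worry concrete, here is a
minimal sketch of a callback that re-arms itself on a fresh object each time
it runs.  The names fired, survivors and refs are invented for the example,
and the timing assumes CPython's reference counting.

```python
import weakref

class C:
    pass

fired = []       # one entry per callback invocation
survivors = []   # strong refs to the objects the callback creates
refs = []        # strong refs to the weakrefs, so their callbacks stay armed

def cb(dead_ref):
    fired.append(dead_ref)
    obj = C()
    survivors.append(obj)
    refs.append(weakref.ref(obj, cb))   # install a new callback from a callback

first = C()
refs.append(weakref.ref(first, cb))
del first          # cb runs once and arms itself on a fresh object
survivors.pop()    # dropping that object runs cb again, and so on
```

Each individual weakref still fires at most once; the chain only continues
because the callback keeps manufacturing new weakrefs.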

Java's flavor of __del__ method executes at most once:  if an object is
resurrected by its finalizer, that object's finalizer will never be run
again (unless invoked explicitly by the user).  That allows Java's gc to
make good progress in the presence of resurrecting finalizers too:
finalizers (if any) in cycles are run in an arbitrary order, and if any were
run gc has to give up on finishing tearing down the objects (it can't know
whether finalizers have resurrected objects until gc runs again).  In the
absence of resurrection, though, the next time gc runs, all the objects it
ran finalizers on before are almost certainly still trash, and it can
reclaim the memory without running dangerous finalizers again first.  The
patch I posted for weakrefs took a similar approach.

Java doesn't allow adding callbacks to its elaborate weakrefs, though.  It's
more like the way we treat gc.garbage:  you can optionally specify a
ReferenceQueue object with a Java weakref, and when the referenced object is
dead the weakref is added to the queue, for user inspection (well, I guess
it's a little different for Java's "phantom references", but who cares ...).

So I've been moving to a scheme where we treat finalizers like Java treats
weakrefs, and we treat weakref callbacks like Java treats finalizers <wink>.

The Java weakref facilities would be a lot easier for gc to live with, but
too late for that.

Jim emphatically doesn't want to poll gc.garbage looking for weakrefs that
appear in cycles.  Maybe "tough luck" is the best response we can come up
with to that, but cycles are getting very easy to create in Python by
accident, so I don't really want to settle for that.  OTOH, people can write
__del__ methods that don't provoke leaks, and I suspect they could learn how
to write weakrefs that don't provoke leaks too (assuming we changed Python
to treat "has a weakref callback" the same as "has a __del__ method").  One
way to do that was mentioned above, ensuring that a weakref outlives the
object whose death triggers the weakref's callback.  Or ensuring the
reverse.  It's only letting them die "at the same time" in a trash cycle
that creates trouble.

If the weakref and that object are both in the same clump of cyclic trash,
it's unpredictable what happens in 2.3.2.  If the weakref suffers tp_clear()
first, the callback won't get invoked; if the object suffers tp_clear()
first, the callback will get invoked -- but may lead to segfaults or lesser
surprises.
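
The "same clump of cyclic trash" case can be set up as below.  This is a
sketch, not a statement about 2.3.2:  the Node class and make_trash() are
hypothetical, and the outcome described afterwards is what I understand
current CPython releases to do, where a weakref that is itself trash gets
cleared without its callback being run.

```python
import gc
import weakref

class Node:
    pass

fired = []

def make_trash():
    n = Node()
    n.self_ref = n   # a cycle, so only cyclic gc can reclaim n
    # This weakref is reachable only from n, so it dies in the same trash cycle.
    n.wr = weakref.ref(n, lambda ref: fired.append("inside"))
    # This weakref is returned to the caller, so it outlives n.
    return weakref.ref(n, lambda ref: fired.append("outside"))

outside = make_trash()
gc.collect()
```

After the collection, only the callback on the surviving weakref has run, and
outside() returns None.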

We can certainly repair that by treating objects with callbacks the same as
objects with __del__ methods when they're in cyclic trash, and that's an
easy change to the implementation.  Then the objects with callbacks, and
everything reachable from them, leak unless/until the user snaps enough
cycles in gc.garbage.

I don't have a feel for how much trouble it would be to avoid running afoul
of that.  Jim has so far presented it as an unacceptable burden.

Another scheme is to just run all the weakref callbacks associated with
trash cycles, without tp_clear'ing anything first.  Then run gc again to
figure out what's still trash, and repeat until no more weakref callbacks in
trash cycles exist.  If the weakref implementation is changed to forbid
creating a new weakref callback while a weakref callback is executing, that
gc-loop must eventually terminate (after the first try even in most code
that does manage to put weakref callbacks in trash cycles).

Beats me ...