[Python-Dev] Making weakref callbacks safe in cyclic gc

Fri Nov 14 23:17:45 EST 2003

I think I have a reasonably elegant scheme to nail this.  It's best
described as modifications to what cyclic gc already does.  So here's a
summary, with the new steps identified by [NEW]:

Cyclic gc first finds the maximal subset S of the objects in the current
generation such that no object in S is reachable from outside of S.  S is
the (possibly empty) set of cyclic trash in the current generation.

Next partition S into 5 ([NEW] -- is 3 today) disjoint sets:

1. Objects with __del__ methods.

2. Objects not in #1 reachable from an object in #1.

3. [NEW] Objects not in #1 or #2 with an associated weakref callback.

4. [NEW] Objects not in #1, #2 or #3 reachable from an object in #3.

5. Objects not in one of the other sets.

Then:

A. Call tp_clear on each object in set 5 (set 5 may mutate while
   this is going on, so that needs some care).  If an object's
   refcount isn't 0 after calling its tp_clear, move it to the
   next older generation (that doesn't preclude that a later tp_clear
   in this step may reclaim it).

B. [NEW] Invoke the callbacks associated with the objects still in
   set 3.  This also needs some care, as the deallocations occurring
   in step #A may remove objects from set 3, or even just remove
   the weak references to them so that the objects in set 3 are
   still there, but no longer have an associated callback.  I expect
   we'd have to contrive code to make that happen, but we have to
   be safe against every possibility.  The callbacks invoked during
   this step may also remove callbacks from objects in set 3 we
   haven't yet gotten to, or even add new callbacks to objects in
   sets 1 through 4.

C. [NEW] Move the objects still remaining in sets 3 and 4 to the
   youngest generation.

D. Move the objects still remaining in set 1 to gc.garbage.

E. Move the objects still remaining in set 2 to the next (older)
   generation.

That's telegraphic, and is bursting with subtleties.  Here are notes on the
new subtleties:

+ A key observation is that running weakref callbacks on the objects
  in set 3 can't have any effect on the objects in set 5, nor can the
  states of the objects in set 5 affect what a callback may want to do.
  This is so because no object in set 5 is reachable from an object
  in set 3:  a callback can neither consult nor alter a set 5 object.
  So clearing set 5 first (in step A) is harmless, and should allow
  most cyclic trash in most programs to get collected ASAP.

+ Clearing the objects in set 5 first is desirable also because
  doing so may break enough links that objects in sets 1 thru 4
  get deallocated naturally (meaning via the usual refcount-falls-to-0
  route).  Note that it's quite possible that objects in sets 1 thru
  4 are reachable from objects in set 5 -- it's the other direction
  where reachability can't hold (by construction of the partition,
  not by luck).

+ By the start of B, tp_clear hasn't been called on anything reachable
  from sets 3 or 4, so the callbacks "see" wholly intact objects.
  Nothing visible to the callbacks has been torn down:  __dicts__
  are still fully populated, __mro__ slots are still as they were,
  etc.  Step B doesn't do any tp_clear itself either, so the only
  mutations that occur are those performed by the callbacks.  If
  a callback destroys a piece of state some other callback wanted,
  that's entirely on the user's head.

+ Because a weakref callback destroys itself after it's called, in
  non-pathological programs no object in set 3 or 4 will have a
  weakref callback associated with it at the end of step B.  We
  cannot go on to call tp_clear on these objects, because the instant
  the first callback returns, we have no idea anymore which of these
  objects are still part of cyclic trash (the callbacks can resurrect
  any or all of them, ditto add new callbacks to any/all).

  Determining whether they are still trash requires doing live/dead
  analysis over from scratch.  Simply moving them into *some*
  generation ensures that they'll get analyzed again on a future run
  of cyclic gc.  Moving them into the youngest generation is done
  because they almost certainly are (in almost all programs, almost
  all of the time) still cyclic trash, and without new weakref
  callbacks.  Putting them in the youngest generation allows them
  to get reclaimed on the next gc invocation.

  In steady state for a sane program creating a sustained stream of
  cyclic trash with associated weakref callbacks, this delays their
  collection by one gc invocation:  the reclamation throughput should
  equal the rate of trash creation, but there's a one-invocation
  reclamation latency introduced at the start.  There's no new latency
  in invoking the callbacks.

+ Because we still won't collect cyclic trash with __del__ methods,
  or cyclic trash reachable from such trash, we do the partitioning
  in such a way that weakref callbacks on such trash don't get called
  at all -- we're not even going to try to reclaim them, so it may be
  surprising if their callbacks get invoked.  OTOH, it may be desired
  that their callbacks get invoked despite that gc will never try
  to reclaim them on its own.  Tough luck.  The callbacks will
  get invoked if and when the user breaks enough cycles in gc.garbage
  to avoid running afoul of the __del__ restriction.

Objections?  Great <wink> objections are of two kinds:  (1) it won't work;
and (2) it can't be sold for a bugfix release.  Note that 2.3.2 is
segfaulting today, so *something* has to be done for a bugfix release.  I
don't believe this scheme alters any defined semantics, and to the contrary
makes it possible to say for the first time that objects visible to
callbacks are never in mysteriously (and undefinedly so) partly-destroyed
states.  Objecting that the order of callback invocation isn't defined
doesn't hold, because the order isn't defined in 2.3.2 either.  Tempting as
it may be, a scheme that refused to collect cyclic trash with associated
weakref callbacks would be an incompatible change; Jim also has a use case
for that (a billion lines of Zope3 <wink>).