[Python-Dev] Is there any remaining reason why weakref callbacks shouldn't be able to access the referenced object?

Fri Oct 21 03:09:21 EDT 2016

Hi all,

It's an old feature of the weakref API that you can define an
arbitrary callback to be invoked when the referenced object dies, and
that when this callback is invoked, it gets handed the weakref wrapper
object -- BUT, only after it's been cleared, so that the callback
can't access the originally referenced object. (I.e., this callback
will never raise: def callback(ref): assert ref() is None.)
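
A minimal sketch of the current behavior, for concreteness. The names Resource and seen are made up for illustration; only weakref.ref and gc.collect are real APIs here.

```python
import gc
import weakref


class Resource:
    """Placeholder class; plain object() instances cannot be weakly referenced."""


seen = []


def callback(ref):
    # The callback receives the weakref wrapper itself, but the wrapper
    # has already been cleared, so dereferencing it yields None.
    seen.append(ref())


obj = Resource()
r = weakref.ref(obj, callback)

del obj        # on CPython the refcount drops to zero right here
gc.collect()   # harmless on CPython; gives other implementations a chance to collect
```

After this runs, seen should hold a single None: the callback fired, but had no way to reach the Resource instance.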

AFAICT the original motivation for this was that if the weakref
callback could get at the object, then the weakref callback would
effectively be another finalizer like __del__, and finalizers and
reference cycles don't mix, so weakref callbacks can't be finalizers.
There's a long document from the 2.4 days about all the terrible
things that could happen if arbitrary code like callbacks could get
unfettered access to cyclic isolates at weakref cleanup time [1].

But that was 2.4. In the meantime, of course, PEP 442 fixed it so
that finalizers and weakrefs mix just fine. In fact, weakref callbacks
are now run *before* __del__ methods [2], so clearly it's now okay for
arbitrary code to touch the objects during that phase of the GC -- at
least in principle.
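
A small sketch of that ordering, assuming CPython 3.4 or later and an object that is only reachable through a reference cycle (so that the cyclic GC, not plain refcounting, frees it). Node and order are illustrative names.

```python
import gc
import weakref

order = []


class Node:
    def __init__(self):
        # Self-reference creates a cycle, so only the cyclic GC can free this.
        self.me = self

    def __del__(self):
        order.append("__del__")


node = Node()
# The weakref itself stays reachable (bound to r), so its callback is eligible to run.
r = weakref.ref(node, lambda ref: order.append("callback"))

del node       # the cycle keeps the object alive for now
gc.collect()   # the collector handles weakrefs first, then finalizers
```

Under those assumptions I'd expect order to end up as ["callback", "__del__"]. Note this is specific to the cyclic GC path; an object freed by plain refcounting goes through a different code path.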

So what I'm wondering is, would anything terrible happen if we started
passing still-live weakrefs into weakref callbacks, and then clearing
them afterwards? (i.e. making step 1 of the PEP 442 cleanup order be
"run callbacks and then clear weakrefs", instead of the current "clear
weakrefs and then run callbacks"). I skimmed through the PEP 442
discussion, and AFAICT the rationale for keeping the old weakref
behavior was just that no-one could be bothered to mess with it [3].

[The motivation for my question is partly curiosity, and partly that
in the discussion about how to handle GC for async objects, it
occurred to me that it might be very nice if arbitrary classes that
needed access to the event loop during cleanup could do something like

  def __init__(self, ...):
      loop = asyncio.get_event_loop()
      loop.gc_register(self)

  # automatically called by the loop when I am GC'ed; async equivalent
  # of __del__
  async def aclose(self):
      ...

Right now something *sort* of like this is possible but it requires a
much more cumbersome API, where every class would have to implement
logic to fetch a cleanup callback from the loop, store it, and then
call it from its __del__ method -- like how PEP 525 does it. Delaying
weakref clearing would make this simpler API possible.]
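
For comparison, here is a rough sketch of the more cumbersome pattern that is possible today. ToyLoop, get_finalizer, run_pending and Connection are all hypothetical stand-ins invented for this example; none of them is an asyncio API, and PEP 525's actual mechanism differs in detail.

```python
import asyncio


class ToyLoop:
    """Hypothetical stand-in for an event loop (not part of asyncio)."""

    def __init__(self):
        self.pending = []

    def get_finalizer(self):
        # Hand out a callable that queues obj.aclose() for later execution.
        def finalizer(obj):
            self.pending.append(obj.aclose())
        return finalizer

    def run_pending(self):
        # Run every queued cleanup coroutine to completion.
        async def drain():
            while self.pending:
                await self.pending.pop()
        asyncio.run(drain())


class Connection:
    def __init__(self, loop, log):
        # Every class has to fetch and store the hook itself...
        self._finalizer = loop.get_finalizer()
        self._log = log

    async def aclose(self):
        self._log.append("closed")

    def __del__(self):
        # ...and call it from __del__, which temporarily resurrects self
        # because the queued coroutine holds a reference to it.
        self._finalizer(self)


log = []
loop = ToyLoop()
conn = Connection(loop, log)
del conn            # __del__ queues aclose() on the toy loop
loop.run_pending()  # the async cleanup actually runs here
```

The point is just that the fetch/store/call boilerplate lives in every class; with live weakrefs passed to callbacks, the loop could own all of that logic instead.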

-n

[1] https://github.com/python/cpython/blob/master/Modules/gc_weakref.txt
[2] https://www.python.org/dev/peps/pep-0442/#id7
[3] https://mail.python.org/pipermail/python-dev/2013-May/126592.html

-- 
Nathaniel J. Smith -- https://vorpus.org