[Python-Dev] Re: More fun with Python shutdown

Tue Nov 11 15:41:11 EST 2003

[Phillip J. Eby]
> ...
> Actually, the funny thing here is that it's unlikely that the cycle a
> type is in involves its base classes.

Well, all new-style classes are in cycles with bases:

>>> class C(object): pass
...
>>> object.__subclasses__()[-1]  # so C is reachable from object
<class '__main__.C'>
>>> C.__mro__                    # and object is reachable from C
(<class '__main__.C'>, <type 'object'>)
>>>

For that matter, since the first element of the MRO is the class itself, a
new-style class is in a self-cycle.  That also requires clearing the MRO to
break.
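
The same facts can be checked programmatically on a current CPython.  This is only a sketch: the helper name mro_cycle_facts is made up for illustration, and it confirms reachability in both directions plus the self-reference through the MRO, nothing more.

```python
def mro_cycle_facts():
    # Hypothetical helper, not part of any library.
    class C(object):
        pass

    return {
        # C is reachable from object via its subclass list
        "reachable_from_object": C in object.__subclasses__(),
        # object is reachable from C via the MRO
        "object_in_mro": object in C.__mro__,
        # the first MRO entry is the class itself: a self-cycle
        "mro_starts_with_self": C.__mro__[0] is C,
    }


if __name__ == "__main__":
    print(mro_cycle_facts())
```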

IIRC, one of the reasons Guido wanted to call gc during finalization was to
give these new-style class cycles a chance to destroy themselves cleanly.

> ...
> What's baffling me is what code is accessing the class after tp_clear
> is called.  It can't be a __del__ method, or the cycle collector
> wouldn't be calling tp_clear, right?  Or does it run __del__ methods
> during shutdown?

Jim explained -- as best we can without a finite test case to nail it.

There does seem to be an assumption that a class object won't get collected
if any instance of the class is still around:  each instance holds a
reference to its class, so a live instance keeps the class alive.  But, if
the class object and all remaining
instances are all in one cycle, and that cycle is unreachable from outside,
and the class doesn't define a __del__ method, then I *expect* gc would try
to clean up the dead cycle.  In that case, gc starts calling tp_clear slots
in a seemingly arbitrary order.  If the destruction of a class instance then
happened to trigger a weakref callback which in turn tried to access an
attribute of the class, and the class had already been through its tp_clear,
then a NULL-pointer dereference (due to the cleared tp_mro slot) would be
unavoidable.

But if that's what's happening, then tricks like the one on the table may
not be enough to stop segfaults:  replacing tp_mro with an empty tuple only
"works" so long as the class object hasn't also been thru its tp_dealloc
routine.  Once it goes thru tp_dealloc, the memory is recyclable heap trash,
and tp_mro may or may not retain the bits that "look like" a pointer to an
empty tuple by the time some weakref callback triggers an access to them.
In a release build it's likely that the "pointer to an empty tuple" will
survive across deallocation for at least a little while, because tp_mro
isn't near an end of the object (so is unlikely to get overwritten by
malloc's or pymalloc's internal bookkeeping pointers).  It's a crapshoot,
though.

A complication in all this is that Python's cyclic gc never calls tp_dealloc
or tp_free directly!  The only cleanup slot it calls directly is tp_clear.
Deallocations still occur only as side effects of refcounts falling to 0, as
tp_clear actions break cycles (and execute Py_DECREFs along the way).

This protects against a class's tp_dealloc (but not tp_clear) getting called
while instances still exist, even if they're all in one cycle.  But "still
exist" gets fuzzy then.

Here's a cute one:

"""
class C(object):
    pass

def pp():
    import winsound
    winsound.Beep(2000, 500)

import weakref
wr = weakref.ref(C, lambda ignore, pp=pp: pp())
del C  # this isn't enough to free C:  C is still in at least two cycles
"""

C:\Python23>python temp5.py
Fatal Python error: Interpreter not initialized (version mismatch?)

abnormal program termination

C:\Python23>

That one is due to the weakref callback getting called after Py_Finalize
does

	initialized = 0;

so that the "import winsound" fails (I gave up trying to print things in
callbacks <wink/sigh>).
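
The "del C isn't enough" comment can be watched in isolation, away from shutdown.  What follows is a minimal sketch assuming CPython with its cyclic gc; the function name class_survives_del and the "fired" list are invented for the illustration.  It disables automatic collection so that the only collection is the explicit one, then reports whether the class outlived "del C", whether it was gone after gc.collect(), and whether the weakref callback ran.

```python
import gc
import weakref


def class_survives_del():
    # Hypothetical helper; assumes CPython (refcounting plus cyclic gc).
    was_enabled = gc.isenabled()
    gc.disable()  # keep automatic collections out of the picture
    try:
        fired = []

        class C(object):
            pass

        # The callback only records that it ran; it never touches C.
        wr = weakref.ref(C, lambda ref: fired.append(True))

        del C  # drops the local name, but C still sits in cycles
        alive_after_del = wr() is not None

        gc.collect()  # cyclic gc breaks the cycles; refcounts then hit 0
        dead_after_collect = wr() is None

        return alive_after_del, dead_after_collect, bool(fired)
    finally:
        if was_enabled:
            gc.enable()


if __name__ == "__main__":
    print(class_survives_del())
```

Here the callback runs while the interpreter is still fully alive, so it is harmless.  The fatal error above arises from the same callback firing after Py_Finalize has already cleared its initialized flag.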