[Python-Dev] RE: CVS Python is unstable

Fred L. Drake, Jr. fdrake@acm.org
Fri, 23 Mar 2001 01:50:21 -0500 (EST)


Tim Peters writes:
 > That is, in debug mode, the prev and next fields are nulled out, but not in
 > release mode.
 > 
 > Whenever this thing dies, the node passed in has prev and next fields that
 > *are* nulled out.  Since under MS debug mode, freed memory is set to a very
 > distinctive non-null bit pattern, this tells me that-- most likely --some
 > single node is getting passed to gc_list_remove *twice*.
 > 
 > I bet that's happening in release mode too ... hang on a second ... yup!  If
 > I remove the #ifdef above, then the pair test_weakref test_xmllib dies with a
 > null-pointer error here under the release build too.

  Ok, I've been trying to keep up with all this, and playing with some
alternate patches.  The change that's been identified as causing the
problem was trying to remove the weak ref from the cycle detector's set
of known containers as soon as the ref object was no longer a
container.  Doing this from the tp_clear handler may be the problem:
the GC machinery removes the object from the list itself, calling
gc_list_remove() on the assumption that the object is still in the
list, but it does so after the tp_clear handler has been called.
  I see a couple of options:

  - Document the restriction that PyObject_GC_Fini() should not be
    called on an object while its tp_clear handler is active (more
    efficient), -or-
  - Remove the restriction (safer).
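
  The second option could look roughly like the sketch below.  This is
a simplified model, not the collector's actual code: gc_node stands in
for the real list header, and gc_list_init(), gc_list_append() and
gc_list_remove_safe() are hypothetical names.  It assumes that removal
always nulls out the links (as the debug build described above does),
so a node with null links can be recognized as already removed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the collector's list header. */
typedef struct gc_node {
    struct gc_node *prev;
    struct gc_node *next;
} gc_node;

/* Turn a sentinel head into an empty circular list. */
void gc_list_init(gc_node *head)
{
    head->prev = head;
    head->next = head;
}

/* Link node in at the tail, just before the sentinel. */
void gc_list_append(gc_node *node, gc_node *head)
{
    node->next = head;
    node->prev = head->prev;
    head->prev->next = node;
    head->prev = node;
}

/* Unlink node if it is still linked.  Returns 1 if the node was
   removed by this call, 0 if it was already off the list.  A second
   call on the same node is a harmless no-op instead of a
   null-pointer dereference. */
int gc_list_remove_safe(gc_node *node)
{
    if (node->next == NULL) {
        assert(node->prev == NULL);   /* both links are nulled together */
        return 0;
    }
    node->prev->next = node->next;
    node->next->prev = node->prev;
    node->prev = NULL;
    node->next = NULL;
    return 1;
}
```

  The cost is one extra test per removal, which is the "safer but
less efficient" trade-off between the two options.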

  If we take the former route, I think it is still worth removing the
weakref object from the GC list as soon as it has been cleared, in
order to keep the number of containers the GC machinery has to inspect
to a minimum.  This can be done by adding a flag to
weakref.c:clear_weakref() indicating that the object's tp_clear is
active.  The extra flag would not be needed if we took the second
option.
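
  A minimal sketch of the flag approach, under stated assumptions:
weakref_model, model_gc_init(), model_gc_fini() and the from_tp_clear
parameter are all hypothetical stand-ins, and the "tracked" field
merely models membership in the collector's list.  The real
clear_weakref() does more than this.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a weakref object. */
typedef struct weakref_model {
    void *callback;   /* stand-in for the callback object, or NULL */
    int tracked;      /* 1 while on the collector's container list */
} weakref_model;

/* Stand-in for PyObject_GC_Init(): add to the container list. */
void model_gc_init(weakref_model *w)
{
    assert(!w->tracked);
    w->tracked = 1;
}

/* Stand-in for PyObject_GC_Fini(): remove from the container list.
   The assert models the crash caused by removing a node twice. */
void model_gc_fini(weakref_model *w)
{
    assert(w->tracked);
    w->tracked = 0;
}

/* from_tp_clear is the extra flag: nonzero means the collector is
   running our tp_clear handler and will unlink the object itself. */
void clear_weakref_model(weakref_model *w, int from_tp_clear)
{
    w->callback = NULL;          /* no longer a container */
    if (!from_tp_clear)
        model_gc_fini(w);        /* aggressive removal is safe here */
}

/* What the tp_clear slot would do in this model. */
int weakref_model_tp_clear(weakref_model *w)
{
    clear_weakref_model(w, 1);
    return 0;
}
```

  In this model, an ordinary clear takes the object off the list at
once, while a clear driven by tp_clear leaves the unlinking to the
collector, so the node is only ever removed once.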
  Another possibility, if I do adjust the code to remove the weakref
objects from the GC list aggressively, is to only call
PyObject_GC_Init() if the weakref actually has a callback -- if there
is no callback, the weakref object does not act as a container to
begin with.
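
  As a rough illustration of that idea (again a model with
hypothetical names, not the weakref implementation): the constructor
registers the object with the collector only when a callback is
supplied.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a weakref object. */
typedef struct weakref_model {
    void *callback;   /* stand-in for the callback object, or NULL */
    int tracked;      /* 1 while on the collector's container list */
} weakref_model;

/* Stand-in for PyObject_GC_Init(): add to the container list. */
void model_gc_init(weakref_model *w)
{
    assert(!w->tracked);
    w->tracked = 1;
}

/* Initialize a weakref; track it only if it holds a callback, since
   without one it holds nothing the collector needs to traverse. */
void init_weakref_model(weakref_model *w, void *callback)
{
    w->callback = callback;
    w->tracked = 0;
    if (callback != NULL)
        model_gc_init(w);
}
```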
  (It is also possible that with aggressive removal of the weakref
object from the set of containers, it doesn't need to implement the
tp_clear handler at all, in which case this gets just a little bit
nicer.)


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations