[Python-bugs-list] [ python-Bugs-742911 ] Memory fault on complex weakref/weakkeydict delete

SourceForge.net noreply@sourceforge.net
Wed, 28 May 2003 15:16:45 -0700


Bugs item #742911, was opened at 2003-05-24 18:29
Message generated for change (Comment added) made by gvanrossum
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=742911&group_id=5470

Category: Python Interpreter Core
Group: Python 2.2.2
Status: Open
Resolution: None
Priority: 5
Submitted By: Mike C. Fletcher (mcfletch)
Assigned to: Guido van Rossum (gvanrossum)
Summary: Memory fault on complex weakref/weakkeydict delete

Initial Comment:
Attached find two modules which together form a
test-case.  The cache.py file is ripped out of a
production system (OpenGLContext), and I am seeing
memory faults under both Python 2.2.2 and 2.2.3 when I
run the code.  Under 2.2.2 while single-stepping
through the code I was able to provoke an error-message:

Fatal Python error: GC object already in linked list

The error message doesn't show up under 2.2.3, but the
memory-fault does.

Modules here don't use any extension modules, so there
shouldn't be any loose memory references or the like. 
Note, you'll likely need to make WeakKeyDictionary's
__delitem__ use keys instead of iterkeys to even get to
the crash.
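
For context, the difference between the two spellings is
whether the loop walks the live dict or a snapshot of its
keys. A minimal sketch in Python 3 terms, assuming that
iterating the dict directly stands in for iterkeys() and
list(d) for keys(). The data dict is made up for
illustration and is not the WeakKeyDictionary internals:

```python
data = {"a": 1, "b": 2, "c": 3}

# Walking the live dict while deleting from it (the iterkeys() analogue).
live = dict(data)
try:
    for key in live:
        del live[key]
    live_result = "completed"
except RuntimeError:
    # The dict notices that its size changed under the iterator.
    live_result = "RuntimeError"

# Walking a snapshot of the keys (the keys() analogue).
snap = dict(data)
for key in list(snap):
    del snap[key]

print(live_result, len(live), snap)
```

The snapshot loop empties the dict, while the live loop
stops after the first deletion.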

----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2003-05-28 18:16

Message:
Logged In: YES 
user_id=6380

Tim, let's look at this when you're back in the office. My
head spins from just reading the analysis below.

Note that this is a 2.2 and 2.3 bug. I don't necessarily
want to hold up the 2.2.3 release until this is fixed,
unless we have a quick breakthrough.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-05-25 00:49

Message:
Logged In: YES 
user_id=31435

Assigned to Guido, because I suspect the problem lies in 
the order subtype_dealloc does things.

With reference to Neal's whittled-down case:  when 
makeSome() ends, we decref i and then decref item.  item's 
refcount hits 0 then.  There's a weakref remaining to item 
(in CacheHolder.client), but subtype_dealloc doesn't clear 
the weakref at this point.  First it clears item's instance 
dict.  That contains the last strong reference to i.  
subtype_dealloc is invoked again, and clears i's instance 
dict, and then deals with the weak reference to i.  The 
weakref to i has a callback associated with it, and 
CacheHolder.__call__() is invoked.  That invokes 
self.client(), still a weakref to item, and because item's 
weakrefs still haven't been dealt with, self.client() returns 
item.

Now we're hosed.  item *had* a refcount of 0 at this point, 
and is still in the process of getting cleaned out by the first 
call to subtype_dealloc (it still thinks it's clearing item's 
instance dict).  We already called _Py_ForgetReference on 
item when its refcount fell to 0.  Its refcount gets boosted 
back to 1 by virtue of item getting returned by the 
self.client() weakref call.  Cleaning out the frame for 
CacheHolder.__call__() knocks the refcount down to 0 
again, and the second attempt to call _Py_ForgetReference 
on it blows up.

In a release build, nothing visibly bad happens when I try 
it.  It crashes if I add

        print client.items

at the end of __call__ in a release-build run, though.  Looks 
like that's just the luck of the draw after things have gone 
insane.

I note that instance_dealloc deals with weakrefs much 
earlier in the process, so that when Neal makes Items a 
classic class instead, the eventual call to self.client() 
returns None (instead of item), and nothing bad happens.
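
A minimal sketch of the shape described above, written for
a current CPython 3 rather than 2.2/2.3. The attached test
file is not reproduced here, so everything beyond the names
Items, CacheHolder, client and makeSome (spelled make_some
below) is an assumption, including the seen list used to
record what the callback observes:

```python
import weakref


class Items(object):
    """Plain instance with a __dict__, standing in for the test case's class."""


class CacheHolder(object):
    def __init__(self, client, key, seen):
        self.client = weakref.ref(client)   # weakref to item, no callback
        self.key = weakref.ref(key, self)   # weakref to i; this holder is the callback
        self.seen = seen                    # hypothetical log, not in the original

    def __call__(self, ref):
        # Runs while i is being torn down; record what the client weakref returns.
        self.seen.append(self.client())


def make_some(seen):
    item = Items()
    i = Items()
    item.i = i        # item's instance dict holds a strong reference to i
    holder = CacheHolder(item, i, seen)
    del i             # decref i: still alive through item's instance dict
    del item          # decref item: refcount hits 0 and teardown starts
    return holder     # keep the holder alive so the callback can fire


seen = []
holder = make_some(seen)
print(seen)
```

On a current CPython this should leave seen equal to
[None]: the weakref to item is already dead by the time the
callback runs, which matches the harmless classic-class
behaviour. Implementations without reference counting may
run the callback later.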

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-05-24 22:53

Message:
Logged In: YES 
user_id=31435

Outstanding, Neal -- thanks!  I can confirm that this crashes 
in a current 2.3 debug build on Windows too.  I'm feeling 
sick and won't pursue it now, though.  When cleaning up 
from the call to makeSome() (the body of makeSome() has 
completed, and we're cleaning up its execution frame, 
decref'ing the locals), we're dying in _Py_ForgetReference 
with a NULL-pointer dereference.  The refcount on an Items 
object has just fallen to 0, and the code is trying to verify 
that the object is in the debug-build "list of all objects".  But 
its prev and next pointers are both NULL -- it's not in the 
list, and simply trying to check that it is blows up.
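
A toy Python model of that failure mode, for illustration
only. It is not CPython's C code; the Node and AllObjects
names are made up, and None plays the role of a NULL
pointer:

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.prev = None
        self.next = None


class AllObjects:
    """Circular doubly linked list with a sentinel head (toy model)."""

    def __init__(self):
        self.head = Node("<head>")
        self.head.prev = self.head.next = self.head

    def add(self, node):
        # Link the node in right after the sentinel.
        node.next = self.head.next
        node.prev = self.head
        self.head.next.prev = node
        self.head.next = node

    def forget(self, node):
        # Membership check: both neighbours must point back at the node.
        if node.prev.next is not node or node.next.prev is not node:
            raise RuntimeError("object not in list")
        node.prev.next = node.next
        node.next.prev = node.prev
        node.prev = node.next = None   # unlinked nodes carry "NULL" pointers


objs = AllObjects()
item = Node("item")
objs.add(item)
objs.forget(item)          # first forget: fine

try:
    objs.forget(item)      # second forget: the check itself touches None
    second_forget = "ok"
except AttributeError:
    second_forget = "blew up"

print(second_forget)
```

The second forget never reaches the "not in list" error,
because evaluating the check already dereferences the
cleared prev pointer.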

I don't have a theory for what's causing this, but it's 
probably not a good thing <heh>.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-05-24 19:31

Message:
Logged In: YES 
user_id=33168

I cut out a lot of stuff from the test.  The new file is
much more minimal, but still crashes for me in a 2.3 debug
build.  You only need the one file too (not both files).

There is an issue with new style classes.  If Items doesn't
derive from object, I don't get a crash.

----------------------------------------------------------------------
