gc assertion failure
jmiller at stsci.edu
Wed Oct 29 23:27:50 CET 2003
Todd Miller wrote:
> Tim Peters wrote:
>> [Todd Miller]
>>> I recently discovered an assertion failure in the Python garbage
>>> collection system when scripts using our C extension (numarray) exit.
>>> The assertion is activated for Pythons configured using
>>> --with-pydebug. I have a feeling I may be doing something wrong
>>> with garbage collection support for some of our c types, but I'm not
>>> sure exactly what.
>>> Here is the assertion output:
>>> python: Modules/gcmodule.c:231: visit_decref: Assertion
>>> `gc->gc.gc_refs != 0' failed.
>>> Abort (core dumped)
>> Looking at the source code should clarify:
>> assert(gc->gc.gc_refs != 0); /* else refcount was too small */
>> That is, gc found more pointers to an object than that object's refcount
>> believes exists. A missing Py_INCREF or an extra Py_DECREF are plausible
>> causes; so is a bad tp_traverse function that passes a single containee
>> multiple times (although I've only seen that once in real life). A
>> missing Py_INCREF is (IME) the most common cause for this assertion.
>>> #5 0x080e9222 in visit_decref (op=0x405adc74, data=0x0) at
>>> #6 0x0808cebf in tupletraverse (o=0x40a62f74, visit=0x80e9194
>>> <visit_decref>, arg=0x0) at Objects/tupleobject.c:398
>> So it's complaining about an object that happens to be in a tuple.
>> More info about op would tell you more about the kind of object it's
>> complaining about.
> Thanks Tim! It turns out to be one of the objects numarray uses to
> represent data type, Int64. I also noticed that the problem goes away
> when I switch on the "Python prototype" for some C code, which is
> further evidence that the problem is a ref count error since the code in
> question just touches type objects, it doesn't implement them.
> I haven't found the bug yet, but I'm out of wheel lock. Definitely
> makes my day...
FWIW, here's what my bug looked like:
< key = Py_BuildValue("(NNsNN)", _digest(in1), _digest(out),
<                     cumop, thread_id, type);
> key = Py_BuildValue("(NNsNO)", _digest(in1), _digest(out),
>                     cumop, thread_id, type);
Since I used "N" for type in the Py_BuildValue call, it stole a reference
to type, which it shouldn't have. Switching to "O" made the Py_BuildValue
call reference count neutral for type and the problem was solved.
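
For anyone hitting the same assertion, here is a toy model in plain C of why a stolen reference trips the check Tim quoted. It is only a sketch and does not use the real Python headers: ToyObject, ToyTuple, toy_build, toy_visit_decref and toy_gc_check_passes are all made-up names, and the collector is reduced to "copy each refcount into gc_refs, then subtract one for every pointer found while traversing containers". It assumes type has exactly one real owner (called module_slot here) besides the key tuple.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a refcounted object; NOT CPython's PyObject. */
typedef struct {
    int refcnt;   /* how many references the object believes exist */
    int gc_refs;  /* scratch copy of refcnt used by the collector pass */
} ToyObject;

/* Toy container holding a single item pointer. */
typedef struct {
    ToyObject *item;
} ToyTuple;

/* Mimics the "O" vs "N" distinction: 'O' takes its own reference,
   'N' takes over the caller's reference without incrementing. */
static ToyTuple toy_build(char format, ToyObject *obj)
{
    ToyTuple t = { obj };
    if (format == 'O') {
        obj->refcnt++;
    }
    return t;
}

/* Mimics the check in visit_decref: returns false where the real
   collector would fail the gc_refs != 0 assertion. */
static bool toy_visit_decref(ToyObject *obj)
{
    if (obj->gc_refs == 0) {
        return false;          /* found more pointers than refcnt admits */
    }
    obj->gc_refs--;
    return true;
}

/* Builds the key tuple with the given format code, then runs the toy
   collector pass over both containers that point at `type`. */
bool toy_gc_check_passes(char format)
{
    ToyObject type = { 1, 0 };            /* one reference, owned by module_slot */
    ToyTuple module_slot = { &type };     /* hypothetical real owner */
    ToyTuple key = toy_build(format, &type);

    type.gc_refs = type.refcnt;           /* copy refcount into gc_refs */

    /* Subtract one for each container pointer the traversal finds. */
    bool ok = toy_visit_decref(module_slot.item);
    ok = toy_visit_decref(key.item) && ok;
    return ok;
}
```

With 'O' the object ends up with a refcount of 2 and two pointers to it, so toy_gc_check_passes('O') returns true. With 'N' the refcount stays at 1 while two containers point at the object, so the second visit finds gc_refs already at zero and toy_gc_check_passes('N') returns false, which is the situation the real assertion reports.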
Thanks for the help,