[Numpy-discussion] How to debug reference counting errors
ondrej.certik at gmail.com
Fri Aug 31 20:35:49 EDT 2012
On Fri, Aug 31, 2012 at 4:22 AM, Dag Sverre Seljebotn
<d.s.seljebotn at astro.uio.no> wrote:
> On 08/31/2012 09:03 AM, Ondřej Čertík wrote:
>> There is a segfault reported here:
>> I've managed to isolate the problem and even provide a simple patch,
>> that fixes it here:
>> however, the patch simply skips the reference count decrement, so it
>> might leak. I used
>> bisection (it took the whole evening, unfortunately...) but the good news
>> is that I've isolated the commits
>> that actually broke it. See the GitHub issue #398 for details, diffs etc.
>> Unfortunately, it's 12 commits from Mark, and the individual commits
>> raise an exception on the segfaulting code,
>> so I can't pinpoint the problem further.
>> In general, how can I debug this sort of problem? I tried to use
>> valgrind, with a debugging build of numpy,
>> but it provides tons of false (?) positives: https://gist.github.com/3549063
>> Mark, by looking at the changes that broke it, as well as at my "fix",
>> do you see where the problem could be?
>> I suspect it is something with the changes in PyArray_FromAny() or
>> PyArray_FromArray() in ctors.c.
>> But I don't see anything so far that could cause it.
>> Thanks for any help. This is one of the issues blocking the 1.7.0 release.
> IIRC you can recompile Python with some support for detecting memory
> leaks. One of the issues with using Valgrind, after suppressing the
> false positives, is that Python uses its own memory allocator so that
> sits between the bug and what Valgrind detects. So at least recompile
> Python to not do that.
Right. Compiling with "--without-pymalloc" (per README.valgrind as suggested
above by Richard) should improve things a lot. Thanks for the tip.
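
Independently of Valgrind, a cheap first check for this kind of bug is to
watch an object's reference count from Python while calling the suspect
function in a loop. The following is only a minimal sketch: `leaky` and
`clean` are hypothetical stand-ins for an extension function under test, and
`refcount_delta` is a helper name made up for this illustration. A positive
delta suggests a missing decrement (a leak); a negative delta suggests an
extra decrement, which is the kind of error that can end in a segfault.

```python
import sys


def refcount_delta(func, obj, repeats=1000):
    """Return the change in obj's refcount after calling func(obj) repeatedly."""
    func(obj)  # warm-up call so one-time caching is not mistaken for a leak
    before = sys.getrefcount(obj)
    for _ in range(repeats):
        func(obj)
    after = sys.getrefcount(obj)
    return after - before


# Hypothetical stand-ins for the C function being investigated.
_kept = []


def leaky(obj):
    # Keeps one extra reference per call, like a missing Py_DECREF.
    _kept.append(obj)


def clean(obj):
    # Borrows the argument and releases everything it acquired.
    return len(obj)


if __name__ == "__main__":
    target = ["some", "object"]
    print("clean delta:", refcount_delta(clean, target))
    print("leaky delta:", refcount_delta(leaky, target))
    # sys.gettotalrefcount() exists only in debug builds of CPython
    # (configured with --with-pydebug), so guard the call.
    if hasattr(sys, "gettotalrefcount"):
        print("total refcount:", sys.gettotalrefcount())
```

This only narrows down which call misbehaves; it does not say which line of C
is at fault, so it complements a Valgrind run rather than replacing it.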
> As for hardening the NumPy source in general, you should at least be
> aware of these two options:
> 1) David Malcolm (dmalcolm at redhat.com) was writing a static code
> analysis plugin for gcc that would check, for every routine, that the
> reference count semantics are correct. (I don't know how far he's got
> with that.)
> 2) In Cython we have a "reference count nanny". This requires changes to
> all the code though, so not an option just for finding this bug, just
> thought I'd mention it. In addition to the INCREF/DECREF you need to
> insert new "GIVEREF" and "GOTREF" calls (which are noops in a normal
> compile) to declare where you get and give away a reference. When
> Cython-generated sources are enabled with -DCYTHON_REFNANNY,
> INCREF/DECREF/GIVEREF/GOTREF are tracked within each function and a
> failure is raised if the function violates any contract.
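
The contract described in 2) can be illustrated with a toy model. This is a
sketch only and not Cython's actual refnanny: the class `ToyRefNanny`, its
`finish` method and the string names used for objects are all assumptions
made up for this illustration. It keeps a per-function ledger in which
GOTREF and INCREF each record one owned reference, DECREF and GIVEREF each
release one, and the function must own nothing when it finishes.

```python
class RefNannyError(Exception):
    """Raised when a function violates its reference ownership contract."""


class ToyRefNanny:
    """Toy per-function ledger of owned references (illustration only)."""

    def __init__(self, func_name):
        self.func_name = func_name
        self.owned = {}  # object name -> references owned by this function

    def _acquire(self, name):
        self.owned[name] = self.owned.get(name, 0) + 1

    def _release(self, name, action):
        if self.owned.get(name, 0) <= 0:
            raise RefNannyError(
                f"{self.func_name}: {action} of {name!r} without an owned reference"
            )
        self.owned[name] -= 1

    def GOTREF(self, name):
        self._acquire(name)  # declare a reference received from elsewhere

    def INCREF(self, name):
        self._acquire(name)  # a new reference created by this function

    def DECREF(self, name):
        self._release(name, "DECREF")

    def GIVEREF(self, name):
        self._release(name, "GIVEREF")  # ownership handed to someone else

    def finish(self):
        leaked = {n: c for n, c in self.owned.items() if c > 0}
        if leaked:
            raise RefNannyError(f"{self.func_name}: leaked references {leaked}")


def demo_over_decref():
    """Return the error message produced by one DECREF too many."""
    nanny = ToyRefNanny("demo_over_decref")
    nanny.GOTREF("tmp")
    nanny.DECREF("tmp")
    try:
        nanny.DECREF("tmp")  # second release of a reference owned only once
    except RefNannyError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(demo_over_decref())
```

The point of checking per function is that an extra DECREF is reported where
it happens, instead of surfacing later as a crash somewhere unrelated.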
I see. That's a nice option. For my own code, I never touch the
by hand and rather just use Cython.
In the meantime, Mark fixed it:
Mark, thanks again for this. That saved me a lot of time.