[Python-Dev] Valgrind on 2.2.2

Guido van Rossum guido@python.org
Mon, 28 Oct 2002 20:15:44 -0500


>   This should help shed some light on the situation:

Thanks; this indeed helps.

> Quoth the docs:
> 
> """
> For each such block, Valgrind scans the entire address space of the
> process, looking for pointers to the block. One of three situations
> may result:
> 
>   * A pointer to the start of the block is found. This usually
>   indicates programming sloppiness; since the block is still pointed
>   at, the programmer could, at least in principle, have free'd it before
>   program exit.

Booh again.  Lots of globals get initialized with pointers to
malloc'ed blocks that are never freed.  These are never called "leaks"
in other leak detectors, just "alive at exit".  I think valgrind
actually doesn't call these leaks either.

>   * A pointer to the interior of the block is found. The pointer
>   might originally have pointed to the start and have been moved
>   along, or it might be entirely unrelated. Valgrind deems such a
>   block as "dubious", that is, possibly leaked, because it's unclear
>   whether or not a pointer to it still exists.

Aha!  This may be the case.  When an object has a GC header, all
pointers to the object point to an address 12 bytes into the block,
which is where the "object" layout begins.  Normally, there should be
at least one pointer to the start of the block from one of the GC
chains, but objects don't have to be in a chain at all.
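
A minimal sketch of the interior-pointer effect described above.  The
names toy_gc_head, toy_object, toy_gc_new, toy_as_gc and toy_gc_del
are hypothetical stand-ins, not CPython's real definitions; the header
here (two pointers and an int) happens to be 12 bytes on a typical
32-bit platform, but that layout is an assumption made for
illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical GC header that sits in front of every tracked object. */
typedef struct toy_gc_head {
    struct toy_gc_head *next;   /* chain links; NULL when unchained */
    struct toy_gc_head *prev;
    int refs;
} toy_gc_head;

/* Hypothetical object layout that the rest of the program sees. */
typedef struct {
    long refcnt;
    int value;
} toy_object;

/* One malloc block holds header + object.  The caller only ever gets
 * a pointer to the object part, i.e. into the interior of the block. */
toy_object *toy_gc_new(int value)
{
    toy_gc_head *g = malloc(sizeof(toy_gc_head) + sizeof(toy_object));
    if (g == NULL)
        return NULL;
    g->next = g->prev = NULL;   /* not linked into any chain */
    g->refs = 0;

    toy_object *op = (toy_object *)(g + 1);
    op->refcnt = 1;
    op->value = value;
    return op;
}

/* Step back from the object pointer to the start of the block. */
toy_gc_head *toy_as_gc(toy_object *op)
{
    return (toy_gc_head *)op - 1;
}

/* free() needs the block start, not the interior pointer. */
void toy_gc_del(toy_object *op)
{
    free(toy_as_gc(op));
}
```

With an object made this way and not linked into a chain, the only
pointers a scanner can find are interior ones, so a tool that looks
for pointers to the block start would have to report the block as
"possibly lost" even though it is still in use.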

(I wonder if pymalloc adds to the confusion, since its arenas count as
a single block to malloc and hence to valgrind, but are internally cut
up into many objects.)
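
To illustrate the arena point, here is a rough sketch of an allocator
that takes one block from malloc and hands out slices of it.  This is
not pymalloc's algorithm; toy_arena, TOY_ARENA_SIZE, TOY_SLOT_SIZE and
the functions below are all hypothetical, chosen only to show why
malloc (and therefore a malloc-level tool) sees one block while the
program sees many objects.

```c
#include <stddef.h>
#include <stdlib.h>

#define TOY_ARENA_SIZE 4096   /* assumed size, for illustration only */
#define TOY_SLOT_SIZE  32     /* assumed fixed object size */

typedef struct {
    char *base;     /* the one pointer to the start of the malloc block */
    size_t used;    /* bytes handed out so far */
} toy_arena;

/* Returns 1 on success, 0 if malloc failed. */
int toy_arena_init(toy_arena *a)
{
    a->base = malloc(TOY_ARENA_SIZE);
    a->used = 0;
    return a->base != NULL;
}

/* Hand out the next fixed-size slot, or NULL when the arena is full.
 * Every slot except the first is an interior pointer as far as
 * malloc is concerned. */
void *toy_arena_alloc(toy_arena *a)
{
    if (a->used + TOY_SLOT_SIZE > TOY_ARENA_SIZE)
        return NULL;
    void *p = a->base + a->used;
    a->used += TOY_SLOT_SIZE;
    return p;
}

/* Individual slots are never passed to free(); only the arena is. */
void toy_arena_destroy(toy_arena *a)
{
    free(a->base);
    a->base = NULL;
    a->used = 0;
}
```

A leak of one slot inside such an arena is invisible at the malloc
level as long as the arena itself is still reachable.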

>   * The worst outcome is that no pointer to the block can be
>   found. The block is classified as "leaked", because the programmer
>   could not possibly have free'd it at program exit, since no pointer
>   to it exists. This might be a symptom of having lost the pointer at
>   some earlier point in the program.

This is a true leak.

> """
> 
>   Possibly lost is the second case and definitely lost is the third case.
> The definitely lost, in my experience, tends to mean you just forgot
> to free a pointer. The possibly lost usually means that some memory
> rot occurred, where it's not clear which pointer is causing the mem
> leak.

How much Python extension coding (in C) have you done?  In Python, it
almost never is a matter of forgetting to free() -- it's usually a
matter of forgetting to DECREF, and sometimes a matter of doing an
unnecessary INCREF.
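
The two mistakes can be sketched with a toy reference count in plain
C.  Nothing here is the real Python API: toy_object, TOY_INCREF,
TOY_DECREF, toy_new and the use_* functions are hypothetical, and the
toy_live counter exists only so the effect can be observed.

```c
#include <stdlib.h>

typedef struct {
    long refcnt;
} toy_object;

long toy_live = 0;   /* objects allocated and not yet freed */

#define TOY_INCREF(op) ((op)->refcnt++)
#define TOY_DECREF(op)                  \
    do {                                \
        if (--(op)->refcnt == 0) {      \
            free(op);                   \
            toy_live--;                 \
        }                               \
    } while (0)

/* Returns a new reference (refcnt == 1), or NULL on failure. */
toy_object *toy_new(void)
{
    toy_object *op = malloc(sizeof *op);
    if (op == NULL)
        return NULL;
    op->refcnt = 1;
    toy_live++;
    return op;
}

/* Correct: the INCREF taken for local use is released again. */
void use_balanced(toy_object *op)
{
    TOY_INCREF(op);
    /* ... work with op ... */
    TOY_DECREF(op);
}

/* Bug: forgot the DECREF, so the count never returns to its old
 * value and the object can never be freed. */
void use_forgot_decref(toy_object *op)
{
    TOY_INCREF(op);
    /* ... work with op ... */
}

/* Bug: toy_new() already returned a new reference; the extra INCREF
 * means the caller's single DECREF will not free the object. */
toy_object *make_with_extra_incref(void)
{
    toy_object *op = toy_new();
    if (op == NULL)
        return NULL;
    TOY_INCREF(op);   /* unnecessary */
    return op;
}
```

In both buggy cases nobody forgot to call free(); the count simply
never reaches zero, so free() is never reached.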

--Guido van Rossum (home page: http://www.python.org/~guido/)