A memory leak involving builtin/pure-python inheritance.

nejucomo at gmail.com nejucomo at gmail.com
Tue Nov 14 01:58:07 EST 2006


Hi folks,


Quick Synopsis:

A test script demonstrates a memory leak when I use pure-python
subclasses of my builtin (C extension) types, but if I use the builtin
types themselves there is no memory leak.

If you are interested in how builtin/pure-python inheritance interacts
with the gc, maybe you can help me fix my memory leak.


Background:

I'm creating an extension module (see [1]) which implements the LADSPA
host interface (see [2]).  A similar project appears to do the same
using pyrex (see [3]), but I have not investigated it yet.

I've discovered a memory leak and I'm fairly certain where the culprit
reference cycle is.  The problem is that I am not sure how to instruct
the gc to handle it, even after reading about supporting cyclic garbage
collection in builtin containers (see [4]).


My question:

How do I correctly implement a builtin type and its pythonic extension
type such that reference cycles are detected and collected?


Details:

There are three relevant LADSPA types:

A "handle" is an audio processing module which operates on input and
output streams.

A "descriptor" describes the interface to a handle, such as the types
of the I/O streams and their names and descriptions.  A handle is an
instantiation of a particular descriptor.

A "plugin" is a container providing zero or more descriptors which can
be dynamically loaded.  (Typically a shared object library that the
LADSPA host loads at runtime.)


The python-ladspa package is designed as follows:

There is a low-level builtin extension module, called "_ladspa", and a
high-level interface module, called "ladspa".

For each of the three LADSPA types, there is an extension type in the
"_ladspa" module, for example: "_ladspa.Descriptor".

For each extension type there is a high-level subtype in the "ladspa"
module.  So "ladspa.Handle" inherits from "_ladspa.Handle".


The reference structure of the "_ladspa" module is tree-like with
no cycles.  This is because a handle has a single reference to its
descriptor, and a descriptor has a single reference to its plugin.

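However, the high-level interface introduces a reference cycle: the
"ladspa.Plugin" class keeps a "Descriptors" list, while each descriptor
still holds a reference to its plugin.  I did this because it makes the
interface more natural, IMO.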
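
A minimal pure-python sketch of that reference structure follows.  The
class names are hypothetical stand-ins, not the real "_ladspa" or
"ladspa" types, and the "descriptors" attribute is an assumption about
how the high-level plugin holds its descriptors.  Because every class
here is pure python, the gc tracks the instances automatically, so this
only shows the shape of the cycle and what a successful collection
looks like.

```python
import gc
import weakref


class BasePlugin:
    """Hypothetical stand-in for _ladspa.Plugin: holds no back-references."""


class BaseDescriptor:
    """Hypothetical stand-in for _ladspa.Descriptor: one reference, to its plugin."""

    def __init__(self, plugin):
        self.plugin = plugin


class Plugin(BasePlugin):
    """Hypothetical stand-in for ladspa.Plugin: the list below closes the cycle."""

    def __init__(self, count):
        # plugin -> descriptors list -> descriptor -> plugin
        self.descriptors = [BaseDescriptor(self) for _ in range(count)]


def cycle_is_collected():
    """Return (alive after del, alive after gc.collect()) for one plugin."""
    gc.disable()  # keep automatic collections from interfering
    try:
        plugin = Plugin(3)
        probe = weakref.ref(plugin)
        del plugin
        alive_before = probe() is not None  # refcounting alone cannot free a cycle
        gc.collect()
        alive_after = probe() is not None   # the cycle collector should free it
        return alive_before, alive_after
    finally:
        gc.enable()


print(cycle_is_collected())  # (True, False) on CPython
```

If the same probe is run against a subclass of a real extension type
and the second value stays True, the collector is not reaching the
cycle.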


I've created a diagram which attempts to represent the inheritance
relationships as well as the reference structure of this wrapper (see
[5]).  Let me know if you find it clarifies things or needs
improvement.


There is a test script, "memleak.py", which tests either module by
repeatedly instantiating and discarding references to handles (see
[6]).
When configured to use the "_ladspa" module, there appears to be no
memory usage growth, but if using the "ladspa" module, memory grows
linearly with the number of iterations.
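
The general shape of such a test can be sketched as below.  This is not
the contents of "memleak.py"; the function name and the "make_handle"
parameter are hypothetical.  It counts only objects that the gc tracks,
so memory leaked by untracked C objects would still have to be watched
at the process level.

```python
import gc


def tracked_growth(make_handle, iterations=1000):
    """Return the change in the number of gc-tracked objects after
    creating and immediately discarding `iterations` handles."""
    gc.collect()
    before = len(gc.get_objects())
    for _ in range(iterations):
        make_handle()  # the result is dropped at once
    gc.collect()
    return len(gc.get_objects()) - before


class FakeHandle:
    """Hypothetical stand-in for a handle type."""


# A value that scales with `iterations` suggests objects are being kept alive.
print(tracked_growth(FakeHandle))
```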

If I comment out the "Descriptors" list in the "ladspa.Plugin" class
(which removes the reference cycle) then "memleak.py" runs with no
apparent memory leak.

The leak persists even if I implement traverse and clear methods for
all three builtin types, unless I've done this incorrectly.
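
One way to check from python whether a type takes part in cycle
collection at all is sketched below.  It assumes a recent CPython:
"gc.is_tracked" is newer than the interpreter versions current when
this was written, and the flag constant is copied from CPython's
headers rather than exported by any module.  The helper name is
hypothetical.

```python
import gc

# Value of Py_TPFLAGS_HAVE_GC in CPython's object.h (an assumption that
# holds for CPython; other implementations may differ).
PY_TPFLAGS_HAVE_GC = 1 << 14


def gc_report(obj):
    """Return (type declares GC support, this instance is tracked)."""
    has_flag = bool(type(obj).__flags__ & PY_TPFLAGS_HAVE_GC)
    return has_flag, gc.is_tracked(obj)


print(gc_report([]))  # a list is a gc-aware container
print(gc_report(0))   # an int cannot hold references, so it is not tracked
```

An instance of an extension type that holds references to other
objects but reports False here would be invisible to the collector,
whatever its traverse and clear methods do.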


Of course one solution is to do away with the reference cycle.  After
all, the _ladspa extension does not have the cycle and is usable.
However, I care more about a user-friendly interface (which I believe
the reference cycle provides) and also I'm just curious.


Thanks for any help,
Nejucomo


References:
[1] http://sourceforge.net/projects/python-ladspa/
[2] http://www.ladspa.org/
[3] http://sourceforge.net/projects/dsptools/
[4] http://www.python.org/doc/2.3.5/ext/node24.html
[5] http://python-ladspa.svn.sourceforge.net/viewvc/python-ladspa/doc/refgraph.png?revision=41&view=markup
[6] http://python-ladspa.svn.sourceforge.net/viewvc/python-ladspa/test/memleak.py?revision=37&view=markup



