A memory leak involving builtin/pure-python inheritance.
nejucomo at gmail.com
Tue Nov 14 07:58:07 CET 2006
A test script demonstrates a memory leak when I use pythonic extensions
of my builtin types, but if I use the builtin types themselves there is
no memory leak.
If you are interested in how builtin/pure-python inheritance interacts
with the gc, maybe you can help me fix my memory leak.
I'm creating an extension module (see ) which implements the LADSPA
host interface (see ). A similar project appears to do the same
using pyrex (see ), but I have not investigated it yet.
I've discovered a memory leak and I'm fairly certain of where
the culprit reference cycle is. The problem is that I am not sure how
to instruct the gc to handle it, even after reading about supporting
builtin containers (see ).
How do I correctly implement a builtin type and its pythonic extension
type such that reference cycles are detected and collected?
There are three relevant LADSPA types:
A "handle" is an audio processing module which operates on input and
output streams.
A "descriptor" describes the interface to a handle such as the types
in the I/O streams, and their names and descriptions. A handle is the
instantiation of a particular descriptor.
A "plugin" is a container providing zero or more descriptors which can
be dynamically loaded. (Typically a shared object library that the
LADSPA host loads at runtime.)
The python-ladspa package is designed as follows:
There is a low-level builtin extension module, called "_ladspa", and a
high-level interface module, called "ladspa".
For each of the three LADSPA types, there is an extension type in the
"_ladspa" module, for example: "_ladspa.Descriptor".
For each extension type there is a high-level subtype in the "ladspa"
module. So "ladspa.Handle" inherits from "_ladspa.Handle".
The reference structure of the "_ladspa" module is tree-like with
no cycles. This is because a handle has a single reference to its
descriptor, and a descriptor has a single reference to its plugin.
However, the high-level interface introduces a reference cycle because
this makes the interface more natural, IMO.
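The shape of that cycle can be sketched in pure Python. This is only an
illustration under assumptions: the classes below are hypothetical stand-ins,
not the real "_ladspa" or "ladspa" types, and the attribute names
("plugin", "descriptors") are invented for the example. With pure-Python
classes the collector does find and free the cycle, which is the behaviour
the extension types would need to match.

```python
import gc
import weakref


class BaseDescriptor:
    """Hypothetical stand-in for _ladspa.Descriptor: refers to its plugin."""

    def __init__(self, plugin, index):
        self.plugin = plugin
        self.index = index


class BasePlugin:
    """Hypothetical stand-in for _ladspa.Plugin: holds no back-references."""

    def __init__(self, path):
        self.path = path


class Descriptor(BaseDescriptor):
    """High-level subtype, analogous to ladspa.Descriptor."""


class Plugin(BasePlugin):
    """High-level subtype whose descriptor list closes the cycle."""

    def __init__(self, path, count):
        super().__init__(path)
        # Cycle: Plugin -> list -> Descriptor -> Plugin
        self.descriptors = [Descriptor(self, i) for i in range(count)]


def cycle_is_collected():
    """Build one plugin, drop it, and report whether the collector frees it."""
    plugin = Plugin("example.so", 2)
    probe = weakref.ref(plugin)
    del plugin          # the cycle keeps the refcount above zero
    gc.collect()        # the cyclic collector has to break it
    return probe() is None
```

Calling cycle_is_collected() on these stand-ins should report that the
plugin was freed, because every object in the cycle is a gc-tracked
pure-Python instance.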
I've created a diagram which attempts to represent the inheritance
relationships as well as the reference structure of this wrapper (see
). Let me know if you find it clarifies things or needs improvement.
There is a test script, "memleak.py", which tests either module by
repeatedly instantiating and discarding references to handles (see ).
When configured to use the "_ladspa" module, there appears to be no
memory usage growth, but if using the "ladspa" module, memory grows
linearly with the number of iterations.
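A rough way to turn "memory grows linearly" into a number is to count the
surviving instances instead of watching process memory. The sketch below is
not memleak.py; the "Node" class and both helper functions are hypothetical,
and it assumes the objects under test are tracked by the collector (objects
the gc does not track will not show up in gc.get_objects() at all).

```python
import gc


class Node:
    """Hypothetical object that refers to itself, forming a one-object cycle."""

    def __init__(self):
        self.me = self


def live_instances(cls):
    """Count objects of exactly type `cls` that the collector tracks."""
    return sum(1 for obj in gc.get_objects() if type(obj) is cls)


def run_iterations(factory, cls, iterations):
    """Create and discard `iterations` objects, then report survivors.

    Returns (survivors before gc.collect(), survivors after gc.collect()).
    """
    was_enabled = gc.isenabled()
    gc.disable()  # keep automatic collection from hiding the cycles
    try:
        for _ in range(iterations):
            factory()  # the result is discarded immediately
        before = live_instances(cls)
        gc.collect()
        after = live_instances(cls)
    finally:
        if was_enabled:
            gc.enable()
    return before, after
```

If the second number is still proportional to the iteration count, the
objects are tracked but the collector cannot prove they are garbage, which
points at the traversal of some object in the cycle.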
If I comment out the "Descriptors" list in the "ladspa.Plugin" class
(which removes the reference cycle) then "memleak.py" runs with no
apparent memory leak.
The leak persists even if I implement traverse and clear methods for
all three builtin types, unless I've done this incorrectly.
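One way to check a traverse implementation from the Python side is
gc.get_referents(), which returns the objects visited by an object's C-level
tp_traverse. If a reference held by a builtin type is not reported there, the
collector cannot see the cycle through it. The helper below is a hypothetical
diagnostic, not part of python-ladspa; it walks those referents and skips
type objects so the search stays within instance data.

```python
import gc


def reaches(start, target, max_depth=6):
    """Return True if `target` is reachable from `start` through the
    references the collector can see, within `max_depth` hops."""
    frontier = [start]
    seen = {id(start)}
    for _ in range(max_depth):
        next_frontier = []
        for obj in frontier:
            for ref in gc.get_referents(obj):
                if ref is target:
                    return True
                # Skip classes: following them would wander into
                # module globals and report unrelated paths.
                if isinstance(ref, type) or id(ref) in seen:
                    continue
                seen.add(id(ref))
                next_frontier.append(ref)
        frontier = next_frontier
    return False
```

For an object that really sits in a cycle, reaches(obj, obj) should be True.
If it comes back False for an instance of the high-level plugin type while
the cycle is known to exist, that would suggest some tp_traverse along the
path is not visiting one of its members.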
Of course one solution is to do away with the reference cycle. After all,
the _ladspa extension does not have the cycle and is usable. However, I
care more about a user-friendly interface (which I believe the
cycle provides) and also I'm just curious.
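For comparison, here is a sketch of one way to keep a friendly lookup
interface without a strong cycle: the plugin caches its descriptors through
weak references, so only the descriptor-to-plugin direction is strong. The
class and method names are hypothetical, and this assumes the descriptor
type supports weak references (Python-level subclasses normally do unless
they define __slots__ without "__weakref__").

```python
import gc
import weakref


class Descriptor:
    """Hypothetical descriptor holding a strong reference to its plugin."""

    def __init__(self, plugin, index):
        self.plugin = plugin  # strong reference, as in the low-level module
        self.index = index


class Plugin:
    """Hypothetical plugin that hands out descriptors on demand."""

    def __init__(self, path, count):
        self.path = path
        self.count = count
        # Weak values: the cache never keeps a descriptor alive by itself.
        self._cache = weakref.WeakValueDictionary()

    def descriptor(self, index):
        """Return the descriptor at `index`, reusing a live one if any."""
        if not 0 <= index < self.count:
            raise IndexError(index)
        found = self._cache.get(index)
        if found is None:
            found = Descriptor(self, index)
            self._cache[index] = found
        return found
```

With this layout everything is released by reference counting alone, at the
cost of descriptors being recreated once no caller holds them any longer.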
Thanks for any help,