[Python-ideas] Python memory allocation in embedded interpreters

Tim Lesher tlesher at gmail.com
Thu Aug 25 19:02:19 CEST 2011

Issue 3329 (http://bugs.python.org/issue3329, "API for setting the
memory allocator used by Python") called for the ability to redirect
Python's memory allocation at the lowest level from the C runtime
malloc/realloc/free to a user-supplied allocator. The consensus seemed
to turn to a set of macros, similar to the (now-deprecated) PyMem_XXXX
family of macros, that can redirect to a user-supplied allocator based
on a compile-time switch, without forcing an indirection on "normal"
builds of Python.
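
To make the macro idea concrete, here is a minimal sketch of a compile-time switch. Every name in it (DEMO_WITH_CUSTOM_ALLOCATOR, the DEMO_RAW_* macros, the demo_heap_* functions) is hypothetical and chosen only for illustration; none of them comes from CPython or from the patches attached to the issue.

```c
#include <stdlib.h>
#include <string.h>

/* All names below are hypothetical; none are part of CPython. */
#ifdef DEMO_WITH_CUSTOM_ALLOCATOR
/* Supplied by the embedding application, e.g. backed by a private heap. */
void *demo_heap_malloc(size_t size);
void *demo_heap_realloc(void *ptr, size_t size);
void demo_heap_free(void *ptr);
#define DEMO_RAW_MALLOC(size)        demo_heap_malloc(size)
#define DEMO_RAW_REALLOC(ptr, size)  demo_heap_realloc((ptr), (size))
#define DEMO_RAW_FREE(ptr)           demo_heap_free(ptr)
#else
/* "Normal" build: the macros expand straight to the C runtime calls,
   so there is no extra indirection. */
#define DEMO_RAW_MALLOC(size)        malloc(size)
#define DEMO_RAW_REALLOC(ptr, size)  realloc((ptr), (size))
#define DEMO_RAW_FREE(ptr)           free(ptr)
#endif

/* Example call site: code uses the macros instead of calling malloc directly. */
char *demo_copy_string(const char *text)
{
    size_t size = strlen(text) + 1;
    char *copy = (char *)DEMO_RAW_MALLOC(size);
    if (copy != NULL)
        memcpy(copy, text, size);
    return copy;
}
```

Because the switch is resolved by the preprocessor, a build without it compiles to exactly the same calls as before.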

Additionally, in the comments, Jukka noted that there's still an issue
with static pointers that retain their values across multiple
Py_Initialize()/Py_Finalize() invocations, which causes problems when
a program that embeds a Python interpreter wants to segregate the
memory usage of each interpreter.

That issue's been dormant for exactly two years today (happy
anniversary, I guess), but I know that at least two projects (the
Nokia S60 port and Vocollect's CE port) have had to implement the
first piece themselves, and have fought with the second.

The first one is pretty straightforward (mostly just replacing
malloc/realloc/free calls with the appropriate macro), but the second
is a little more complicated.  The typical scenario in CPython code
looks like this:

static PyObject* someDict;

PyObject* getSomeDict()
{
    if (! someDict) {
        someDict = PyDict_New();
        /* initialize someDict */
    }

    return someDict;
}

This "leaks" someDict, and worse, a later Py_Initialize()/Py_Finalize()
cycle will reuse the pointer without reallocating the object, which dies
if (in the meantime) the second Py_Initialize() uses a different
allocator (in our case, a private segregated heap to avoid fragmentation).
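
The failure mode can be simulated without CPython. The sketch below is only a model: SimObject stands in for PyObject, the sim_* functions stand in for Py_Initialize()/Py_Finalize(), and a heap id stands in for the private heap. All of those names are made up for illustration. It tags the cached object with the heap it was created in, so the stale pointer can be detected instead of crashing.

```c
#include <stddef.h>

/* Stand-in for PyObject; records which heap the object was created in. */
typedef struct {
    int heap_id;
} SimObject;

static int sim_current_heap;      /* id of the live heap, 0 = none */
static SimObject sim_storage;     /* pretend this lives in the live heap */
static SimObject *sim_some_dict;  /* static pointer that survives finalize */

/* Stand-in for Py_Initialize() with a caller-chosen private heap. */
void sim_initialize(int heap_id)
{
    sim_current_heap = heap_id;
}

/* Stand-in for Py_Finalize(): the heap goes away, but nothing resets
   sim_some_dict, which is the problem described above. */
void sim_finalize(void)
{
    sim_current_heap = 0;
}

SimObject *sim_get_some_dict(void)
{
    if (!sim_some_dict) {
        sim_storage.heap_id = sim_current_heap;
        sim_some_dict = &sim_storage;
    }
    return sim_some_dict;
}

/* True if the object belongs to a heap other than the live one. */
int sim_is_stale(const SimObject *obj)
{
    return obj->heap_id != sim_current_heap;
}
```

After sim_initialize(1), sim_finalize(), sim_initialize(2), the getter still hands back the object created under heap 1. In the real interpreter that pointer would refer to memory in a heap that no longer exists.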

One way to fix this would be to "register" static PyObject pointers so
that Py_Finalize() could reset them to NULL. The usage is pretty simple:

static PyObject* someDict;

PyObject* getSomeDict()
{
    if (! someDict) {
        someDict = PyDict_New();
        /* register &someDict so that Py_Finalize() can reset it to NULL */
        /* initialize someDict */
    }

    return someDict;
}

It's still a manual step, but I don't see an obvious way around that
in C (C++ would do registration-on-construction).
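
For discussion, here is a rough standalone sketch of what such a registry could look like. It is not our local implementation and not a CPython API: DemoObject stands in for PyObject, and demo_register_static(), demo_reset_statics() and the fixed table size are hypothetical names and choices made only for this example. The reset only NULLs the pointers; it assumes the objects themselves are released some other way (in our case, by tearing down the whole heap).

```c
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for PyObject. */
typedef struct {
    int generation;
} DemoObject;

#define DEMO_MAX_STATICS 64

static DemoObject **demo_registered[DEMO_MAX_STATICS];
static size_t demo_registered_count;

/* Remember the address of a static pointer. Returns 0 on success,
   -1 if the (fixed-size, hypothetical) table is full. */
int demo_register_static(DemoObject **slot)
{
    if (demo_registered_count >= DEMO_MAX_STATICS)
        return -1;
    demo_registered[demo_registered_count++] = slot;
    return 0;
}

/* What the finalizer would call: reset every registered pointer to NULL
   and empty the table. It does not free the objects themselves. */
void demo_reset_statics(void)
{
    for (size_t i = 0; i < demo_registered_count; i++)
        *demo_registered[i] = NULL;
    demo_registered_count = 0;
}

static DemoObject *demo_some_dict;
static int demo_generation;

DemoObject *demo_get_some_dict(void)
{
    if (!demo_some_dict) {
        DemoObject *obj = (DemoObject *)malloc(sizeof *obj);
        if (obj == NULL)
            return NULL;
        if (demo_register_static(&demo_some_dict) != 0) {
            free(obj);
            return NULL;
        }
        obj->generation = ++demo_generation;
        demo_some_dict = obj;
    }
    return demo_some_dict;
}
```

With this in place, a getter called after demo_reset_statics() sees a NULL pointer and allocates a fresh object from whatever allocator is current, rather than reusing the old one.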

Thoughts?  If it seems reasonable, I'll turn our local implementation
into a patch set to address this.

Tim Lesher <tlesher at gmail.com>
