[Numpy-discussion] Memory allocation cleanup

Nathaniel Smith njs at pobox.com
Thu Jan 9 21:48:25 EST 2014

On Thu, Jan 9, 2014 at 11:21 PM, Charles R Harris
<charlesr.harris at gmail.com> wrote:
> Apropos Julian's changes to use the PyObject_* allocation suite for some
> parts of numpy, I posted the following
> I think numpy memory management is due a cleanup. Currently we have
> PyDataMem_*
> PyDimMem_*
> PyArray_*
> Plus the malloc, PyMem_*, and PyObject_* interfaces. That is six ways to
> manage heap allocations. As far as I can tell, PyArray_* is always PyMem_*
> in practice. We probably need to keep the PyDataMem family as it has a
> memory tracking option, but PyDimMem just confuses things, I'd rather just
> use PyMem_* with explicit size. Curiously, the PyObject_Malloc family is not
> documented apart from some release notes.
> We should also check for the macro versions of PyMem_* as they are
> deprecated for extension modules.
> Nathaniel then suggested that we consider going all Python allocators,
> especially as new memory tracing tools are coming online in 3.4. Given that
> these changes could have some impact on current extension writers I thought
> I'd bring this up on the list to gather opinions.
> Thoughts?

After a bit more research, some further points to keep in mind:

Currently, PyDimMem_* and PyArray_* are just aliases for malloc/free,
and PyDataMem_* is an alias for malloc/free with some extra tracing
hooks wrapped around it. (AFAIK, these tracing hooks are not used by
anyone anywhere -- at least, if they are I haven't heard about it, and
there is no code on github that uses them.)

There is one substantial difference between the PyMem_* and PyObject_*
interfaces as compared to malloc(), which is that the Py* interfaces
require that the GIL be held when they are called. (@Julian -- I think
your PR we just merged fulfills this requirement, is that right?) I
strongly suspect that we have PyDataMem_* calls outside of the GIL --
e.g., when allocating ufunc buffers -- and third-party code might as well.
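To make the GIL constraint concrete, here is a minimal hypothetical sketch (alloc_scratch_nogil is an illustrative name, not actual NumPy code): a buffer needed inside a GIL-released region has to come from a GIL-safe allocator like plain malloc(), since calling PyMem_Malloc() there would violate the requirement:

```c
#include <Python.h>
#include <stdlib.h>

/* Hypothetical sketch: allocate scratch space while the GIL is
 * released.  PyMem_Malloc()/PyObject_Malloc() must NOT be called
 * here; plain malloc() (or a GIL-free raw allocator) is required. */
static double *alloc_scratch_nogil(size_t n)
{
    double *buf;
    Py_BEGIN_ALLOW_THREADS          /* GIL released in this region */
    buf = malloc(n * sizeof *buf);  /* safe without the GIL */
    Py_END_ALLOW_THREADS
    return buf;                     /* caller releases with free() */
}
```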

Python 3.4's new memory allocation API and tracing tools are documented here:

In particular, 3.4 adds a set of PyMem_Raw* functions (PyMem_RawMalloc
and friends), which do not require the GIL. Checking the current
source code for _tracemalloc.c, it appears that the PyMem_Raw*
functions *are* traced, so that's nice - it means that switching
PyDataMem_* to use PyMem_Raw* would be both safe and provide tracing
benefits. However, PyMem_Raw* does not get the pymalloc optimizations
for small allocations.

Also, none of the Py* interfaces implement calloc(), which is annoying
because it messes up our new optimization of using calloc() for
np.zeros. (calloc() is generally faster than malloc()+explicit
zeroing, because it can use OS-specific virtual memory tricks to zero
out the memory "for free". These same tricks also mean that if you use
np.zeros() to allocate a large array, and then only write to a few
entries in that array, the total memory used is proportional to the
number of non-zero entries, rather than to the actual size of the
array, which can be extremely useful in some situations as a kind of
"poor man's sparse array".)

I'm pretty sure that the vast majority of our allocations do occur
with GIL protection, so we might want to switch to using PyObject_*
for most cases to take advantage of the small-object optimizations,
and use PyMem_Raw* for any non-GIL cases (like possibly ufunc
buffers), with a compatibility wrapper that falls back to
malloc() on pre-3.4 pythons. Of course this will need some profiling
to see if PyObject_* is actually better than malloc() in practice. For
calloc(), we could try and convince python-dev to add this, or
np.zeros() could explicitly use calloc() even when other code uses Py*
interface and then uses an ndarray flag or special .base object to
keep track of the fact that we need to use free() to deallocate this
memory, or we could give up on the calloc optimization.
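The compatibility wrapper mentioned above could be a thin preprocessor shim along these lines (the npy_raw_* names are hypothetical, chosen just for this sketch):

```c
#include <Python.h>
#include <stdlib.h>

/* Hypothetical shim: use the GIL-free, traceable raw allocators on
 * Python >= 3.4 (where PyMem_RawMalloc/PyMem_RawFree exist), and
 * fall back to plain malloc()/free() on older pythons. */
#if PY_VERSION_HEX >= 0x03040000
#define npy_raw_malloc(n)  PyMem_RawMalloc(n)
#define npy_raw_free(p)    PyMem_RawFree(p)
#else
#define npy_raw_malloc(n)  malloc(n)
#define npy_raw_free(p)    free(p)
#endif
```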

