High-quality memory profiling for numpy in python 3.5 / volunteers needed
Hey all,

The well-known memory_profiler module [1] is super useful, but it has a fundamental limitation: the only way it can track allocations is by constantly polling the OS for the size of the total process address space. This is a crude and unreliable way of making measurements.

In Python 3.4, there's a new "allocation hooks" infrastructure that allows one to precisely track the lifetime and size of every allocation [2]. This is pretty awesome, and we can expect more tools to grow up around this interface.

Unfortunately, this system is useless for numpy right now, because numpy does not use the Python memory allocation interface; this means that numpy data is "invisible" to tracemalloc and related tools.

Why doesn't numpy use the Python memory allocation interface? Because numpy needs calloc(), but Python doesn't expose calloc(), only malloc()/realloc()/free().

Good news, though! python-dev is in favor of adding calloc() to the core allocation interfaces, which will let numpy join the party. See the python-dev thread:
https://mail.python.org/pipermail/python-dev/2014-April/133985.html

It would be especially nice if we could get this into 3.5, since it seems likely that lots of numpy users will be switching to 3.5 when it comes out, and having a good memory tracing infrastructure there waiting for them would make it even more awesome.

Anyone interested in picking this up?
http://bugs.python.org/issue21233

-n

[1] https://pypi.python.org/pypi/memory_profiler
[2] https://docs.python.org/3.4/library/tracemalloc.html

--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
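For readers who have not used the tracemalloc module cited as [2], here is a minimal sketch of what "precisely tracking" an allocation looks like for memory that does go through Python's allocator. It uses only the standard library; the helper name traced_bytes is made up for illustration. A buffer obtained directly from the C library's calloc(), which is how the message describes numpy's data at the time, would not be counted in this figure.

```python
import tracemalloc


def traced_bytes(make_object):
    """Run make_object() under tracemalloc.

    Returns (object, growth), where growth is the increase in bytes of
    traced memory between just before and just after the call.
    The helper name is hypothetical, chosen for this illustration.
    """
    tracemalloc.start()
    try:
        before, _peak = tracemalloc.get_traced_memory()
        obj = make_object()
        after, _peak = tracemalloc.get_traced_memory()
    finally:
        # Always stop tracing so the hooks do not stay installed.
        tracemalloc.stop()
    return obj, after - before


# A bytearray is allocated through Python's own allocator, so the hooks
# see it. Memory allocated outside that interface would not be counted.
size = 10 * 1024 * 1024
buf, grown = traced_bytes(lambda: bytearray(size))
print("traced growth in bytes:", grown)
```

The reported growth should be at least the size of the requested buffer, without any polling of the process address space.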
participants (5)
- Aron Ahmadia
- Francesc Alted
- Julian Taylor
- Nathaniel Smith
- R Hattersley