Enhancement of Python memory allocators

Hi,

I would like to improve the memory allocators of Python. My two use cases are replacing the memory allocators with custom allocators in embedded systems, and hooking the allocators to track memory usage.

I wrote a patch for this, and I'm going to commit it if nobody complains:
http://bugs.python.org/issue3329

Using this patch, memory corruption (buffer underflow and overflow) can be detected without recompilation. We may add an environment variable to enable the Python debug functions at runtime, for example PYDEBUGMALLOC=1. There is just one restriction: the environment variable would not be ignored with the -E command line option, because command line options are parsed after the first memory allocation.

What do you think?

*****

The patch adds the following functions:

void PyMem_GetAllocators(
    void **ctx_p,
    void* (**malloc_p) (void *ctx, size_t size),
    void* (**realloc_p) (void *ctx, void *ptr, size_t size),
    void (**free_p) (void *ctx, void *ptr));

void PyMem_SetAllocators(
    void *ctx,
    void* (*malloc) (void *ctx, size_t size),
    void* (*realloc) (void *ctx, void *ptr, size_t size),
    void (*free) (void *ctx, void *ptr));

It adds 4 similar functions (get/set) for PyObject_Malloc() and for the allocators of pymalloc arenas.

*****

For the "track usage of memory" use case, see the following project, which hooks the memory allocators using PyMem_SetAllocators() and PyObject_SetAllocators() to get the allocated bytes per filename and line number:
https://pypi.python.org/pypi/pytracemalloc

*****

Another issue proposes to use VirtualAlloc() and VirtualFree() for pymalloc arenas, see:
http://bugs.python.org/issue13483

I don't know if it would be interesting, but it would now be possible to choose the memory allocator (malloc, mmap, HeapAlloc, VirtualAlloc, ...) at runtime, with an environment variable for example.

Victor
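To make the get/set pattern concrete, here is a minimal sketch of how a hook could be layered on top of whatever allocator is currently installed. It does not link against CPython: the Sketch_* functions, the counting_hook type and install_counting_hook() are hypothetical stand-ins that only mirror the shape of the signatures quoted above, and the hook merely counts calls rather than tracking bytes per filename and line number as pytracemalloc does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins: this sketch does not link against CPython.
   The Sketch_* names only mirror the shape of the proposed functions. */
typedef void *(*malloc_fn)(void *ctx, size_t size);
typedef void *(*realloc_fn)(void *ctx, void *ptr, size_t size);
typedef void (*free_fn)(void *ctx, void *ptr);

static void *sys_malloc(void *ctx, size_t size) { (void)ctx; return malloc(size); }
static void *sys_realloc(void *ctx, void *ptr, size_t size) { (void)ctx; return realloc(ptr, size); }
static void sys_free(void *ctx, void *ptr) { (void)ctx; free(ptr); }

/* The currently installed allocator, initially the C library one. */
static struct {
    void *ctx;
    malloc_fn do_malloc;
    realloc_fn do_realloc;
    free_fn do_free;
} current = { NULL, sys_malloc, sys_realloc, sys_free };

void Sketch_GetAllocators(void **ctx_p, malloc_fn *malloc_p,
                          realloc_fn *realloc_p, free_fn *free_p)
{
    *ctx_p = current.ctx;
    *malloc_p = current.do_malloc;
    *realloc_p = current.do_realloc;
    *free_p = current.do_free;
}

void Sketch_SetAllocators(void *ctx, malloc_fn m, realloc_fn r, free_fn f)
{
    current.ctx = ctx;
    current.do_malloc = m;
    current.do_realloc = r;
    current.do_free = f;
}

/* Entry points that always dispatch through the installed allocator. */
void *Sketch_Malloc(size_t size) { return current.do_malloc(current.ctx, size); }
void Sketch_Free(void *ptr) { current.do_free(current.ctx, ptr); }

/* Hook state: the previous allocator plus two counters. */
typedef struct {
    void *prev_ctx;
    malloc_fn prev_malloc;
    realloc_fn prev_realloc;
    free_fn prev_free;
    size_t malloc_calls;
    size_t free_calls;
} counting_hook;

static void *hook_malloc(void *ctx, size_t size)
{
    counting_hook *h = ctx;
    void *ptr = h->prev_malloc(h->prev_ctx, size);
    if (ptr != NULL)
        h->malloc_calls++;
    return ptr;
}

static void *hook_realloc(void *ctx, void *ptr, size_t size)
{
    counting_hook *h = ctx;
    return h->prev_realloc(h->prev_ctx, ptr, size);
}

static void hook_free(void *ctx, void *ptr)
{
    counting_hook *h = ctx;
    if (ptr != NULL)
        h->free_calls++;
    h->prev_free(h->prev_ctx, ptr);
}

/* Save the current allocator in the hook, then install the hook on top. */
void install_counting_hook(counting_hook *h)
{
    Sketch_GetAllocators(&h->prev_ctx, &h->prev_malloc,
                         &h->prev_realloc, &h->prev_free);
    h->malloc_calls = 0;
    h->free_calls = 0;
    Sketch_SetAllocators(h, hook_malloc, hook_realloc, hook_free);
}
```

The ctx pointer is what lets the hook carry its own state (here, the saved allocator and the counters) without global variables, which is presumably why every callback in the proposed signatures takes it as first argument.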

On 13 Jun 2013 09:09, "Victor Stinner" <victor.stinner@gmail.com> wrote:
Using this patch, detecting memory corruptions (buffer underflow and overflow) can be done without recompilation. We may add an environment variable to enable Python debug functions at runtime, example: PYDEBUGMALLOC=1. There is just a restriction: the environment variable would not be ignored with -E command line option, because command line options are parsed after the first memory allocation. What do you think?
The rest of it sounds fine, but please don't add the runtime switching support to our existing main function. Interpreter startup is a mess already. If you were interested in helping directly with PEP 432, though, that would be good - I haven't been able to spend much time on it lately.
Cheers,
Nick.

2013/6/13 Nick Coghlan <ncoghlan@gmail.com>:
The rest of it sounds fine, but please don't add the runtime switching support to our existing main function. Interpreter startup is a mess already. If you were interested in helping directly with PEP 432, though, that would be good - I haven't been able to spend much time on it lately.
I proposed an environment variable to solve the following issue: when memory allocators are replaced with custom allocators, debug hooks cannot be used. Debug hooks must be set before the first memory allocation.
Another option is to add a new function (e.g. PyMem_SetDebugHook()) to explicitly install the debug hooks, so that it can be called after PyMem_SetAllocators() and before the first memory allocation.
Victor
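As an illustration of what such a debug hook might do, here is a minimal sketch of a guard-byte wrapper around an arbitrary inner allocator. Everything in it is an assumption made for the example (the debug_hook type, the function names, the guard size and the marker value); it is not the code of the patch. It also shows why, at least in this sketch, the hook has to be in place before the first allocation: debug_free() expects a header in front of every block, so a block obtained before the hook was installed would be misread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define HEADER_SIZE 16   /* stored size + front guard; keeps user data aligned */
#define GUARD_SIZE  8    /* guard bytes on each side (arbitrary choice) */
#define GUARD_BYTE  0xFB /* arbitrary marker value for this sketch */

/* Hypothetical hook state: the wrapped allocator and an error counter. */
typedef struct {
    void *inner_ctx;
    void *(*inner_malloc)(void *ctx, size_t size);
    void (*inner_free)(void *ctx, void *ptr);
    size_t errors;
} debug_hook;

/* Inner allocator used by the example: the C library one. */
void *sys_malloc(void *ctx, size_t size) { (void)ctx; return malloc(size); }
void sys_free(void *ctx, void *ptr) { (void)ctx; free(ptr); }

/* Block layout: [size | front guard][user data][back guard]. */
void *debug_malloc(void *ctx, size_t size)
{
    debug_hook *h = ctx;
    unsigned char *raw;

    if (size > SIZE_MAX - HEADER_SIZE - GUARD_SIZE)
        return NULL;
    raw = h->inner_malloc(h->inner_ctx, HEADER_SIZE + size + GUARD_SIZE);
    if (raw == NULL)
        return NULL;
    memcpy(raw, &size, sizeof size);
    memset(raw + HEADER_SIZE - GUARD_SIZE, GUARD_BYTE, GUARD_SIZE);
    memset(raw + HEADER_SIZE + size, GUARD_BYTE, GUARD_SIZE);
    return raw + HEADER_SIZE;
}

/* Return 0 if both guards are intact, -1 on underflow, 1 on overflow. */
int debug_check(const void *ptr)
{
    const unsigned char *user = ptr;
    const unsigned char *raw = user - HEADER_SIZE;
    size_t size, i;

    memcpy(&size, raw, sizeof size);
    for (i = 0; i < GUARD_SIZE; i++) {
        if (raw[HEADER_SIZE - GUARD_SIZE + i] != GUARD_BYTE)
            return -1;
    }
    for (i = 0; i < GUARD_SIZE; i++) {
        if (user[size + i] != GUARD_BYTE)
            return 1;
    }
    return 0;
}

/* Check the guards, record any corruption, then release the whole block. */
void debug_free(void *ctx, void *ptr)
{
    debug_hook *h = ctx;

    if (ptr == NULL)
        return;
    if (debug_check(ptr) != 0)
        h->errors++;
    h->inner_free(h->inner_ctx, (unsigned char *)ptr - HEADER_SIZE);
}
```

A real debug hook would more likely report the corrupted block and abort; the sketch only counts errors so that its behaviour is easy to check. Because the inner allocator is reached through function pointers and a context, the same wrapper works whether the inner allocator is the default one or a custom one installed by the application.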

On 13 Jun 2013 10:09, "Victor Stinner" <victor.stinner@gmail.com> wrote:
I proposed an environment variable to solve the following issue: when memory allocators are replaced with custom allocators, debug hooks cannot be used. Debug hooks must be set before the first memory allocation.
Another option is to add a new function (e.g. PyMem_SetDebugHook()) to explicitly install the debug hooks, so that it can be called after PyMem_SetAllocators() and before the first memory allocation.
Yes, that sounds better. One of the biggest problems with the current startup sequence is the way it relies on environment variables for configuration, which makes life hard for other applications that want to embed the CPython runtime.
Cheers,
Nick.

2013/6/13 Nick Coghlan <ncoghlan@gmail.com>:
Yes, that sounds better. One of the biggest problems with the current startup sequence is the way it relies on environment variables for configuration, which makes life hard for other applications that want to embed the CPython runtime.
I wrote a new patch (attached to issue #3329) adding a new PyMem_SetupDebugHooks() function. So if an application replaces the Python memory allocator functions, it can still call PyMem_SetupDebugHooks() to benefit from the Python debug hooks, which detect memory bugs. The function does nothing if the hooks are already installed or if Python is not compiled in debug mode.
With this function, the API is now complete for all use cases. PEP 432 helps to configure an embedded Python, but the new "Set" functions (e.g. PyMem_SetAllocators) are still needed for my pytracemalloc module, which installs hooks at runtime, when Python is fully initialized (the hooks can be installed at any time). pytracemalloc is just an example; you may use PyMem_SetAllocators for other debugging or performance purposes.
With my patch, allocator functions like PyMem_Malloc() are no longer macros, and are always the same function. This helps the stable ABI: C extension modules do not need to be recompiled to benefit from the debug hooks ;-)
Victor
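The "does nothing if hooks are already installed" behaviour can be sketched as follows. This is only an illustration under stated assumptions: the allocator type, the Sketch_* functions and app_malloc()/app_free() are hypothetical names, realloc is left out for brevity, the debug-build check is omitted, and the sketch assumes the setup function is called after the application has replaced the allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an allocator slot; not the CPython definition. */
typedef struct {
    void *ctx;
    void *(*do_malloc)(void *ctx, size_t size);
    void (*do_free)(void *ctx, void *ptr);
} allocator;

static void *sys_malloc(void *ctx, size_t size) { (void)ctx; return malloc(size); }
static void sys_free(void *ctx, void *ptr) { (void)ctx; free(ptr); }

static allocator current = { NULL, sys_malloc, sys_free };
static allocator wrapped;        /* allocator the hooks forward to */
static int hooks_installed = 0;
size_t hooked_mallocs = 0;       /* how many mallocs went through the hook */

static void *hook_malloc(void *ctx, size_t size)
{
    allocator *inner = ctx;
    hooked_mallocs++;            /* a real hook would add its checks here */
    return inner->do_malloc(inner->ctx, size);
}

static void hook_free(void *ctx, void *ptr)
{
    allocator *inner = ctx;
    inner->do_free(inner->ctx, ptr);
}

/* Replace the allocator, as an embedding application might do. */
void Sketch_SetAllocators(void *ctx,
                          void *(*do_malloc)(void *ctx, size_t size),
                          void (*do_free)(void *ctx, void *ptr))
{
    current.ctx = ctx;
    current.do_malloc = do_malloc;
    current.do_free = do_free;
}

/* Wrap the current allocator once; later calls do nothing. */
void Sketch_SetupDebugHooks(void)
{
    if (hooks_installed)
        return;
    wrapped = current;
    current.ctx = &wrapped;
    current.do_malloc = hook_malloc;
    current.do_free = hook_free;
    hooks_installed = 1;
}

/* Always the same functions: callers never see which allocator is active. */
void *Sketch_Malloc(size_t size) { return current.do_malloc(current.ctx, size); }
void Sketch_Free(void *ptr) { current.do_free(current.ctx, ptr); }

/* Example application allocator: counts calls in the size_t ctx points to. */
void *app_malloc(void *ctx, size_t size) { ++*(size_t *)ctx; return malloc(size); }
void app_free(void *ctx, void *ptr) { (void)ctx; free(ptr); }
```

Because Sketch_Malloc() is an ordinary function that dispatches through the current slot, code compiled against it picks up both the application allocator and the hooks without being rebuilt, which is the point made above about macros and the stable ABI.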
participants (2)
- Nick Coghlan
- Victor Stinner