On Thu, 2021-07-22 at 16:48 +0000, Daniel Waddington wrote:
Hi, I'm working with NumPy in the context of supporting different memory types, such as persistent memory and CXL-attached memory. I would like to propose a minor change, but figured I would get some initial feedback from the developer community before submitting a PR.
Hi Daniel,

you may want to have a look at Matti's NEP to allow custom allocation strategies:

https://numpy.org/neps/nep-0049.html

When implemented, this will allow you to explicitly modify the behaviour here (which means you could make it use the Python version). In principle, once that work is done, we could also use the Python allocator as you are proposing. That may be a follow-up discussion.

The difficulty is that the NumPy ABI is fully open:

1. A user can create an array with data they allocated
2. In theory, a user could `realloc` or even replace an array's `data`

In practice, hopefully nobody does the second one, but we can't be sure.

The first means we have to wait for the NEP, because it will allow us to work around the problem: we can use a different `free`/`realloc` if a user provided the data.

The second means that we have to be careful when considering a change to the default, even after implementing the NEP. But it may be possible, at least if we do it slowly/gently.

Cheers,

Sebastian
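To make the first point concrete, here is a minimal standalone sketch of why the array has to remember which deallocator matches its buffer. It is a toy model only: `toy_array`, `custom_alloc`, `custom_free` and the other names are hypothetical and are not NumPy's real structures or the NEP's API. The only assumption taken from the discussion above is that a different `free` can be used when the user provided the data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an array object; NOT NumPy's real struct. */
typedef struct {
    void *data;
    size_t nbytes;
    /* Deallocator that matches how `data` was obtained.
     * NULL means the buffer is borrowed and must not be freed here. */
    void (*free_fn)(void *);
} toy_array;

/* Hypothetical user allocator pair; the counter exists only to show
 * which deallocator actually ran. */
static int custom_free_calls = 0;
static void *custom_alloc(size_t n) { return malloc(n); }
static void custom_free(void *p) { custom_free_calls++; free(p); }

/* Array whose buffer the library allocated itself. */
static toy_array toy_array_new(size_t nbytes)
{
    toy_array a = { malloc(nbytes), nbytes, free };
    return a;
}

/* Array wrapping a user-provided buffer plus the user's deallocator. */
static toy_array toy_array_from_data(void *data, size_t nbytes,
                                     void (*free_fn)(void *))
{
    toy_array a = { data, nbytes, free_fn };
    return a;
}

/* Release the buffer with the deallocator recorded at creation time,
 * rather than with whatever the library default happens to be now. */
static void toy_array_dealloc(toy_array *a)
{
    if (a->free_fn != NULL) {
        a->free_fn(a->data);
    }
    a->data = NULL;
    a->nbytes = 0;
}
```

With this shape, changing the default used by `toy_array_new` does not affect buffers created through `toy_array_from_data(custom_alloc(32), 32, custom_free)`. The sketch does not protect against the second point, where someone swaps `data` behind the array's back.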
In multiarray/alloc.c, the allocator (beneath the cache) uses the POSIX malloc/calloc/realloc/free. I propose that these be changed to their PyMem_RawXXX equivalents. The reason is that one can then use the Python custom allocator functions (e.g. PyMem_GetAllocator/PyMem_SetAllocator) to intercept the memory allocator for NumPy arrays. This will be useful as heterogeneous memories need supporting. I don't think this will drastically change performance, but it is an extra function redirection (and it will only have an impact when the cache can't deliver).

There are likely other places in NumPy that could do with a rinse and repeat; maybe someone could advise?

Thanks,
Daniel Waddington
IBM Research

---
Example patch for 1.19.x (I'm building with Python 3.6)

diff --git a/numpy/core/src/multiarray/alloc.c b/numpy/core/src/multiarray/alloc.c
index 795fc7315..e9e888478 100644
--- a/numpy/core/src/multiarray/alloc.c
+++ b/numpy/core/src/multiarray/alloc.c
@@ -248,7 +248,7 @@ PyDataMem_NEW(size_t size)
     void *result;
 
     assert(size != 0);
-    result = malloc(size);
+    result = PyMem_RawMalloc(size);
     if (_PyDataMem_eventhook != NULL) {
         NPY_ALLOW_C_API_DEF
         NPY_ALLOW_C_API
@@ -270,7 +270,7 @@ PyDataMem_NEW_ZEROED(size_t size, size_t elsize)
 {
     void *result;
 
-    result = calloc(size, elsize);
+    result = PyMem_RawCalloc(size, elsize);
     if (_PyDataMem_eventhook != NULL) {
         NPY_ALLOW_C_API_DEF
         NPY_ALLOW_C_API
@@ -291,7 +291,7 @@ NPY_NO_EXPORT void
 PyDataMem_FREE(void *ptr)
 {
     PyTraceMalloc_Untrack(NPY_TRACE_DOMAIN, (npy_uintp)ptr);
-    free(ptr);
+    PyMem_RawFree(ptr);
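For readers unfamiliar with the interception mechanism, the following is a minimal standalone model of it, written without Python.h so it compiles on its own. It mimics the general shape of CPython's PyMemAllocatorEx (a context pointer plus function pointers) and of the PyMem_GetAllocator/PyMem_SetAllocator pair, but every `toy_` and `counting_` name is hypothetical and the code is not CPython's implementation. It shows where the "extra function redirection" comes from and how a hook can wrap the previous allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator table, loosely shaped like PyMemAllocatorEx
 * (only malloc/free are modelled to keep the sketch short). */
typedef struct {
    void *ctx;
    void *(*malloc_fn)(void *ctx, size_t size);
    void (*free_fn)(void *ctx, void *ptr);
} toy_allocator;

/* Default entries simply forward to the C library. */
static void *default_malloc(void *ctx, size_t size) { (void)ctx; return malloc(size); }
static void default_free(void *ctx, void *ptr) { (void)ctx; free(ptr); }

static toy_allocator current = { NULL, default_malloc, default_free };

/* Stand-ins for the get/set allocator pair. */
static void toy_get_allocator(toy_allocator *out) { *out = current; }
static void toy_set_allocator(const toy_allocator *in) { current = *in; }

/* What library code would call instead of malloc/free directly:
 * one extra indirect call through the currently installed table. */
static void *toy_raw_malloc(size_t size) { return current.malloc_fn(current.ctx, size); }
static void toy_raw_free(void *ptr) { current.free_fn(current.ctx, ptr); }

/* A hook that counts calls and delegates to the allocator it replaced. */
typedef struct {
    toy_allocator inner;
    size_t n_malloc;
    size_t n_free;
} counting_ctx;

static void *counting_malloc(void *ctx, size_t size)
{
    counting_ctx *c = ctx;
    c->n_malloc++;
    return c->inner.malloc_fn(c->inner.ctx, size);
}

static void counting_free(void *ctx, void *ptr)
{
    counting_ctx *c = ctx;
    c->n_free++;
    c->inner.free_fn(c->inner.ctx, ptr);
}

/* Save the current table inside the hook's context, then install the hook.
 * The context must outlive every allocation made through it. */
static void install_counting(counting_ctx *c)
{
    toy_get_allocator(&c->inner);
    c->n_malloc = 0;
    c->n_free = 0;
    toy_allocator hook = { c, counting_malloc, counting_free };
    toy_set_allocator(&hook);
}
```

After `install_counting(&c)`, every `toy_raw_malloc`/`toy_raw_free` call passes through the hook. A heterogeneous-memory allocator would replace the delegation to `inner` with its own placement logic.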