I have submitted NEP 49 to enable user-defined allocation strategies for the ndarray.data homogeneous memory area. The implementation is in PR 17582 https://github.com/numpy/numpy/pull/17582 Here is the text of the NEP:

Abstract
--------

The ``numpy.ndarray`` requires additional memory allocations to hold ``numpy.ndarray.strides``, ``numpy.ndarray.shape`` and ``numpy.ndarray.data`` attributes. These attributes are specially allocated after creating the Python object in the ``__new__`` method. The ``strides`` and ``shape`` are stored in a piece of memory allocated internally.

This NEP proposes a mechanism to override the memory management strategy used for ``ndarray->data`` with user-provided alternatives. This allocation holds the array's data and can be very large. As accessing this data often becomes a performance bottleneck, custom allocation strategies to guarantee data alignment or pin allocations to specialized memory hardware can enable hardware-specific optimizations.

Motivation and Scope
--------------------

Users may wish to override the internal data memory routines with ones of their own. Two such use-cases are to ensure data alignment and to pin certain allocations to certain NUMA cores.

Users who wish to change the NumPy data memory management routines will use :c:func:`PyDataMem_SetHandler`, which uses a :c:type:`PyDataMem_Handler` structure to hold pointers to functions used to manage the data memory. The calls are wrapped by internal routines to call :c:func:`PyTraceMalloc_Track`, :c:func:`PyTraceMalloc_Untrack`, and will use the :c:func:`PyDataMem_EventHookFunc` mechanism already present in NumPy for auditing purposes.

Since a call to ``PyDataMem_SetHandler`` will change the default functions, but that function may be called during the lifetime of an ``ndarray`` object, each ``ndarray`` will carry with it the ``PyDataMem_Handler`` struct used at the time of its instantiation, and this struct will be used to reallocate or free the data memory of the instance. Internally, NumPy may use ``memcpy`` or ``memset`` on the data ``ptr``.

Usage and Impact
----------------

The new functions can only be accessed via the NumPy C-API. An example is included later in the NEP. The added ``struct`` will increase the size of the ``ndarray`` object. It is one of the major drawbacks of this approach. We can be reasonably sure that the change in size will have a minimal impact on end-user code because NumPy version 1.20 already changed the object size.

Backward compatibility
----------------------

The design will not break backward compatibility. Projects that were assigning to the ``ndarray->data`` pointer were already breaking the current memory management strategy (backed by ``npy_alloc_cache``) and should restore ``ndarray->data`` before calling ``Py_DECREF``. As mentioned above, the change in size should not impact end-users.

Matti
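To make the handler mechanism concrete, here is a minimal sketch of what a user-supplied data-memory handler could look like. The struct layout, field names, and the exact ``PyDataMem_SetHandler`` signature shown here are assumptions for illustration only (the real interface is whatever the NEP/PR defines), and the example assumes a POSIX platform for ``posix_memalign``:

    /* Illustrative sketch only: the actual PyDataMem_Handler layout and the
     * PyDataMem_SetHandler signature are defined by the NEP/PR; the struct
     * and functions below are hypothetical stand-ins. */
    #define _POSIX_C_SOURCE 200112L  /* for posix_memalign */
    #include <stdlib.h>
    #include <string.h>

    /* Hypothetical handler: a bundle of function pointers for the data memory. */
    typedef struct {
        char  name[64];                                /* for diagnostics */
        void *(*alloc)(size_t size);
        void *(*zeroed_alloc)(size_t nelems, size_t elsize);
        void  (*free)(void *ptr, size_t size);
        /* a real handler would also need a realloc-style hook; omitted here */
    } ExampleDataMemHandler;

    /* Allocate 64-byte aligned memory, e.g. to match cache-line/SIMD width. */
    static void *alloc_aligned64(size_t size)
    {
        void *p = NULL;
        if (posix_memalign(&p, 64, size) != 0) {
            return NULL;
        }
        return p;
    }

    static void *calloc_aligned64(size_t nelems, size_t elsize)
    {
        size_t size = nelems * elsize;   /* overflow check omitted for brevity */
        void *p = alloc_aligned64(size);
        if (p != NULL) {
            memset(p, 0, size);
        }
        return p;
    }

    static void free_aligned64(void *ptr, size_t size)
    {
        (void)size;   /* the size is passed for allocators that need it */
        free(ptr);
    }

    static ExampleDataMemHandler aligned_handler = {
        "aligned64", alloc_aligned64, calloc_aligned64, free_aligned64
    };

    /* Registration would go through the proposed C-API entry point, roughly:
     *     old = PyDataMem_SetHandler(&aligned_handler);
     *     ... create and use arrays ...
     *     PyDataMem_SetHandler(old);    restore the previous strategy
     */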
On Tue, Apr 20, 2021 at 2:18 PM Matti Picus <matti.picus@gmail.com> wrote:
I have submitted NEP 49 to enable user-defined allocation strategies for the ndarray.data homogeneous memory area. The implementation is in PR 17582 https://github.com/numpy/numpy/pull/17582 Here is the text of the NEP:
Thanks Matti!
Abstract
--------
The ``numpy.ndarray`` requires additional memory allocations to hold ``numpy.ndarray.strides``, ``numpy.ndarray.shape`` and ``numpy.ndarray.data`` attributes. These attributes are specially allocated after creating the Python object in the ``__new__`` method. The ``strides`` and ``shape`` are stored in a piece of memory allocated internally.
This NEP proposes a mechanism to override the memory management strategy used for ``ndarray->data`` with user-provided alternatives. This allocation holds the array's data and can be very large. As accessing this data often becomes a performance bottleneck, custom allocation strategies to guarantee data alignment or pin allocations to specialized memory hardware can enable hardware-specific optimizations.
Motivation and Scope
--------------------
Users may wish to override the internal data memory routines with ones of their own. Two such use-cases are to ensure data alignment and to pin certain allocations to certain NUMA cores.
It would be great to expand a bit on these two sentences, and add some links. There's a lot of history here in NumPy development to refer to as well:

https://numpy-discussion.scipy.narkive.com/MvmMkJcK/numpy-arrays-data-alloca...
http://numpy-discussion.10968.n7.nabble.com/Aligned-configurable-memory-allo...
http://numpy-discussion.10968.n7.nabble.com/Numpy-s-policy-for-releasing-mem...
https://github.com/numpy/numpy/issues/5312
https://github.com/numpy/numpy/issues/14177

There must also be a good amount of ideas/discussion elsewhere. https://bugs.python.org/issue18835 discussed an aligned allocator for Python itself, with fairly detailed discussion about whether/how NumPy could benefit, with (I think) the conclusion that it shouldn't be in Python, but that NumPy/Arrow/others are better off doing their own thing.

I'm wondering if improved memory profiling is a use case as well? Fil (https://github.com/pythonspeed/filprofiler) for example seems to use such a strategy: https://github.com/pythonspeed/filprofiler/blob/master/design/allocator-over...

Does it interact with our tracemalloc support (https://numpy.org/doc/stable/release/1.13.0-notes.html#support-for-tracemall...)?
Users who wish to change the NumPy data memory management routines will use :c:func:`PyDataMem_SetHandler`, which uses a :c:type:`PyDataMem_Handler` structure to hold pointers to functions used to manage the data memory. The calls are wrapped by internal routines to call :c:func:`PyTraceMalloc_Track`, :c:func:`PyTraceMalloc_Untrack`, and will use the :c:func:`PyDataMem_EventHookFunc` mechanism already present in NumPy for auditing purposes.

Since a call to ``PyDataMem_SetHandler`` will change the default functions, but that function may be called during the lifetime of an ``ndarray`` object, each ``ndarray`` will carry with it the ``PyDataMem_Handler`` struct used at the time of its instantiation, and this struct will be used to reallocate or free the data memory of the instance. Internally, NumPy may use ``memcpy`` or ``memset`` on the data ``ptr``.

This is design, not motivation or scope. Try to not refer to specific function names in this section. I suggest moving this content to the "Detailed design" section (or better, a "high level design" at the start of that section).

Cheers, Ralf
Usage and Impact
----------------
The new functions can only be accessed via the NumPy C-API. An example is included later in the NEP. The added ``struct`` will increase the size of the ``ndarray`` object. It is one of the major drawbacks of this approach. We can be reasonably sure that the change in size will have a minimal impact on end-user code because NumPy version 1.20 already changed the object size.
Backward compatibility
----------------------
The design will not break backward compatibility. Projects that were assigning to the ``ndarray->data`` pointer were already breaking the current memory management strategy (backed by ``npy_alloc_cache``) and should restore ``ndarray->data`` before calling ``Py_DECREF``. As mentioned above, the change in size should not impact end-users.
Matti
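As a concrete illustration of the backward-compatibility point in the quoted text, here is a hedged sketch of the existing (discouraged) pattern of swapping out ``ndarray->data`` and restoring it before ``Py_DECREF``. The helper name ``borrow_data_buffer`` is hypothetical; ``PyArray_BYTES`` and the ``PyArrayObject_fields`` struct are the real NumPy C-API names:

    /* Sketch of the (discouraged) pattern the backward-compatibility section
     * refers to: temporarily pointing ndarray->data at a caller-owned buffer.
     * The original pointer must be restored before Py_DECREF so that the
     * allocator that created the buffer (npy_alloc_cache today, the per-array
     * handler under NEP 49) is also the one that frees it. */
    #include <Python.h>
    #include <numpy/arrayobject.h>

    static void borrow_data_buffer(PyArrayObject *arr, char *my_buffer)
    {
        /* save NumPy's own allocation */
        char *original = PyArray_BYTES(arr);

        /* direct assignment bypasses NumPy's memory management */
        ((PyArrayObject_fields *)arr)->data = my_buffer;

        /* ... use the array with the substituted buffer ... */

        /* restore before the array can be deallocated */
        ((PyArrayObject_fields *)arr)->data = original;
    }
    /* Only after `data` is restored is it safe to Py_DECREF((PyObject *)arr). */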
See my comments interspersed in Ralf's reply. Thanks for the additional context. Matti

On 21/4/21 3:10 am, Ralf Gommers wrote:
...
Motivation and Scope
--------------------
Users may wish to override the internal data memory routines with ones of their own. Two such use-cases are to ensure data alignment and to pin certain allocations to certain NUMA cores.
It would be great to expand a bit on these two sentences, and add some links. There's a lot of history here in NumPy development to refer to as well:
https://numpy-discussion.scipy.narkive.com/MvmMkJcK/numpy-arrays-data-alloca...
http://numpy-discussion.10968.n7.nabble.com/Aligned-configurable-memory-allo...
http://numpy-discussion.10968.n7.nabble.com/Numpy-s-policy-for-releasing-mem...
https://github.com/numpy/numpy/issues/5312
https://github.com/numpy/numpy/issues/14177
There must also be a good amount of ideas/discussion elsewhere.
I added more context to this section, trying to focus on the large data allocations in NumPy.
https://bugs.python.org/issue18835 discussed an aligned allocator for Python itself, with fairly detailed discussion about whether/how NumPy could benefit, with (I think) the conclusion that it shouldn't be in Python, but that NumPy/Arrow/others are better off doing their own thing.
I'm wondering if improved memory profiling is a use case as well? Fil (https://github.com/pythonspeed/filprofiler) for example seems to use such a strategy: https://github.com/pythonspeed/filprofiler/blob/master/design/allocator-over...
Thanks. I added a sentence about this as well.
Does it interact with our tracemalloc support (https://numpy.org/doc/stable/release/1.13.0-notes.html#support-for-tracemalloc-in-python-3-6)?
I added a sentence about this. The new C-API wrapper functions preserve the current status vis-a-vis tracemalloc support. I am not sure that support is complete. The NEP should not change the situation for better or worse.
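To illustrate what "wrapped by internal routines" means for tracemalloc, here is a schematic sketch: the wrapper forwards to the handler's allocator and then reports the pointer via ``PyTraceMalloc_Track`` / ``PyTraceMalloc_Untrack``. The domain constant and the ``example``/``EXAMPLE`` names are illustrative, not the actual internals in the PR, which also run the ``PyDataMem_EventHookFunc`` hooks:

    /* Schematic only: how an internal wrapper could combine a handler's
     * allocation function with CPython's tracemalloc tracking. */
    #include <Python.h>
    #include <stdint.h>

    #define EXAMPLE_TRACE_DOMAIN 389047u   /* illustrative tracemalloc domain */

    typedef void *(*example_alloc_fn)(size_t size);
    typedef void  (*example_free_fn)(void *ptr);

    static void *wrapped_data_alloc(example_alloc_fn user_alloc, size_t size)
    {
        void *ptr = user_alloc(size);               /* the handler's allocator */
        if (ptr != NULL) {
            /* report the allocation so Python's tracemalloc sees it */
            PyTraceMalloc_Track(EXAMPLE_TRACE_DOMAIN, (uintptr_t)ptr, size);
        }
        return ptr;
    }

    static void wrapped_data_free(example_free_fn user_free, void *ptr)
    {
        PyTraceMalloc_Untrack(EXAMPLE_TRACE_DOMAIN, (uintptr_t)ptr);
        user_free(ptr);
    }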
Users who wish to change the NumPy data memory management routines will use
This is design, not motivation or scope. Try to not refer to specific function names in this section. I suggest moving this content to the "Detailed design" section (or better, a "high level design" at the start of that section).
Done.
Cheers, Ralf