> Yes, sorry, had been a while since I had looked it up:

That `PyMemAllocatorEx` looks almost exactly like one of the two variants I was proposing. Is there a reason for wanting to define our own structure vs just using that one?
I think the NEP should at least offer a brief comparison to that structure, even if we ultimately end up not using it.
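For reference, this is the struct CPython already declares (in `Include/cpython/pymem.h`); it is reproduced here standalone so the comparison is concrete, with no Python headers required. Note that every function pointer receives the `ctx` as its first argument:

```c
#include <stddef.h>

/* Copied from CPython's pymem.h for comparison purposes. */
typedef struct {
    /* user context passed as the first argument to the four functions */
    void *ctx;

    /* allocate a memory block */
    void *(*malloc)(void *ctx, size_t size);

    /* allocate a memory block initialized with zeros */
    void *(*calloc)(void *ctx, size_t nelem, size_t elsize);

    /* allocate or resize a memory block */
    void *(*realloc)(void *ctx, void *ptr, size_t new_size);

    /* release a memory block */
    void (*free)(void *ctx, void *ptr);
} PyMemAllocatorEx;
```

It is installed and retrieved via `PyMem_SetAllocator` / `PyMem_GetAllocator`, which is also the mechanism that makes chaining possible.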

> That all looks like it can be customized in theory. But I am not sure
> that it is practical, except for hooking and calling the previous one.

Is chaining allocators not likely something we want to support too? For instance, an allocator that is used for large arrays, but falls back to the previous one for small arrays?

> I have to say it feels a bit
> like exposing things publicly, that are really mainly used internally,
> but not sure...  Presumably Python uses the `ctx` for something though.

I'd argue `ctx` / `baton` / `user_data` arguments are an essential part of any C callback API.
I can't find any particularly good reference for this right now, but I have been bitten multiple times by C APIs that forget to add this argument.

>  If someone wants a different strategy (i.e. different alignment) they create a new policy

The crux of the problem here is that without very nasty hacks, C and C++ do not allow new functions to be created at runtime.
This makes it very awkward to write a parameterizable allocator. If you want to create two aligned allocators with different alignments, and you don't have a `ctx` argument to plumb through that alignment information, you're forced to write the entire thing twice.
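A sketch of what `ctx` buys here: one over-align-and-stash implementation, parameterized by an alignment carried in the context, so any number of alignment policies share the same two functions. The signatures assume the ctx-style API; the struct and function names are made up for illustration:

```c
#include <stdint.h>
#include <stdlib.h>

/* Illustrative context: the only state this allocator needs. */
typedef struct { size_t alignment; /* must be a power of two */ } align_ctx;

static void *aligned_malloc(void *vctx, size_t size) {
    size_t align = ((align_ctx *)vctx)->alignment;
    /* over-allocate, leaving room to stash the original pointer */
    void *raw = malloc(size + align + sizeof(void *));
    if (raw == NULL) return NULL;
    /* round up past the stash slot to the requested alignment */
    uintptr_t p = ((uintptr_t)raw + sizeof(void *) + align - 1)
                  & ~(uintptr_t)(align - 1);
    ((void **)p)[-1] = raw;  /* remember the original pointer for free() */
    return (void *)p;
}

static void aligned_free(void *vctx, void *ptr) {
    (void)vctx;
    if (ptr != NULL) free(((void **)ptr)[-1]);
}

/* Two policies with different alignments reuse the exact same functions: */
static align_ctx a16 = { 16 };
static align_ctx a64 = { 64 };
```

With the NEP's ctx-less signatures, `a16` and `a64` would instead require two separately compiled copies of `aligned_malloc`, one per alignment.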

> I guess the C++ similarity may be a reason, but I am not familiar with that.

Similarity isn't the only motivation; I was also considering compatibility. Consider a user who has already written a shiny stateful C++ allocator and wants to use it with numpy.
I've made a gist at https://gist.github.com/eric-wieser/6d0fde53fc1ba7a2fa4ac208467f2ae5 which demonstrates how to hook an arbitrary C++ allocator into this new numpy allocator API, comparing the NEP version against a version with an added `ctx` argument.
The NEP version has a bug that is very hard to fix without duplicating the entire `numpy_handler_from_cpp_allocator` function.

If compatibility with C++ seems too much of a stretch, the NEP API is not even compatible with `PyMemAllocatorEx`.

> But right now the proposal says this is static, and I honestly don't
> see much reason for it to be freeable?  The current use-cases `cupy` or
> `pnumpy` don't seem to need it.

I don't know much about either of these use cases, so the following is speculative.
In cupy, presumably the application is to tie allocation to a specific GPU device.
Presumably then, somewhere in the python code there is a handle to a GPU object, through which the allocators operate.
If that handle is stored in the allocator, and the allocator is freeable, then it is possible to write code that automatically releases the GPU handle after the allocator has been restored to the default and the last array using it is cleaned up.

If that cupy use-case seems somewhat plausible, then I think we should go with the PyObject approach.
If it doesn't seem plausible, then I think the `ctx` approach is acceptable, and we should consider declaring our struct as

```c
struct { PyMemAllocatorEx allocator; char const *name; };
```

to reuse the existing python API unless there's a reason not to.


On Tue, 11 May 2021 at 04:58, Matti Picus <matti.picus@gmail.com> wrote:
On 10/5/21 8:43 pm, Sebastian Berg wrote:

> But right now the proposal says this is static, and I honestly don't
> see much reason for it to be freeable?

I think this is the crux of the issue. The current design is for a
singly-allocated struct to be passed around since it is just an
aggregate of functions. If someone wants a different strategy (i.e.
different alignment) they create a new policy: there are no additional
parameters or data associated with the struct. I don't really see an ask
from possible users for anything more, and so would prefer to remain
with the simplest possible design. If the need arises in the future for
additional data, which is doubtful, I am confident we can expand this as
needed, and do not want to burden the current design with unneeded
optional features.

It would be nice to hear from some actual users whether they need the additional flexibility.

In any case I would like to resolve this quickly and get it into the
next release, so if Eric is adamant that the advanced design is needed I
will accept his proposal, since that seems easier than any of the
alternatives so far.


NumPy-Discussion mailing list