[Numpy-discussion] Proposal to accept NEP 49: Data allocation strategies

Sebastian Berg sebastian at sipsolutions.net
Tue May 11 17:46:19 EDT 2021


On Tue, 2021-05-11 at 09:54 +0100, Eric Wieser wrote:
> > Yes, sorry, had been a while since I had looked it up:
> > 
> > https://docs.python.org/3/c-api/memory.html#c.PyMemAllocatorEx
> 
> That `PyMemAllocatorEx` looks almost exactly like one of the two
> variants I
> was proposing. Is there a reason for wanting to define our own
> structure vs
> just using that one?
> I think the NEP should at least offer a brief comparison to that
> structure,
> even if we ultimately end up not using it.
> 
> > That all looks like it can be customized in theory. But I am not
> > sure
> > that it is practical, except for hooking and calling the previous
> > one.
> 
> Is chaining allocators not likely something we want to support too?
> For
> instance, an allocator that is used for large arrays, but falls back
> to the
> previous one for small arrays?
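
For what it's worth, here is a minimal sketch of what such chaining could look like if every hook receives a `ctx`. All names here (`demo_allocator`, `chain_ctx`, `counting_malloc`, ...) are made up for illustration and are not the NEP's API. I am also assuming that the free hook is told the allocation size, so it can pick the same allocator again.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical handler shape: each hook receives an opaque ctx. */
typedef struct {
    void *ctx;
    void *(*malloc_fn)(void *ctx, size_t size);
    void (*free_fn)(void *ctx, void *ptr, size_t size);
} demo_allocator;

/* State of the chaining allocator: a size threshold and two targets. */
typedef struct {
    size_t threshold;                /* sizes >= threshold go to `large` */
    const demo_allocator *large;
    const demo_allocator *fallback;  /* the previously installed allocator */
} chain_ctx;

static const demo_allocator *chain_pick(const chain_ctx *c, size_t size)
{
    return size >= c->threshold ? c->large : c->fallback;
}

static void *chain_malloc(void *ctx, size_t size)
{
    const demo_allocator *a = chain_pick(ctx, size);
    return a->malloc_fn(a->ctx, size);
}

/* Assumption: free is passed the original size, so the choice is repeatable. */
static void chain_free(void *ctx, void *ptr, size_t size)
{
    const demo_allocator *a = chain_pick(ctx, size);
    a->free_fn(a->ctx, ptr, size);
}

/* Toy target allocator: ctx points to a size_t counting malloc calls. */
static void *counting_malloc(void *ctx, size_t size)
{
    ++*(size_t *)ctx;
    return malloc(size);
}

static void counting_free(void *ctx, void *ptr, size_t size)
{
    (void)ctx;
    (void)size;
    free(ptr);
}
```

Without a `ctx`, the threshold and the two targets would have to live in globals, and a second chain with a different threshold would need its own copy of these functions.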
> 
> > I have to say it feels a bit
> > like exposing things publicly, that are really mainly used
> > internally,
> > but not sure...  Presumably Python uses the `ctx` for something
> > though.
> 
> I'd argue `ctx` / `baton` / `user_data` arguments are an essential
> part of
> any C callback API.
> I can't find any particularly good reference for this right now, but
> I have
> been bitten multiple times by C APIs that forget to add this
> argument.


Can't argue with that :).
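
To make the point concrete for myself, a rough sketch of two aligned allocators that share one implementation and differ only in their `ctx`. The struct is loosely modelled on `PyMemAllocatorEx` but is hypothetical (as are all the names); it is not what the NEP currently proposes.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical handler: hooks take an opaque ctx as first argument. */
typedef struct {
    void *ctx;
    void *(*malloc_fn)(void *ctx, size_t size);
    void (*free_fn)(void *ctx, void *ptr);
} demo_allocator;

/* Per-allocator state. Assumed: a power of two, >= sizeof(void *). */
typedef struct {
    size_t alignment;
} aligned_ctx;

static void *aligned_malloc(void *ctx, size_t size)
{
    size_t align = ((aligned_ctx *)ctx)->alignment;
    if (size > SIZE_MAX - align - sizeof(void *)) {
        return NULL;  /* request would overflow */
    }
    /* Over-allocate: room to align plus a slot for the raw pointer. */
    void *raw = malloc(size + align + sizeof(void *));
    if (raw == NULL) {
        return NULL;
    }
    uintptr_t start = (uintptr_t)raw + sizeof(void *);
    uintptr_t aligned = (start + align - 1) & ~(uintptr_t)(align - 1);
    ((void **)aligned)[-1] = raw;  /* remembered for aligned_free */
    return (void *)aligned;
}

static void aligned_free(void *ctx, void *ptr)
{
    (void)ctx;
    if (ptr != NULL) {
        free(((void **)ptr)[-1]);
    }
}

/* Two allocators, one implementation: only the ctx differs. */
static aligned_ctx ctx64 = {64};
static aligned_ctx ctx4096 = {4096};
static demo_allocator alloc64 = {&ctx64, aligned_malloc, aligned_free};
static demo_allocator alloc4096 = {&ctx4096, aligned_malloc, aligned_free};
```

With hooks that take no `ctx`, the alignment would have to be baked into each function, so every alignment needs its own copy of the malloc hook.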

Personally, my main remaining concern is that we keep some way to
modify/extend this in the future (even a clunky one seems fine).
Beyond that, I don't care all that much.  Passing a context feels right
to me, but I also don't know that we need it.


Using PyObject still feels a bit much, but I am also not opposed.  I
guess for future extension, we would have to subclass ourselves and/or
include an ABI version number (if just to avoid `PyObject_TypeCheck`
calls to figure out which ABI version we got).


Otherwise, either allocating the struct or including a version number
(or reserved space) in the struct/PyObject is probably good enough to
ensure we have a path for modifying/extending the ABI.
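
Just to sketch the kind of thing I have in mind (hypothetical names throughout, and showing both the version field and the reserved space together even though either one might be enough):

```c
#include <stddef.h>
#include <stdlib.h>

#define DEMO_HANDLER_VERSION 1  /* hypothetical ABI version of this build */

typedef struct {
    int version;                /* first member, readable for any version */
    void *(*malloc_fn)(size_t size);
    void (*free_fn)(void *ptr, size_t size);
    void *reserved[4];          /* must be zeroed; room for future members */
} demo_handler;

/* Return 0 if this build understands the handler, -1 otherwise. */
static int demo_check_handler(const demo_handler *h)
{
    if (h == NULL || h->version < 1 || h->version > DEMO_HANDLER_VERSION) {
        return -1;
    }
    return 0;
}

/* Trivial hooks, only so that a handler can be filled in. */
static void *demo_malloc(size_t size)
{
    return malloc(size);
}

static void demo_free(void *ptr, size_t size)
{
    (void)size;
    free(ptr);
}
```

A later version could then add members in place of the reserved slots and bump the version, and the check would be a plain integer comparison rather than a type check.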


I hope that the actual end-users can chip in and clear it up a bit...

Cheers,

Sebastian



> 
> > If someone wants a different strategy (i.e. different alignment)
> > they create a new policy
> 
> The crux of the problem here is that without very nasty hacks, C and
> C++ do
> not allow new functions to be created at runtime.
> This makes it very awkward to write a parameterizable allocator. If
> you
> want to create two aligned allocators with different alignments, and
> you
> don't have a `ctx` argument to plumb through that alignment
> information,
> you're forced to write the entire thing twice.
> 
> > I guess the C++ similarity may be a reason, but I am not familiar
> > with that.
> 
> Similarity isn't the only motivation - I was considering
> compatibility.
> Consider a user who's already written a shiny stateful C++ allocator,
> and
> wants to use it with numpy.
> I've made a gist at
> https://gist.github.com/eric-wieser/6d0fde53fc1ba7a2fa4ac208467f2ae5
> which demonstrates how to hook an arbitrary C++ allocator into this new
> numpy allocator API, and which compares the NEP version with the version
> that adds a `ctx` argument.
> The NEP version has a bug that is very hard to fix without
> duplicating the
> entire `numpy_handler_from_cpp_allocator` function.
> 
> If compatibility with C++ seems too much of a stretch, the NEP API is
> not
> even compatible with `PyMemAllocatorEx`.
> 
> > But right now the proposal says this is static, and I honestly
> > don't
> > see much reason for it to be freeable?  The current use-cases
> > `cupy` or
> > `pnumpy` don't seem to need it.
> 
> I don't know much about either of these use cases, so the following
> is
> speculative.
> In cupy, presumably the application is to tie allocation to a
> specific GPU
> device.
> Presumably then, somewhere in the python code there is a handle to a
> GPU
> object, through which the allocators operate.
> If that handle is stored in the allocator, and the allocator is
> freeable,
> then it is possible to write code that automatically releases the GPU
> handle after the allocator has been restored to the default and the
> last
> array using it is cleaned up.
> 
> If that cupy use-case seems somewhat plausible, then I think we should
> go
> with the PyObject approach.
> If it doesn't seem plausible, then I think the `ctx` approach is
> acceptable, and we should consider declaring our struct
> ```struct { PyMemAllocatorEx allocator; char const *name; }``` to
> reuse the
> existing python API unless there's a reason not to.
> 
> Eric
> 
> 
> 
> 
> On Tue, 11 May 2021 at 04:58, Matti Picus <matti.picus at gmail.com>
> wrote:
> 
> > On 10/5/21 8:43 pm, Sebastian Berg wrote:
> > 
> > > But right now the proposal says this is static, and I honestly
> > > don't
> > > see much reason for it to be freeable?
> > 
> > 
> > I think this is the crux of the issue. The current design is for a
> > singly-allocated struct to be passed around since it is just an
> > aggregate of functions. If someone wants a different strategy (i.e.
> > different alignment) they create a new policy: there are no
> > additional
> > parameters or data associated with the struct. I don't really see
> > an ask
> > from possible users for anything more, and so would prefer to
> > remain
> > with the simplest possible design. If the need arises in the future
> > for
> > additional data, which is doubtful, I am confident we can expand
> > this as
> > needed, and do not want to burden the current design with unneeded
> > optional features.
> > 
> > 
> > It would be nice to hear from some actual users if they need the
> > flexibility.
> > 
> > 
> > In any case I would like to resolve this quickly and get it into
> > the
> > next release, so if Eric is adamant that the advanced design is
> > needed I
> > will accept his proposal, since that seems easier than any of the
> > alternatives so far.
> > 
> > 
> > Matti
> > 
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at python.org
> > https://mail.python.org/mailman/listinfo/numpy-discussion
> > 



