[Numpy-discussion] Proposal to accept NEP 49: Data allocation strategies

Eric Wieser wieser.eric+numpy at gmail.com
Sat May 15 08:06:32 EDT 2021


Note that PEP 445, which introduced `PyMemAllocatorEx`, specifically rejected
omitting the `ctx` argument:
https://www.python.org/dev/peps/pep-0445/#id23. That is another argument
in favor of having it.
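
To make the `ctx` point concrete, here is a minimal sketch in plain C. It is
not taken from NumPy or CPython: the `allocator` and `aligned_ctx` types and
every function name are hypothetical, and the struct only loosely mirrors the
shape of `PyMemAllocatorEx`. It assumes a C11 libc that provides
`aligned_alloc`. One pair of functions serves any alignment, because the
alignment travels in `ctx` rather than being baked into the code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical allocator table: each callback receives the opaque ctx
 * pointer as its first argument. */
typedef struct {
    void *ctx;
    void *(*malloc)(void *ctx, size_t size);
    void (*free)(void *ctx, void *ptr);
} allocator;

/* Hypothetical per-allocator state carried through ctx. */
typedef struct {
    size_t alignment; /* assumed to be a power of two */
} aligned_ctx;

static void *aligned_malloc(void *ctx, size_t size) {
    size_t align = ((aligned_ctx *)ctx)->alignment;
    if (size > SIZE_MAX - align) {
        return NULL; /* rounding up would overflow */
    }
    /* C11 aligned_alloc expects the size to be a multiple of the alignment. */
    size_t rounded = (size + align - 1) / align * align;
    return aligned_alloc(align, rounded ? rounded : align);
}

static void aligned_free(void *ctx, void *ptr) {
    (void)ctx; /* unused here, but available for bookkeeping */
    free(ptr);
}

/* Two allocators with different alignments share the same two functions. */
void demo_two_alignments(void) {
    aligned_ctx c64 = {64}, c4096 = {4096};
    allocator a64 = {&c64, aligned_malloc, aligned_free};
    allocator a4096 = {&c4096, aligned_malloc, aligned_free};

    void *p = a64.malloc(a64.ctx, 100);
    void *q = a4096.malloc(a4096.ctx, 100);
    assert(p != NULL && (uintptr_t)p % 64 == 0);
    assert(q != NULL && (uintptr_t)q % 4096 == 0);

    a64.free(a64.ctx, p);
    a4096.free(a4096.ctx, q);
}
```

Without the `ctx` parameter, each alignment would need its own copy of
`aligned_malloc` with the value hard-coded, or a global variable shared by
every allocator.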

I'll try to give a more thorough justification for the pyobject / capsule
suggestion in another message in the next few days.



On Thu, 13 May 2021 at 17:06, eliaskoromilas <elias.koromilas at gmail.com>
wrote:

> Eric Wieser wrote
> >> Yes, sorry, had been a while since I had looked it up:
> >>
> >> https://docs.python.org/3/c-api/memory.html#c.PyMemAllocatorEx
> >
> > That `PyMemAllocatorEx` looks almost exactly like one of the two variants
> > I was proposing. Is there a reason for wanting to define our own structure
> > vs just using that one?
> > I think the NEP should at least offer a brief comparison to that
> > structure, even if we ultimately end up not using it.
> >
> >> I have to say it feels a bit
> >> like exposing things publicly, that are really mainly used internally,
> >> but not sure...  Presumably Python uses the `ctx` for something though.
> >
> > I'd argue `ctx` / `baton` / `user_data` arguments are an essential part
> > of any C callback API.
> > I can't find any particularly good reference for this right now, but I
> > have been bitten multiple times by C APIs that forget to add this argument.
> >
> >> If someone wants a different strategy (i.e. different alignment) they
> >> create a new policy
> >
> > The crux of the problem here is that without very nasty hacks, C and C++
> > do not allow new functions to be created at runtime.
> > This makes it very awkward to write a parameterizable allocator. If you
> > want to create two aligned allocators with different alignments, and you
> > don't have a `ctx` argument to plumb through that alignment information,
> > you're forced to write the entire thing twice.
>
> The `PyMemAllocatorEx` memory API will allow (lambda) closure-like
> definition of the data mem routines. That's the main idea behind the `ctx`
> thing: it's huge and will enable every allocation scenario.
>
> In my opinion, the rest of the proposals (PyObjects, PyCapsules, etc.) are
> secondary and could be considered out-of-scope. I would suggest letting
> people use this before hiding it behind a strict API.
>
> Let me also give you some insight into how we plan to do it, since we are the
> first to integrate this in production code. Considering this NEP as a
> primitive API, I developed a new project to address our requirements:
>
> 1. Provide a Python-native way to define a new numpy allocator
> 2. Accept data mem routine symbols (function pointers) from open dynamic
> libraries
> 3. Allow local-scoped allocation, e.g. inside a `with` statement
>
> But since there was not much fun in these, I thought it would be nice if we
> could exploit `ctypes` callback functions to let developers hook into
> such routines natively (e.g. for debugging/monitoring), or even write them
> entirely in Python (of course there has to be an underlying memory
> allocation API).
>
> For example, the idea is to be able to define a page-aligned allocator in
> ~30 lines of Python code, like this:
>
>
> https://github.com/inaccel/numpy-allocator/blob/master/test/aligned_allocator.py
>
> ---
>
> While experimenting with this project I spotted the two following issues:
>
> 1. Thread-locality
> My biggest concern is the global scope of the numpy `current_allocator`
> variable. Currently, an allocator change is applied globally, affecting
> every thread. This behavior breaks the local-scoped allocation promise of
> my project. Imagine for example the implications of allocating pinned
> (page-locked) memory (since you mention this use-case a lot) for random
> glue-code ndarrays in background threads.
>
> 2. Allocator context (already discussed)
> I found a bug when I tried to use a Python callback (`ctypes.CFUNCTYPE`)
> for the `PyDataMem_FreeFunc` routine. Since there are cases in which the
> `free` routine is invoked after a PyErr has occurred (to clean up internal
> arrays for example), `ctypes` messes with the exception state badly. This
> problem can be resolved with the use of a `ctx` (allocator context) that
> will allow the routines to run clean of errors, wrapping them like this:
>
> ```
> static void wrapped_free(void *ptr, size_t size, void *ctx) {
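>         /* Save and clear any pending exception before calling back. */
>         PyObject *type;
>         PyObject *value;
>         PyObject *traceback;
>         PyErr_Fetch(&type, &value, &traceback);
>         ((PyDataMem_Context *) ctx)->free(ptr, size);
>         /* Reinstate the saved exception for the caller. */
>         PyErr_Restore(type, value, traceback);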
> }
> ```
>
> Note: This bug doesn't affect `CDLL` members (CFuncPtr objects), since they
> are pure `dlsym` pointers.
>
> Of course, this is a simple case of how a `ctx` could be useful for an
> allocation policy. I guess people can become very creative with this in
> general.
>
> Elias