[Numpy-discussion] Starting work on ufunc rewrite

Nathaniel Smith njs at pobox.com
Sat Apr 2 21:12:03 EDT 2016

On Thu, Mar 31, 2016 at 1:00 PM, Jaime Fernández del Río
<jaime.frio at gmail.com> wrote:
> I have started discussing with Nathaniel the implementation of the ufunc ABI
> break that he proposed in a draft NEP a few months ago:
> http://thread.gmane.org/gmane.comp.python.numeric.general/61270
> His original proposal was to make the public portion of PyUFuncObject be:
>     typedef struct {
>         PyObject_HEAD
>         int nin, nout, nargs;
>     } PyUFuncObject;
> Of course the idea is that internally we would use a much larger struct that
> we could change at will, as long as its first few entries matched those of
> PyUFuncObject. My problem with this, and I may very well be missing
> something, is that in PyUFunc_Type we need to set the tp_basicsize to the
> size of the extended struct, so we would end up having to expose its
> contents.

How so? tp_basicsize tells you the size of the real struct, but that
doesn't let you actually access any of its fields. Unless you decide
to start cheating and reaching into random bits of memory by hand,
but, well, this is C, we can't really prevent that :-).

> This is somewhat similar to what now happens with PyArrayObject:
> anyone can #include "ndarraytypes.h", cast PyArrayObject* to
> PyArrayObjectFields*, and access the guts of the struct without using the
> supplied API inline functions. Not the end of the world, but if you want to
> make something private, you might as well make it truly private.

Yeah, there is also an issue here where we don't always do a great job
of separating our internal headers from our public headers. But that's
orthogonal -- any solution for hiding PyUFunc's internals will require
handling that somehow.

> I think it would be better to have something similar to what NpyIter does::
>     typedef struct {
>         PyObject_HEAD
>         NpyUFunc *ufunc;
>     } PyUFuncObject;

A few points:

We have to leave nin, nout, nargs where they are in PyUFuncObject,
because there is code out there that accesses them.

This technique is usually used when you want to allow subclassing of a
struct, while also allowing you to add fields later without breaking
ABI. We don't want to allow subclassing of PyUFunc (regardless of what
happens here -- subclassing just creates tons of problems), so AFAICT
it isn't really necessary. It adds a bit of extra complexity (two
allocations instead of one, extra pointer chasing, etc.), though to be
fair the hidden struct approach also adds some complexity (you have to
cast to the internal type), so it's not a huge deal either way.

If the NpyUFunc pointer field is public then in principle people could
refer to it and create problems down the line in case we ever decided
to switch to a different strategy... not very likely given that it'd
just be a meaningless opaque pointer, but I'm mentioning it for
completeness's sake.

> where NpyUFunc would, at this level, be an opaque type of which nothing
> would be known. We could have some of the NpyUFunc attributes cached on the
> PyUFuncObject struct for easier access, as is done in NewNpyArrayIterObject.

Caching sounds like *way* more complexity than we want :-). As soon as
you have two copies of data then they can get out of sync...

> This would also give us more liberty in making NpyUFunc be whatever we want
> it to be, including a variable-sized memory chunk that we could use and
> access at will.

Python objects are allowed to be variable-sized: tp_basicsize is only
the minimum size, and built-ins like lists and strings have variable
size.
> NpyIter is again a good example, where rather than storing
> pointers to strides and dimensions arrays, these are made part of the
> NpyIter memory chunk, effectively being equivalent to having variable sized
> arrays as part of the struct. And I think we will probably no longer trigger
> the Cython warnings about size changes either.
> Any thoughts on this approach? Is there anything fundamentally wrong with
> what I'm proposing here?

Modulo the issue with nin/nout/nargs, I don't think it makes a huge
difference either way. I don't see any compelling advantages to your
proposal given our particular situation, but maybe I'm missing
something.

> Also, this is probably going to end up being a rewrite of a pretty large and
> complex codebase. I am not sure that working on this on my own and
> eventually sending a humongous PR is the best approach. Any thoughts on how
> best to handle turning this into a collaborative, incremental effort? Anyone
> who would like to join in the fun?

I'd strongly recommend breaking it up into individually mergeable
pieces to the absolute maximum extent possible, and merging them back
as we go, so that we never have a giant branch diverging from master.
(E.g., refactor a few functions -> submit a PR -> merge, refactor some
more -> merge, add a new feature enabled by the refactoring -> merge,
repeat). There are limits to how far you can take this, e.g. the PR
for just hiding the current API + adding back the public API pieces
that Numba needs will itself be not quite trivial even if we do no
refactoring yet, and until we get more of an outline for where we're
trying to get to it will be hard to tell how to break it into pieces
:-). But once things are hidden it should be possible to do quite a
bit of internal rearranging incrementally on master, I hope?

For coordinating this though it would probably be good to start
working on some public notes (gdocs or the wiki or something) where we
sketch out some overall plan, make a plan of attack for how to break
it up, etc., and maybe have some higher-bandwidth conversations to
make that outline (google hangout?).


Nathaniel J. Smith -- https://vorpus.org
