[Numpy-discussion] Starting work on ufunc rewrite

Jaime Fernández del Río jaime.frio at gmail.com
Fri Apr 1 16:04:24 EDT 2016


On Thu, Mar 31, 2016 at 10:14 PM, Joseph Fox-Rabinovitz <
jfoxrabinovitz at gmail.com> wrote:

> There is certainly good precedent for the approach you suggest.
> Shortly after Nathaniel mentioned the rewrite to me, I looked up
> d-pointers as a possible technique: https://wiki.qt.io/D-Pointer.
>

Yes, the idea is similar, although somewhat simpler since we are doing C,
not C++.
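
For readers unfamiliar with the pattern, here is a minimal sketch of the opaque-pointer idea in plain C. It is not NumPy code: DemoUFunc, DemoUFuncObject and the demo_ufunc_* functions are hypothetical names made up for illustration, and a plain int stands in for PyObject_HEAD. The top half plays the role of a public header and the bottom half the role of a private source file.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- what a public header would expose ---- */

/* Forward declaration only: users never see the layout.
 * Hypothetical stand-in for the proposed NpyUFunc. */
typedef struct DemoUFunc DemoUFunc;

typedef struct {
    int refcount;      /* stand-in for PyObject_HEAD (assumption) */
    DemoUFunc *ufunc;  /* private guts, reachable only through the API */
} DemoUFuncObject;

DemoUFuncObject *demo_ufunc_new(int nin, int nout);
void demo_ufunc_free(DemoUFuncObject *obj);
int demo_ufunc_nin(const DemoUFuncObject *obj);
int demo_ufunc_nout(const DemoUFuncObject *obj);
int demo_ufunc_nargs(const DemoUFuncObject *obj);

/* ---- what would live in a private source file ---- */

struct DemoUFunc {
    int nin;
    int nout;
    /* C99 flexible array member: per-operand data is stored inside the
     * same allocation instead of behind another pointer. */
    int operand_flags[];
};

DemoUFuncObject *demo_ufunc_new(int nin, int nout)
{
    assert(nin >= 0 && nout >= 0);
    size_t nargs = (size_t)nin + (size_t)nout;

    DemoUFuncObject *obj = malloc(sizeof *obj);
    if (obj == NULL) {
        return NULL;
    }
    /* One zero-initialised chunk sized for the struct plus nargs flags. */
    obj->ufunc = calloc(1, sizeof(DemoUFunc) + nargs * sizeof(int));
    if (obj->ufunc == NULL) {
        free(obj);
        return NULL;
    }
    obj->refcount = 1;
    obj->ufunc->nin = nin;
    obj->ufunc->nout = nout;
    return obj;
}

void demo_ufunc_free(DemoUFuncObject *obj)
{
    if (obj != NULL) {
        free(obj->ufunc);
        free(obj);
    }
}

int demo_ufunc_nin(const DemoUFuncObject *obj)   { return obj->ufunc->nin; }
int demo_ufunc_nout(const DemoUFuncObject *obj)  { return obj->ufunc->nout; }
int demo_ufunc_nargs(const DemoUFuncObject *obj)
{
    return obj->ufunc->nin + obj->ufunc->nout;
}
```

Since sizeof(DemoUFuncObject) no longer depends on what is inside DemoUFunc, the private struct can grow or be reordered without changing the size of the public object.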


>
> If we allow arbitrary kwargs for the new functions, is that something
> you would want to note in the public structure? I was thinking
> something along the lines of adding a hook to process additional
> kwargs and return a void * that would then be passed to the loop.
>

I'm not sure I understand what you mean... But I also don't think it is
very relevant at this point? What I intend to do is simply to hide the guts
of ufuncs, breaking everyone's code once... so that we can later change
whatever we want without breaking anything else. PyUFunc_GenericFunction
already takes *args and **kwargs, and the internal logic of how these get
processed can be modified at will. If what you are proposing is to create a
PyUFunc_FromFuncAndDataAndSignatureAndKwargProcessor API function that
would provide a customized function to process extra kwargs and somehow
pass them into the actual ufunc loop, that would just be an API extension,
and there shouldn't be any major problem in introducing that whenever,
especially once we are free to modify the internal representation of ufuncs
without breaking ABI compatibility.
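
To make the hook idea concrete, here is a rough sketch of how I read it, in plain C with no Python involved. Everything in it is hypothetical: DemoUFuncSpec, demo_call and the "scale" keyword are invented for illustration, and a single key/value pair stands in for real **kwargs. None of this is an existing NumPy API.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical hook: turns one extra keyword into an opaque blob,
 * or returns NULL if the keyword is not recognised. */
typedef void *(*demo_kwarg_processor)(const char *key, double value);

/* Hypothetical inner loop: receives the blob as its last argument. */
typedef void (*demo_loop)(const double *in, double *out, size_t n,
                          void *extra);

typedef struct {
    demo_loop loop;
    demo_kwarg_processor process_kwarg;  /* may be NULL */
} DemoUFuncSpec;

/* Example hook: accepts only "scale" and stores its value on the heap. */
void *demo_scale_processor(const char *key, double value)
{
    if (key == NULL || strcmp(key, "scale") != 0) {
        return NULL;
    }
    double *extra = malloc(sizeof *extra);
    if (extra != NULL) {
        *extra = value;
    }
    return extra;
}

/* Example loop: multiplies by the scale, defaulting to 1.0 without one. */
void demo_scale_loop(const double *in, double *out, size_t n, void *extra)
{
    double scale = (extra != NULL) ? *(const double *)extra : 1.0;
    for (size_t i = 0; i < n; i++) {
        out[i] = in[i] * scale;
    }
}

/* Dispatcher: runs the hook (if any), hands its result to the loop,
 * then releases it. */
void demo_call(const DemoUFuncSpec *spec, const double *in, double *out,
               size_t n, const char *key, double value)
{
    void *extra = NULL;
    if (spec->process_kwarg != NULL && key != NULL) {
        extra = spec->process_kwarg(key, value);
    }
    spec->loop(in, out, n, extra);
    free(extra);  /* free(NULL) is a no-op */
}
```

The point is only that the hook and the void * it returns would sit behind the API, so adding something like this later would not require another ABI break.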


> To do this incrementally, perhaps opening a special development branch
> on the main repository is in order?
>

Yes, something like that seems like the right thing to do indeed. I would
like someone with more git-fu than me to spell out the details of how we
would create and eventually merge that branch.


>
> I would love to join in the fun as time permits. Unfortunately, it is
> not especially permissive right about now. I will at least throw in
> some ideas that I have been mulling over.
>

Please do!

Jaime


>
>     -Joe
>
>
> On Thu, Mar 31, 2016 at 4:00 PM, Jaime Fernández del Río
> <jaime.frio at gmail.com> wrote:
> > I have started discussing with Nathaniel the implementation of the ufunc
> > ABI break that he proposed in a draft NEP a few months ago:
> >
> > http://thread.gmane.org/gmane.comp.python.numeric.general/61270
> >
> > His original proposal was to make the public portion of PyUFuncObject be:
> >
> >     typedef struct {
> >         PyObject_HEAD
> >         int nin, nout, nargs;
> >     } PyUFuncObject;
> >
> > Of course the idea is that internally we would use a much larger struct
> > that we could change at will, as long as its first few entries matched
> > those of PyUFuncObject. My problem with this, and I may very well be
> > missing something, is that in PyUFunc_Type we need to set the
> > tp_basicsize to the size of the extended struct, so we would end up
> > having to expose its contents. This is somewhat similar to what now
> > happens with PyArrayObject: anyone can #include "ndarraytypes.h", cast
> > PyArrayObject* to PyArrayObject_fields*, and access the guts of the
> > struct without using the supplied API inline functions. Not the end of
> > the world, but if you want to make something private, you might as well
> > make it truly private.
> >
> > I think it would be better to have something similar to what NpyIter does::
> >
> >     typedef struct {
> >         PyObject_HEAD
> >         NpyUFunc *ufunc;
> >     } PyUFuncObject;
> >
> > where NpyUFunc would, at this level, be an opaque type of which nothing
> > would be known. We could have some of the NpyUFunc attributes cached on
> > the PyUFuncObject struct for easier access, as is done in
> > NewNpyArrayIterObject. This would also give us more liberty in making
> > NpyUFunc be whatever we want it to be, including a variable-sized memory
> > chunk that we could use and access at will. NpyIter is again a good
> > example, where rather than storing pointers to strides and dimensions
> > arrays, these are made part of the NpyIter memory chunk, effectively
> > being equivalent to having variable sized arrays as part of the struct.
> > And I think we will probably no longer trigger the Cython warnings about
> > size changes either.
> >
> > Any thoughts on this approach? Is there anything fundamentally wrong with
> > what I'm proposing here?
> >
> > Also, this is probably going to end up being a rewrite of a pretty large
> > and complex codebase. I am not sure that working on this on my own and
> > eventually sending a humongous PR is the best approach. Any thoughts on
> > how best to handle turning this into a collaborative, incremental effort?
> > Anyone who would like to join in the fun?
> >
> > Jaime
> >
> > --
> > (\__/)
> > ( O.o)
> > ( > <) This is Bunny. Copy Bunny into your signature and help him with
> > his plans for world domination.
> >
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at scipy.org
> > https://mail.scipy.org/mailman/listinfo/numpy-discussion
> >



-- 
(\__/)
( O.o)
( > <) This is Bunny. Copy Bunny into your signature and help him with his
plans for world domination.

