[Numpy-discussion] NEP 31 — Context-local and global overrides of the NumPy API

Ralf Gommers ralf.gommers at gmail.com
Tue Sep 3 02:20:36 EDT 2019


On Mon, Sep 2, 2019 at 2:09 PM Nathaniel Smith <njs at pobox.com> wrote:

> On Mon, Sep 2, 2019 at 2:15 AM Hameer Abbasi <einstein.edison at gmail.com>
> wrote:
> > Me, Ralf Gommers and Peter Bell (both cc’d) have come up with a proposal
> > on how to solve the array creation and duck array problems. The solution is
> > outlined in NEP-31, currently in the form of a PR, [1]
>
> Thanks for putting this together! It'd be great to have more
> engagement between uarray and numpy.
>
> > ============================================================
> >
> > NEP 31 — Context-local and global overrides of the NumPy API
> >
> > ============================================================
>
> Now that I've read this over, my main feedback is that right now it
> seems too vague and high-level to give it a fair evaluation? The idea
> of a NEP is to lay out a problem and proposed solution in enough
> detail that it can be evaluated and critiqued, but this felt to me
> more like it was pointing at some other documents for all the details
> and then promising that uarray has solutions for all our problems.
>

This is fair enough, I think. We'll need to put some more thought into where
to refer to other NEPs, and where to be more concrete.


> > This NEP takes a more holistic approach: It assumes that there are parts
> > of the API that need to be overridable, and that these will grow over time.
> > It provides a general framework and a mechanism to avoid a design of a new
> > protocol each time this is required.
>
> The idea of a holistic approach makes me nervous, because I'm not sure
> we have holistic problems. Sometimes a holistic approach is the right
> thing; other times it means sweeping the actual problems under the
> rug, so things *look* simple and clean but in fact nothing has been
> solved, and they just end up biting us later. And from the NEP as
> currently written, I can't tell whether this is the good kind of
> holistic or the bad kind of holistic.
>

> Now I'm writing vague handwavey things, so let me follow my own advice
> and make it more concrete with an example :-).
>
> When Stephan and I were writing NEP 22, the single thing we spent the
> most time discussing was the problem of duck-array coercion, and in
> particular what to do about existing code that does
> np.asarray(duck_array_obj).
>
> The reason this is challenging is that there's a lot of code written
> in Cython/C/C++ that calls np.asarray,


Cython code only perhaps? It would surprise me if there's a lot of C/C++
code that explicitly calls into our Python rather than C API.

> and then blindly casts the
> return value to a PyArray struct and starts accessing the raw memory
> fields. If np.asarray starts returning anything besides a real-actual
> np.ndarray object, then this code will start corrupting random memory,
> leading to a segfault at best.
>
> Stephan felt strongly that this meant that existing np.asarray calls
> *must not* ever return anything besides an np.ndarray object, and
> therefore we needed to add a new function np.asduckarray(), or maybe
> an explicit opt-in flag like np.asarray(..., allow_duck_array=True).
>
> I agreed that this was a problem, but thought we might be able to get
> away with an "opt-out" system, where we add an allow_duck_array= flag,
> but make it *default* to True, and document that the Cython/C/C++
> users who want to work with a raw np.ndarray object should modify
> their code to explicitly call np.asarray(obj, allow_duck_array=False).
> This would mean that for a while people who tried to pass duck-arrays
> into a legacy library would get segfaults, but there would be a clear
> path for fixing these issues as they were discovered.
>
> Either way, there are also some other details to figure out: how does
> this affect the C version of asarray? What about np.asfortranarray –
> probably that should default to allow_duck_array=False, even if we did
> make np.asarray default to allow_duck_array=True, right?
>
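To make the opt-in versus opt-out distinction above concrete, here is a minimal sketch. It is an illustration only: `asarray_sketch` and `DuckArray` are hypothetical names, `allow_duck_array` is the flag floated in the thread and not a real NumPy parameter, and the standard library's `array.array` stands in for `np.ndarray` so the sketch runs without NumPy.

```python
from array import array


class DuckArray:
    """Hypothetical duck array: array-like, but not a concrete array object."""

    def __init__(self, values):
        self.values = list(values)


def asarray_sketch(obj, allow_duck_array=False):
    """Toy coercion function; array('d') is a stand-in for np.ndarray.

    allow_duck_array mirrors the flag discussed in the thread and is an
    assumption of this sketch, not an existing NumPy argument.
    """
    if isinstance(obj, array):
        return obj  # already a concrete array, nothing to do
    if isinstance(obj, DuckArray):
        if allow_duck_array:
            return obj  # caller accepted duck arrays: pass it through
        return array("d", obj.values)  # caller needs the concrete type
    return array("d", obj)  # plain sequences are always converted


duck = DuckArray([1, 2, 3])
strict = asarray_sketch(duck)                         # concrete array
loose = asarray_sketch(duck, allow_duck_array=True)   # the duck array itself
```

With the default set to `False`, as here, legacy callers keep receiving the concrete type (the opt-in variant). Flipping the default to `True` gives the opt-out variant, where legacy callers would have to pass `allow_duck_array=False` themselves.
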
> Now if I understand right, your proposal would be to make it so any
> code in any package could arbitrarily change the behavior of
> np.asarray for all inputs, e.g. I could just decide that
> np.asarray([1, 2, 3]) should return some arbitrary non-np.ndarray
> object.


No, definitely not! It's all opt-in, by explicitly importing from
`numpy.overridable` or `unumpy`. No behavior of anything in the existing
numpy namespaces should be affected in any way.
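
A rough sketch of what "opt-in" means here, under stated assumptions: `overridable_asarray` and `use_backend` are hypothetical names invented for this illustration, not the API of `numpy.overridable` or `unumpy`, and `array.array` again stands in for `np.ndarray`. The override state is a plain module-level variable for brevity, so this shows only the separation of namespaces, not a context-local mechanism.

```python
from array import array
from contextlib import contextmanager


def asarray(obj):
    """Stand-in for the existing function: always returns a concrete array."""
    return obj if isinstance(obj, array) else array("d", obj)


_backend = None  # hypothetical override slot, consulted only by the opt-in function


@contextmanager
def use_backend(backend):
    """Temporarily install a backend callable (hypothetical helper)."""
    global _backend
    previous, _backend = _backend, backend
    try:
        yield
    finally:
        _backend = previous  # restore whatever was active before


def overridable_asarray(obj):
    """Opt-in counterpart: only callers of this function see overrides."""
    if _backend is not None:
        return _backend(obj)
    return asarray(obj)


with use_backend(tuple):
    plain = asarray([1, 2, 3])                    # unaffected by the backend
    overridden = overridable_asarray([1, 2, 3])   # routed through the backend
restored = overridable_asarray([1, 2, 3])         # backend no longer active
```

Code that keeps calling the original `asarray` never sees a non-array return value; only code that deliberately calls the overridable variant does.
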

I agree with the concerns below, hence it should stay opt-in.

Cheers,
Ralf

> It seems like this has a much greater potential for breaking
> existing Cython/C/C++ code, and the NEP doesn't currently describe why
> this extra power is useful, and it doesn't currently describe how it
> plans to mitigate the downsides. (For example, if a caller needs a
> real np.ndarray, then is there some way to explicitly request one? The
> NEP doesn't say.) Maybe this is all fine and there are solutions to
> these issues, but any proposal to address duck array coercion needs to
> at least talk about these issues!
>
> And that's just one example... array coercion is a particularly
> central and tricky problem, but the numpy API is big, and there are
> probably other problems like this. For another example, I don't
> understand what the NEP is proposing to do about dtypes at all.
>
> That's why I think the NEP needs to be fleshed out a lot more before
> it will be possible to evaluate fairly.
>
> -n
>
> --
> Nathaniel J. Smith -- https://vorpus.org