[Numpy-discussion] defining a NumPy API standard?

Ralf Gommers ralf.gommers at gmail.com
Mon Jul 15 18:54:54 EDT 2019


On Sat, Jul 13, 2019 at 12:48 AM Mark Mikofski <mikofski at berkeley.edu>
wrote:

> This slide deck from Matthew Rocklin at SciPy 2019 might be relevant:
> https://matthewrocklin.com/slides/scipy-2019#/
>

That was a very nice talk indeed. It's also up on YouTube, worth watching:
https://www.youtube.com/watch?v=Q0DsdiY-jiw

I've also put a 0.0.1 version of RNumPy ("restricted NumPy") up on PyPI
(mostly to reserve the name, but it's usable). The README and __init__.py
docstring, plus the package itself (https://github.com/Quansight-Labs/rnumpy),
should give a better sense of the ideas we were discussing in this thread.

Cheers,
Ralf




> On Tue, Jun 4, 2019 at 12:06 AM Ralf Gommers <ralf.gommers at gmail.com>
> wrote:
>
>>
>>
>> On Mon, Jun 3, 2019 at 7:56 PM Sebastian Berg <sebastian at sipsolutions.net>
>> wrote:
>>
>>> On Sun, 2019-06-02 at 08:42 +0200, Ralf Gommers wrote:
>>> >
>>> >
>>> <snip>
>>> > > >
>>> > >
>>> > > This sounds like a restructuring or factorization of the API, in
>>> > > order to make it smaller, and thus easier to learn and use.
>>> > > It may start with the docs, by paying more attention to the "core"
>>> > > or important functions and methods, and noting the deprecated, or
>>> > > not frequently used, or not important functions. This could also
>>> > > help the satellite projects, which use NumPy API as an example, and
>>> > > may also be influenced by them and their decisions.
>>> > >
>>> >
>>> >  Indeed. It will help restructure our docs. Perhaps not the reference
>>> > guide (not sure yet), but definitely the user guide and other high-
>>> > level docs we (or third parties) may want to create.
>>> >
>>>
>>> Trying to follow the discussion, there seem to be various ideas? Do I
>>> understand it right that the original proposal was much like doing a
>>> list of:
>>>
>>>   * np.ndarray.cumprod: low importance -> prefer np.multiply.accumulate
>>>   * np.ravel_multi_index: low importance, but distinct feature
>>>
>>
>> Indeed. Certainly no more than that was my idea.
>>
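A list like the one quoted above could be kept as plain data and rendered back into the same one-line form. The following is only a minimal sketch: the `ApiEntry` class, its field names, and the "high"/"low" labels are hypothetical and not part of NumPy or of any proposal in this thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ApiEntry:
    """One row of a hypothetical importance list (illustrative only)."""
    name: str                        # name as written in the list, e.g. "np.ravel_multi_index"
    importance: str                  # "high" or "low"; the labels are an assumption
    preferred: Optional[str] = None  # suggested alternative, if any
    note: str = ""                   # free-form remark


# The two examples given in the quoted message.
ENTRIES = [
    ApiEntry("np.ndarray.cumprod", "low", preferred="np.multiply.accumulate"),
    ApiEntry("np.ravel_multi_index", "low", note="distinct feature"),
]


def render(entry):
    """Format an entry in the same style as the bullet list in the message."""
    line = f"* {entry.name}: {entry.importance} importance"
    if entry.preferred:
        line += f" -> prefer {entry.preferred}"
    if entry.note:
        line += f", but {entry.note}"
    return line


for entry in ENTRIES:
    print(render(entry))
```

Keeping the list as data rather than prose would make it straightforward to regroup entries later (for example into "transpose-like" or "reshape-like" sets) without rewriting them.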
>>
>>> Maybe with added groups such as "transpose-like" and "reshape-like"
>>> functions?
>>> This would be based on 1. "Experience" and 2. usage statistics. This
>>> seems mostly a task for 2-3 people to then throw out there for
>>> discussion.
>>> There will be some very difficult/impossible calls, since in the end
>>> Nathaniel is right, we do not quite know the question we want to
>>> answer. But for a huge part of the API it may not be problematic?
>>>
>>
>> Agreed, won't be problematic.
>>
>>
>>>
>>> Then there is an idea of providing better mixins (and tests).
>>> This could be made easier by the first idea, for prioritization.
>>> Although, the first idea is probably not really necessary to kick this
>>> off at all. To me, the interesting part is probably how best to solve
>>> testing of the mixins and numpy-api-duplicators in general.
>>>
>>> Implementing a growing set of mixins seems fairly straightforward
>>> (although maybe much easier to approach if there is a list from
>>> the first project)?
>>
>>
>> Indeed. I think there are actually 3 levels here (at least):
>> 1. function name: high/low importance or some such simple classification
>> 2. function signature and behavior: is the behavior optimal, what would
>> we change, etc.
>> 3. making duck arrays and subclasses that rely on all those functions and
>> their behavior easier to implement/use
>>
>> Mixins are a specific answer to (3). And it's unclear if they're the best
>> answer (could be, I don't know - please don't start a discussion on that
>> here). Either way, working on (3) will be helped by having a better sense
>> of (1) and (2).
>>
>> Also think about effort: (2) is at least an order of magnitude more work
>> than (1), and (3) likely even more work than (2).
>>
>>
>>> And, once we have a start, maybe we can rely on the
>>> array-like implementors to be the main developers (limiting us mostly
>>> to review).
>>>
>>>
>>> The last part would probably be for users and consumers of array-likes.
>>> This largely overlaps, but comes closer to the problem of "standard".
>>> If we have a list of functions that we tend to see as more or less
>>> important, it may be interesting for downstream projects to restrict
>>> themselves to simplify interoperability e.g. with dask.
>>>
>>> Maybe we do not have to draw a strict line though? How plausible would
>>> it be to set up a list (best auto-updating) saying nothing but:
>>>
>>> `np.concatenate` supported by: dask, jax, cupy
>>>
>>
>> That's probably not that hard, and I agree it would be quite useful. The
>> namespaces of those libraries are probably not all the same, but with
>> dir() and some strings and lists you'll get a long way here I think.
>>
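The dir()-based approach mentioned above might look roughly like the sketch below. The helper names (`public_names`, `support_table`, `format_line`) are made up for illustration, and the module paths in the usage note are assumptions about where each library keeps its NumPy-like namespace. The check only compares names: it says nothing about whether signatures or behavior match.

```python
import importlib


def public_names(module_name):
    """Return the public names in a module's namespace, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return {name for name in dir(module) if not name.startswith("_")}


def support_table(reference, candidates):
    """Map each public name in `reference` to the candidate modules that also expose it."""
    ref_names = public_names(reference)
    if ref_names is None:
        raise ImportError(f"reference module {reference!r} is not installed")
    available = {}
    for module_name in candidates:
        names = public_names(module_name)
        if names is not None:  # silently skip libraries that are not installed
            available[module_name] = names
    return {
        name: sorted(mod for mod, names in available.items() if name in names)
        for name in sorted(ref_names)
    }


def format_line(alias, name, supporters):
    """Render one row in the style suggested in the thread."""
    listed = ", ".join(supporters) if supporters else "(none found)"
    return f"`{alias}.{name}` supported by: {listed}"
```

With the relevant libraries installed, something like `support_table("numpy", ["dask.array", "jax.numpy", "cupy"])` would produce the raw table, and `format_line("np", "concatenate", table["concatenate"])` one row of the list. Regenerating it on a schedule would give the "auto-updating" behavior, though a name match is only the first of the three levels discussed earlier in the thread.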
>>
>>>
>>> I am not sure if this is helpful, but it feels to me that the first
>>> part is what Ralf was thinking of? Just to kick off such a "living
>>> document".
>>
>>
>> Indeed.
>>
>> I could maybe help with providing the second pair of eyes
>>> for a first iteration there, Ralf.
>>
>>
>> Awesome, thanks Sebastian.
>>
>> Cheers,
>> Ralf
>>
>>
>> The last list I would actually find
>>> interesting myself, but not sure how easy it would be to approach it?
>>>
>>> Best,
>>>
>>> Sebastian
>>>
>>>
>>> > Ralf
>>> > _______________________________________________
>>> > NumPy-Discussion mailing list
>>> > NumPy-Discussion at python.org
>>> > https://mail.python.org/mailman/listinfo/numpy-discussion
>>>
>>
>
>
> --
> Mark Mikofski, PhD (2005)
> *Fiat Lux*
>