[Numpy-discussion] defining a NumPy API standard?

Sat Jun 1 18:34:38 EDT 2019

On Sat, Jun 1, 2019 at 1:05 PM Ralf Gommers <ralf.gommers at gmail.com> wrote:
> I think this is potentially useful, but *far* more prescriptive and detailed than I had in mind. Both you and Nathaniel seem to have not understood what I mean by "out of scope", so I think that's my fault in not being explicit enough. I *do not* want to prescribe behavior. Instead, a simple yes/no for each function in numpy and method on ndarray.

So yes/no are the answers. But what's the question?

"If we were redesigning numpy in a fantasy world without external
constraints or compatibility issues, would we include this function?"
"Is this function well designed?"
"Do we think that supporting this function is necessary to achieve
practical duck-array compatibility?"
"If someone implements this function, should we give them a 'numpy
core compliant!' logo to put on their website?"
"Do we recommend that people use this function in new code?"
"If we were trying to design a minimal set of primitives and implement
the rest of numpy in terms of them, then is this function a good
candidate for a primitive?"

These are all really different things, and useful for solving
different problems... I feel like you might be lumping them together
some?

Also, I'm guessing there are a bunch of functions where you think part
of the interface is fine and part of the interface is broken. (E.g.
dot's behavior on high-dimensional arrays.) Do you think this "one
bool per function" structure will be fine-grained enough for what you
want to do?
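
To make the dot example concrete, here's a small sketch of the kind of split I mean. The array shapes are arbitrary, picked only for illustration; the point is that the 2-d case is uncontroversial while the N-d case does something most people don't expect:

```python
import numpy as np

# Arbitrary illustrative shapes: a "stack" of two 3x4 and two 4x5 matrices.
a = np.ones((2, 3, 4))
b = np.ones((2, 4, 5))

# np.dot contracts the last axis of `a` with the second-to-last axis of `b`
# and keeps every remaining axis of both operands.
dot_shape = np.dot(a, b).shape

# np.matmul treats the leading axes as batch dimensions and broadcasts them.
matmul_shape = np.matmul(a, b).shape

print(dot_shape)     # (2, 3, 2, 5)
print(matmul_shape)  # (2, 3, 5)
```

A single yes/no for dot has to cover both the 2-d behavior and the N-d behavior above.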

> Two other thoughts:
> 1. NumPy is not done. Our thinking on how to evolve the NumPy API is fairly muddled. When new functions are proposed, it's decided on a case-by-case basis, usually without a guiding principle. We need to improve that. A "core of NumPy" list could be a part of that puzzle.

I think we do have some rough consensus principles on what's in scope
and what isn't in scope for numpy, but yeah, articulating them more
clearly could be useful. Stuff like "output types and shape should be
predictable from input types and shape", "numpy's core
responsibilities are the array/dtype/ufunc interfaces, and providing a
lingua franca for python numerical libraries to interoperate" (and
therefore: "if it can live outside numpy it probably should"), etc.
I'm seeing this as a living document (a NEP?) that tries to capture
some rules of thumb and that we update as we go. That seems pretty
different to me than a long list of yes/no checkboxes though?
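
As a rough illustration of the "predictable from input types and shape" rule of thumb, here's a sketch. The choice of np.add and np.nonzero as the two examples is mine, purely for illustration, and isn't meant as a ruling on which functions would pass:

```python
import numpy as np

x = np.zeros((3, 1), dtype=np.int32)
y = np.zeros((1, 4), dtype=np.float64)

# For np.add, shape and dtype can be worked out without looking at any values.
predicted_shape = np.broadcast_shapes(x.shape, y.shape)
predicted_dtype = np.result_type(x.dtype, y.dtype)
out = np.add(x, y)
add_is_predictable = (out.shape == predicted_shape
                      and out.dtype == predicted_dtype)

# For np.nonzero, two inputs with identical shape and dtype give
# differently shaped outputs, because the result depends on the values.
n1 = np.nonzero(np.array([0, 1, 0]))[0].shape
n2 = np.nonzero(np.array([1, 1, 0]))[0].shape

print(add_is_predictable)  # True
print(n1, n2)              # (1,) (2,)
```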

> 2. We often argue about deprecations. Deprecations are costly, but so is keeping around functions that are not very useful or have a poor design. This may offer a middle ground. Don't let others repeat our mistakes, signal to users that a function is of questionable value, without breaking already written code.

The idea has come up a few times of having a "soft deprecation" level,
where we put a warning in the docs but not in the code. It seems like
a reasonable idea to me. It's inherently a kind of case-by-case thing
that can be done incrementally. But, if someone wants to
systematically work through all the docs and do the case-by-case
analysis, that also seems like a reasonable idea to me. I'm not sure
if that's the same as your proposal or not.
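
For what it's worth, here's a minimal sketch of what "a warning in the docs but not in the code" could look like mechanically. The soft_deprecated decorator, old_thing, and new_thing are all hypothetical names made up for this example; nothing like this exists in numpy as far as this thread is concerned:

```python
import warnings


def soft_deprecated(reason):
    """Hypothetical helper: flag a function in its docstring only."""
    def decorate(func):
        # Append a note to the docs; deliberately emit no runtime warning.
        doc = (func.__doc__ or "").rstrip()
        func.__doc__ = doc + "\n\nNote: not recommended for new code. " + reason
        return func
    return decorate


@soft_deprecated("Prefer new_thing() instead.")  # new_thing is hypothetical
def old_thing(x):
    """Return x doubled."""
    return 2 * x


# Existing callers keep working and see no warning at all.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_thing(21)

print(result)       # 42
print(len(caught))  # 0
```

Since the function object itself is untouched apart from its docstring, this can be applied one function at a time as the case-by-case analysis gets done.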

-n

-- 
Nathaniel J. Smith -- https://vorpus.org