Re: [Numpy-discussion] defining a NumPy API standard?

June 2, 2019

On Sun, Jun 2, 2019 at 12:35 AM Nathaniel Smith <njs@pobox.com> wrote:
...
On Sat, Jun 1, 2019 at 1:05 PM Ralf Gommers <ralf.gommers@gmail.com>
wrote:
...
I think this is potentially useful, but *far* more prescriptive and
detailed than I had in mind. Both you and Nathaniel seem to have not
understood what I mean by "out of scope", so I think that's my fault in not
being explicit enough. I *do not* want to prescribe behavior. Instead, a
simple yes/no for each function in numpy and method on ndarray.
So yes/no are the answers. But what's the question?
"If we were redesigning numpy in a fantasy world without external
constraints or compatibility issues, would we include this function?"
"Is this function well designed?"
"Do we think that supporting this function is necessary to achieve
practical duck-array compatibility?"
"If someone implements this function, should we give them a 'numpy
core compliant!' logo to put on their website?"
"Do we recommend that people use this function in new code?"
"If we were trying to design a minimal set of primitives and implement
the rest of numpy in terms of them, then is this function a good
candidate for a primitive?"
These are all really different things, and useful for solving
different problems... I feel like you might be lumping them together
some?
No, I feel like you just want to see a real proposal. At this point I've
gotten some really useful feedback, in particular from Marten (thanks!),
and I have a better idea of what to do. So I'll answer a few of your
questions, and propose to leave the rest till I actually have something
more solid to discuss. That will likely answer many of your questions.
...
Also, I'm guessing there are a bunch of functions where you think part
of the interface is fine and part of the interface is broken. (E.g.
dot's behavior on high-dimensional arrays.)
Indeed, but that's a much harder problem to tackle. Again, there's a reason
I put function behavior explicitly out of scope.

Do you think this "one bool per function" structure will be
fine-grained enough for what you want to do?
yes
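
To make the "one bool per function" structure concrete, here is a minimal sketch of what such a list could look like and how a duck-array library might be checked against it. Everything in it is an assumption for illustration: the name CORE, the helper missing_core_functions, the stand-in TinyArrayLib, and above all the yes/no values, which are arbitrary placeholders and not a proposed list.

```python
# Hypothetical "core list": one bool per function name. The entries and
# their yes/no values are arbitrary placeholders, not a proposed list.
CORE = {
    "reshape": True,
    "sum": True,
    "transpose": True,
    "ptp": False,
}


def missing_core_functions(namespace, core=CORE):
    """Return the sorted core names (value True) that `namespace` lacks."""
    return sorted(
        name
        for name, is_core in core.items()
        if is_core and not callable(getattr(namespace, name, None))
    )


class TinyArrayLib:
    """Stand-in for a duck-array library that provides two functions."""

    @staticmethod
    def reshape(a, shape):
        raise NotImplementedError

    @staticmethod
    def sum(a):
        raise NotImplementedError


# Only existence is checked; behavior is deliberately out of scope.
print(missing_core_functions(TinyArrayLib))
```

The check looks only at whether a name exists, which matches keeping function behavior out of scope.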
...
Two other thoughts:
1. NumPy is not done. Our thinking on how to evolve the NumPy API is
fairly muddled. When new functions are proposed, it's decided on a
case-by-case basis, usually without a guiding principle. We need to improve
that. A "core of NumPy" list could be a part of that puzzle.
I think we do have some rough consensus principles on what's in scope
and what isn't in scope for numpy,
Very rough perhaps. I don't think we are on the same wavelength at all
about the cost of adding new functions, the cost of deprecations, the use
of submodules and even what's public or private right now.

That can't be solved all at once, but I think my idea will help with
some of these.

but yeah, articulating them more clearly could be useful. Stuff like
"output types and shape should be predictable from input types and
shape", "numpy's core responsibilities are the array/dtype/ufunc
interfaces, and providing a lingua franca for python numerical
libraries to interoperate" (and therefore: "if it can live outside
numpy it probably should"), etc.
All of these are valid questions. Most of that probably needs to be in the
scope document (https://www.numpy.org/neps/scope.html). Which also needs to
be improved.

I'm seeing this as a living document (a NEP?)

NEP would work. Although I'd prefer a way to be able to reference some
fixed version of it rather than it being always in flux.

that tries to capture some rules of thumb and that we update as we go.
That seems pretty different to me than a long list of yes/no
checkboxes though?
...
2. We often argue about deprecations. Deprecations are costly, but so is
keeping around functions that are not very useful or have a poor design.
This may offer a middle ground. Don't let others repeat our mistakes,
signal to users that a function is of questionable value, without breaking
already written code.
The idea has come up a few times of having a "soft deprecation" level,
where we put a warning in the docs but not in the code. It seems like
a reasonable idea to me. It's inherently a kind of case-by-case thing
that can be done incrementally. But, if someone wants to
systematically work through all the docs and do the case-by-case
analysis, that also seems like a reasonable idea to me. I'm not sure
if that's the same as your proposal or not.
not the same, but related.
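
For reference, the "soft deprecation" level described above (a warning in the docs but not in the code) could be sketched roughly as follows. This is only an illustration under stated assumptions: soft_deprecated, old_helper and new_helper are hypothetical names, not existing NumPy API, and the wording of the notice is invented.

```python
import warnings


def soft_deprecated(note):
    """Hypothetical decorator: add a docs-only notice, emit no warning."""

    def decorate(func):
        notice = f".. note:: Not recommended for new code. {note}"
        # Only the docstring changes; the function itself is untouched.
        func.__doc__ = f"{func.__doc__ or ''}\n\n{notice}\n"
        return func

    return decorate


@soft_deprecated("Prefer ``new_helper``.")
def old_helper(x):
    """Return x unchanged."""
    return x


# Calling the function stays silent, so already written code is unaffected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_helper(3)

print(result, len(caught))
print(old_helper.__doc__)
```

Because nothing is raised at call time, the signal reaches only people reading the documentation. A per-function yes/no list would send a related, but not identical, signal.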

Ralf