
On Sun, Jun 2, 2019 at 12:35 AM Nathaniel Smith <njs@pobox.com> wrote:
On Sat, Jun 1, 2019 at 1:05 PM Ralf Gommers <ralf.gommers@gmail.com> wrote:
I think this is potentially useful, but *far* more prescriptive and detailed than I had in mind. Both you and Nathaniel seem to have not understood what I mean by "out of scope", so I think that's my fault in not being explicit enough. I *do not* want to prescribe behavior. Instead, a simple yes/no for each function in numpy and method on ndarray.
So yes/no are the answers. But what's the question?
"If we were redesigning numpy in a fantasy world without external constraints or compatibility issues, would we include this function?" "Is this function well designed?" "Do we think that supporting this function is necessary to achieve practical duck-array compatibility?" "If someone implements this function, should we give them a 'numpy core compliant!' logo to put on their website?" "Do we recommend that people use this function in new code?" "If we were trying to design a minimal set of primitives and implement the rest of numpy in terms of them, then is this function a good candidate for a primitive?"
These are all really different things, and useful for solving different problems... I feel like you might be lumping them together some?
No, I feel like you just want to see a real proposal. At this point I've gotten some really useful feedback, in particular from Marten (thanks!), and I have a better idea of what to do. So I'll answer a few of your questions, and propose to leave the rest till I actually have something more solid to discuss. That will likely answer many of your questions.
Also, I'm guessing there are a bunch of functions where you think part of the interface is fine and part of the interface is broken. (E.g. dot's behavior on high-dimensional arrays.)
Indeed, but that's a much harder problem to tackle. Again, there's a reason I put function behavior explicitly out of scope.
Do you think this "one bool per function" structure will be fine-grained enough for what you want to do?
yes
Two other thoughts: 1. NumPy is not done. Our thinking on how to evolve the NumPy API is fairly muddled. When new functions are proposed, it's decided on a case-by-case basis, usually without a guiding principle. We need to improve that. A "core of NumPy" list could be a part of that puzzle.
I think we do have some rough consensus principles on what's in scope and what isn't in scope for numpy,
Very rough perhaps. I don't think we are on the same wavelength at all about the cost of adding new functions, the cost of deprecations, the use of submodules and even what's public or private right now. That can't be solved all at once, but I think my idea will help with some of these.
but yeah, articulating them more clearly could be useful. Stuff like "output types and shape should be predictable from input types and shape", "numpy's core responsibilities are the array/dtype/ufunc interfaces, and providing a lingua franca for python numerical libraries to interoperate" (and therefore: "if it can live outside numpy it probably should"), etc.
All of these are valid questions. Most of that probably needs to be in the scope document (https://www.numpy.org/neps/scope.html), which also needs to be improved.
I'm seeing this as a living document (a NEP?)
NEP would work. Although I'd prefer a way to be able to reference some fixed version of it rather than it being always in flux.
that tries to capture some rules of thumb and that we update as we go. That seems pretty different to me than a long list of yes/no checkboxes though?
2. We often argue about deprecations. Deprecations are costly, but so is keeping around functions that are not very useful or have a poor design. This may offer a middle ground. Don't let others repeat our mistakes, signal to users that a function is of questionable value, without breaking already written code.
The idea has come up a few times of having a "soft deprecation" level, where we put a warning in the docs but not in the code. It seems like a reasonable idea to me. It's inherently a kind of case-by-case thing that can be done incrementally. But, if someone wants to systematically work through all the docs and do the case-by-case analysis, that also seems like a reasonable idea to me. I'm not sure if that's the same as your proposal or not.
not the same, but related.

Ralf
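
For illustration only, the "soft deprecation" idea mentioned above (a warning in the docs but not in the code) could look roughly like the sketch below. The decorator name `soft_deprecated`, the `_soft_deprecated` attribute, and the example function `old_thing` are all hypothetical; none of them is NumPy API, and this is not a proposal from the thread.

```python
import warnings


def soft_deprecated(reason):
    """Hypothetical decorator: flag a function in its docs only.

    It appends a note to the docstring and sets a marker attribute,
    but never emits a runtime warning, so existing code is unaffected.
    """
    def decorate(func):
        note = "\n\n.. note::\n   Not recommended for new code: " + reason + "\n"
        func.__doc__ = (func.__doc__ or "") + note
        func._soft_deprecated = True  # hypothetical marker for tooling
        return func
    return decorate


@soft_deprecated("prefer a clearer alternative.")
def old_thing(x):
    """Return x unchanged."""
    return x


# Calling the function emits no warning; only the docs changed.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_thing(3)

n_warnings = len(caught)
```

The marker attribute would also let a tool build the yes/no list per function by introspection, though whether that matches the proposal discussed here is an open question.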