On Sun, Jun 2, 2019 at 12:33 AM Dashamir Hoxha <dashohoxha@gmail.com> wrote:
On Sat, Jun 1, 2019 at 10:05 PM Ralf Gommers <ralf.gommers@gmail.com> wrote:

I think this is potentially useful, but *far* more prescriptive and detailed than I had in mind. Both you and Nathaniel seem not to have understood what I mean by "out of scope", so I think that's my fault for not being explicit enough. I *do not* want to prescribe behavior. Instead, I want a simple yes/no for each function in numpy and each method on ndarray.

Our API is huge. A simple count:
main namespace: 600
fft: 30
linalg: 30
random: 60
ndarray: 70
lib: 20
lib.npyio: 35
etc. (many more ill-thought-out but not clearly private submodules)
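
Counts like these can be approximated with a short script. The following is a minimal sketch, not the method used to produce the figures above: `public_names` and `count_api` are hypothetical helpers, and the results depend on the installed NumPy version, so they will not match the rough numbers quoted here exactly.

```python
import importlib


def public_names(module_name):
    """Return the sorted public names of a module (hypothetical helper)."""
    module = importlib.import_module(module_name)
    # Prefer __all__ when the module declares it; otherwise fall back to
    # every attribute that does not start with an underscore.
    names = getattr(module, "__all__", None)
    if names is None:
        names = [n for n in dir(module) if not n.startswith("_")]
    return sorted(set(names))


def count_api(module_names):
    """Map each module name to its public-name count, or None if not importable."""
    counts = {}
    for name in module_names:
        try:
            counts[name] = len(public_names(name))
        except ImportError:
            counts[name] = None  # module not available in this installation
    return counts


if __name__ == "__main__":
    namespaces = ["numpy", "numpy.fft", "numpy.linalg", "numpy.random", "numpy.lib"]
    for name, n in count_api(namespaces).items():
        print(f"{name}: {n}")
```

This counts every public name, including constants, classes and submodules, so it overestimates the number of functions. ndarray is a class rather than a module, so its methods would need a separate `dir(numpy.ndarray)` pass.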

Just the main namespace plus ndarray methods is close to 700 objects. If you want to build a NumPy-like thing, that's 700 decisions to make. I'm suggesting something as simple as a list of functions that constitute a sensible "core of NumPy". That list would not include anything in fft/linalg/random, since those can easily be separated out (indeed, if we could disappear fft and linalg and just rely on scipy, pyfftw etc., that would be great). It would not include financial functions. And so on. I guess we'd end up with most ndarray methods plus <150 functions.

That list could be used for many purposes: improve the docs, serve as the set of functions to implement in xnd.array, unumpy & co, Marten's suggestion of implementing other functions in terms of basic functions, etc.
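
As a rough illustration of how such a list could be consumed by a NumPy-like project, here is a minimal sketch. The contents of `CORE_FUNCTIONS` are a placeholder chosen purely for the example (no actual list has been agreed on in this thread), and `coverage` and `toy` are hypothetical names.

```python
from types import SimpleNamespace

# Placeholder only: this thread does not define the actual "core" list.
CORE_FUNCTIONS = frozenset({"sum", "mean", "reshape", "concatenate", "where"})


def coverage(namespace, core=CORE_FUNCTIONS):
    """Split the core names into those the namespace provides as callables
    and those it lacks."""
    implemented = sorted(n for n in core if callable(getattr(namespace, n, None)))
    missing = sorted(set(core) - set(implemented))
    return implemented, missing


# A toy stand-in for a NumPy-like library that implements two of the names.
toy = SimpleNamespace(sum=sum, mean=lambda xs: sum(xs) / len(xs))

implemented, missing = coverage(toy)
print("implemented:", implemented)
print("missing:", missing)
```

The check is deliberately a yes/no per name: it says nothing about whether an implementation behaves like NumPy's, only whether the name is present.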

Two other thoughts:
1. NumPy is not done. Our thinking on how to evolve the NumPy API is fairly muddled. When new functions are proposed, the decision is made on a case-by-case basis, usually without a guiding principle. We need to improve that. A "core of NumPy" list could be a part of that puzzle.
2. We often argue about deprecations. Deprecations are costly, but so is keeping around functions that are not very useful or have a poor design. This may offer a middle ground: we don't let others repeat our mistakes, and we signal to users that a function is of questionable value, without breaking already written code.

This sounds like a restructuring or factorization of the API, in order to make it smaller, and thus easier to learn and use.
It may start with the docs, by paying more attention to the "core" or important functions and methods, and flagging the functions that are deprecated, rarely used, or unimportant. This could also help the satellite projects, which use the NumPy API as an example, and may also be influenced by them and their decisions.

Indeed. It will help restructure our docs. Perhaps not the reference guide (not sure yet), but definitely the user guide and other high-level docs we (or third parties) may want to create.

Ralf