Cleanup of the `np.lib` namespace (was: ENH: Proposal to add atleast_nd function)

On Thu, 2021-02-18 at 11:10 +0100, Ralf Gommers wrote:
On Wed, Feb 17, 2021 at 9:26 PM Oscar Benjamin <
<snip>
It isn't, but it's relatively straightforward and can be done without thinking about the issues around our other namespaces. Basically:
- today `numpy.lib` is a public but fairly useless namespace, because its contents get star-imported to the main namespace; only subsubmodules like `numpy.lib.stride_tricks` are separate
- we want to stop this star-importing, which requires some tedious work of fixing up how we handle `__all__` lists in addition to making exports explicit
- then, we would like to use `numpy.lib` as a namespace for utilities and assorted functionality that people seem to want, but does not meet the bar
Initially, I did not like the idea of using `np.lib.xyz.function` much (the `lib` feels a bit unnecessary compared to fft, etc.), but I am starting to warm up to it. At least it is easy to remember.
for the main namespace and doesn't fit in our other decent namespaces (fft, linalg, random, polynomial, f2py, distutils).
- TBD if there should be subsubmodules under `numpy.lib` or not
I think we should discuss this part :). I was thinking that we should probably aim for _only_ having (or showing) submodules in `lib`?
In particular, I was aiming for hiding all names in `np.lib` that are also part of the main namespace (not necessarily deprecating, thanks to Python 3.7+ magic). The reason is that I would like to be able to find the "interesting" submodules with tab completion, and I won't be able to do that if I have a wall of random functions.
That would leave us with the _current_ "submodules":
* mixins
* scimath (same as np.emath though)
And the ones which are mostly fully exposed to the main namespace:
* type_check
* index_tricks
* function_base
* nanfunctions
* shape_base
* stride_tricks
* twodim_base
* ufunclike
* histograms
* polynomial (not to be confused with numpy.polynomial)
* utils
* arraysetops
* npyio (includes some additional rarely used funcs)
* arraypad
* recfunctions (needs to be explicitly imported currently)
Looking at those, there are a few I would consider hiding completely (since they don't add anything, and the name doesn't feel great):
* function_base
* ufunclike
* type_check
* twodim_base
* arraypad
* utils
* polynomial (due to the name confusion with `np.polynomial`)
I am probably OK with keeping the others around, although for some the namespace won't be well groomed (e.g. there is another `shape_base` in `numpy.core`). Others, like `arraysetops`, are fully exported to the main namespace but represent complete topical groupings, so I am less sure what we should aim for there.
Cheers,
Sebastian
- it should be explicitly documented that this is a "lower bar namespace" and that we discourage other array/tensor libraries from copying its API
We had a good discussion about this in the community meeting yesterday. Sebastian volunteered to sort out the star-import issue.
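As a rough illustration of the star-import point above (a minimal sketch, not NumPy's actual code): the module name `toy_lib_sub` and its two functions are hypothetical stand-ins for a `numpy.lib` submodule. It shows that `from module import *` copies only the names listed in `__all__`, which is why stopping the star-import means auditing `__all__` and making the exports explicit.

```python
import sys
import types

# Hypothetical stand-in for a submodule; not a real NumPy module.
sub = types.ModuleType("toy_lib_sub")
exec(
    "__all__ = ['public_func']\n"
    "def public_func():\n"
    "    return 'public'\n"
    "def helper():\n"
    "    return 'helper'\n",
    sub.__dict__,
)
sys.modules["toy_lib_sub"] = sub  # make the toy module importable

# Star-import into a fresh namespace: only names in __all__ are copied.
star_ns = {}
exec("from toy_lib_sub import *", star_ns)
star_names = sorted(n for n in star_ns if not n.startswith("__"))

# `helper` is not star-exported, but is still reachable as an attribute.
helper_result = sub.helper()

print(star_names)
```

Replacing the star-import with explicit `from ... import name` lines in the parent package would presumably make the set of re-exported names visible in one place instead of being spread over each submodule's `__all__`.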
I've been thinking about something possibly similar for sympy which also has a bloated top-level namespace (and has no other place for public API to go).
A larger plan for cleaning up main namespace bloat, as well as dealing with our unmaintained namespaces (numpy.dual, numpy.emath, etc.) is still needed.
Cheers, Ralf _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@python.org https://mail.python.org/mailman/listinfo/numpy-discussion

On Fri, Feb 19, 2021 at 5:25 PM Sebastian Berg <sebastian@sipsolutions.net> wrote:
On Thu, 2021-02-18 at 11:10 +0100, Ralf Gommers wrote:
On Wed, Feb 17, 2021 at 9:26 PM Oscar Benjamin <
<snip>
It isn't, but it's relatively straightforward and can be done without thinking about the issues around our other namespaces. Basically:
- today `numpy.lib` is a public but fairly useless namespace, because its contents get star-imported to the main namespace; only subsubmodules like `numpy.lib.stride_tricks` are separate
- we want to stop this star-importing, which requires some tedious work of fixing up how we handle `__all__` lists in addition to making exports explicit
- then, we would like to use `numpy.lib` as a namespace for utilities and assorted functionality that people seem to want, but does not meet the bar
Initially, I did not like the idea of using `np.lib.xyz.function` much (the `lib` feels a bit unnecessary compared to fft, etc.), but I am starting to warm up to it. At least it is easy to remember.
The `lib` feels quite necessary to me - it's either cordoning it off nicely, or moving it to another package completely I'd say. Otherwise we're again mixing the few high-quality and well-defined `np.xxx` namespaces with more random stuff.
for the main namespace and doesn't fit in our other decent namespaces (fft, linalg, random, polynomial, f2py, distutils).
- TBD if there should be subsubmodules under `numpy.lib` or not
I think we should discuss this part :). I was thinking that we should probably aim for _only_ having (or showing) submodules in `lib`?
In particular, I was aiming for hiding all names in `np.lib` that are also part of the main namespace (not necessarily deprecating, thanks to Python 3.7+ magic). The reason is that I would like to be able to find the "interesting" submodules with tab completion, and I won't be able to do that if I have a wall of random functions.
That sounds like a good idea (the hiding, not deprecating).
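The "Python 3.7+ magic" presumably refers to module-level `__getattr__` and `__dir__` (PEP 562). Below is a minimal sketch of hiding without deprecating, under that assumption. The names `toy_lib`, `flip`, and the `_hidden` table are hypothetical; a real module would define `__getattr__` and `__dir__` as top-level functions in its source file rather than attaching them to a `types.ModuleType` object.

```python
import types

# Hypothetical stand-in for a package like `numpy.lib`.
lib = types.ModuleType("toy_lib")
lib.stride_tricks = types.ModuleType("toy_lib.stride_tricks")

# Names that stay usable but should not clutter tab completion.
_hidden = {"flip": lambda seq: seq[::-1]}


def _module_getattr(name):
    # Called only when normal attribute lookup on the module fails.
    try:
        return _hidden[name]
    except KeyError:
        raise AttributeError(
            f"module 'toy_lib' has no attribute {name!r}"
        ) from None


def _module_dir():
    # Advertise only the "interesting" submodules to dir() and completion.
    return ["stride_tricks"]


# PEP 562 hooks are looked up in the module's own __dict__.
lib.__getattr__ = _module_getattr
lib.__dir__ = _module_dir

visible = dir(lib)
flipped = lib.flip([1, 2, 3])
```

Here `lib.flip` keeps working with no warning, while `dir(lib)` lists only `stride_tricks`, so the hidden names stop drowning out the submodules in tab completion.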
That would leave us with the _current_ "submodules":
* mixins
* scimath (same as np.emath though)
Mixins seems good to keep, assuming there's a plan to add some more mixins. A namespace with a single object in it is a bit weird. scimath duplicates emath, so it should get an underscore.
And the ones which are mostly fully exposed to the main namespace:
* type_check
* index_tricks
* function_base
* nanfunctions
* shape_base
* stride_tricks
* twodim_base
* ufunclike
* histograms
* polynomial (not to be confused with numpy.polynomial)
* utils
* arraysetops
* npyio (includes some additional rarely used funcs)
* arraypad
* recfunctions (needs to be explicitly imported currently)
Looking at those, there are a few I would consider hiding completely (since they don't add anything, and the name doesn't feel great):
* function_base
* ufunclike
* type_check
* twodim_base
* arraypad
* utils
* polynomial (due to the name confusion with `np.polynomial`)
I am probably OK with keeping the others around, although for some the namespace won't be well groomed (e.g. there is another `shape_base` in `numpy.core`). Others, like `arraysetops`, are fully exported to the main namespace but represent complete topical groupings, so I am less sure what we should aim for there.
For all the ones that get imported into the main namespace, we should give the subsubmodules an underscore in the name. It makes no sense to have the same functionality exposed in two different places. Unless we'd want to clean up the main namespace for some of that stuff in the long term. Then I think we only have a few left - mixins, recfunctions, npyio, stride_tricks. Cheers, Ralf
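To get a feel for how much is "exposed in two different places", one could list the public names that are bound to the same object in both a package and one of its subnamespaces. This is only a sketch: the helper `shared_names` and the toy modules are hypothetical and not part of NumPy.

```python
import types


def shared_names(main, sub):
    """Return public names bound to the same object in both namespaces."""
    missing = object()  # sentinel, so a stored None is not a false match
    main_ns = vars(main)
    return sorted(
        name
        for name, obj in vars(sub).items()
        if not name.startswith("_") and main_ns.get(name, missing) is obj
    )


# Toy package layout standing in for `numpy` and `numpy.lib`.
def pad(seq, width):
    return list(seq) + [0] * width


def as_strided(seq):
    return seq


toy = types.ModuleType("toy")
toy.lib = types.ModuleType("toy.lib")
toy.lib.pad = pad                # also re-exported by the main namespace
toy.lib.as_strided = as_strided  # only available under toy.lib
toy.pad = pad

duplicates = shared_names(toy, toy.lib)
print(duplicates)
```

With NumPy installed, the analogous call would be `shared_names(numpy, numpy.lib)`; what it returns depends on the NumPy version, so no output is claimed here.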
Cheers,
Sebastian
- it should be explicitly documented that this is a "lower bar namespace" and that we discourage other array/tensor libraries from copying its API
We had a good discussion about this in the community meeting yesterday. Sebastian volunteered to sort out the star-import issue.
I've been thinking about something possibly similar for sympy which also has a bloated top-level namespace (and has no other place for public API to go).
A larger plan for cleaning up main namespace bloat, as well as dealing with our unmaintained namespaces (numpy.dual, numpy.emath, etc.) is still needed.
Cheers, Ralf
participants (2)
- Ralf Gommers
- Sebastian Berg