On Sun, Jul 22, 2018 at 12:28 PM Ralf Gommers <ralf.gommers@gmail.com> wrote:
Then, I think it's not unreasonable to draw a couple of hard lines. For example, removing complete submodules like linalg or random has ended up on some draft brainstorm roadmap list because someone (no idea who) put it there after a single meeting. Clearly the cost-benefit of that is such that there's no point even discussing that more, so I'd rather draw that line here than every time someone opens an issue.

I'm happy to give the broader context here. This came up in the NumPy sprint in Berkeley back in May of this year.

The existence of all of these submodules in NumPy is mostly a historical artifact, due to the previously poor state of Python packaging. Our thinking was that perhaps this could be revisited in this age of conda and manylinux wheels.

This isn't to say that it would actually be a good idea to remove any of these submodules today. Separate modules bring both benefits and downsides.

Benefits:
- It can be easier to maintain projects separately rather than inside NumPy, e.g., bug fixes do not need to be tied to NumPy releases.
- Separate modules could reduce the maintenance burden for NumPy itself, because energy gets focused on core features.
- For projects for which a rewrite would be warranted (e.g., numpy.ma and scipy.sparse), it is *much* easier to innovate outside of NumPy/SciPy.
- Packaging. As mentioned above, this is no longer as beneficial as it once was.

Downsides:
- It's harder to find separate packages than NumPy modules.
- If the maintainers and maintenance processes are very similar, then separate projects can add unnecessary overhead.
- Changing from bundled to separate packages imposes a significant cost upon their users (e.g., due to changed import paths).

Coming back to the NEP:

The impact on downstream libraries and users would be very large, and maintenance of these modules would still have to happen. Therefore this is simply not a good idea; removing these submodules should not happen even for a new major version of NumPy.

I'm afraid I disagree pretty strongly here. There should absolutely be a high bar for removing submodules, but we should not rule out the possibility entirely.

It is certainly true that modules need to be maintained for them to remain usable, but I particularly object to the idea that this should be forced upon NumPy maintainers. Open source projects need to be maintained by their users, and if their users cannot devote energy to maintain them then the open source project deserves to die. This is just as true for NumPy submodules as for external packages.

NumPy itself only has an obligation to maintain submodules if they are actively needed by the NumPy project and valued by active NumPy contributors. Otherwise, they should be maintained by users who care about them -- whether that means inside or outside NumPy. It serves nobody well to insist on NumPy developers maintaining projects that they don't use or care about.

I would suggest the following criteria for considering removing a NumPy submodule:
1. It is not relied upon by other portions of NumPy.
2. Either
(a) the submodule imposes a significant maintenance burden upon the rest of NumPy that is not balanced by the level of dedicated contributions, or
(b) much better alternatives exist outside of NumPy.

Preferably all three criteria should be satisfied.