Road to NumPy 2.0
Note: This is a living document. We are prepared to modify it through continued dialogue with the community. Its acceptance indicates consensus on the process and timelines.
Abstract
The NumPy 2.0 release is an opportunity to make complex changes for which a normal deprecation would not be viable, because the user impact may be larger than is normally considered acceptable for a minor release. Yet NumPy 2.0 is not meant to be a large breaking release. Most users should not need to worry about the changes it introduces.
This document contains essential information about the work on the NumPy 2.0 release.
Motivation and impact
The NumPy 2.0 release is required for fixing old bugs and modernizing NumPy’s code base. It is not planned to be a “break the world” release. This means:
- It must be possible to compile downstream packages to be compatible with both new and old NumPy versions. However, the C-API is expected to be broken. The path to achieve this compatibility will be defined as a high priority project.
- The majority of users should not require code updates or such updates should be very easy to do. Expert users are likely to notice changes though.
- We accept that some NumPy users may not be able to adopt NumPy 2.0 immediately, or may have to wait for a later release before adopting it.
One should keep in mind that even bug fixes can break the code of a small number of users.
Timeline
NumPy 2.0 will be scheduled for release in Jan 2024. Projects and changes should be proposed as soon as possible. We propose a NumPy team meeting around April 2023 (details to be discussed) in order to finalize high-impact projects and review all candidate projects.
Projects not proposed by this time may not be prioritized for a final 2.0 release.
Changes that can be implemented behind a feature flag are strongly encouraged, as this makes it easier to keep projects moving.
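As a rough illustration of the feature-flag approach, the sketch below gates a new behavior behind an environment variable so that the old and new code paths can coexist while a project is in progress. The variable name `EXAMPLE_NEW_BEHAVIOR` and both functions are hypothetical and are not part of NumPy.

```python
import os


def new_behavior_enabled(environ=None):
    """Return True when the hypothetical opt-in flag is set to "1"."""
    env = os.environ if environ is None else environ
    return env.get("EXAMPLE_NEW_BEHAVIOR", "0") == "1"


def total(values, environ=None):
    """Hypothetical function with a legacy and a new code path."""
    if new_behavior_enabled(environ):
        # New path (opt-in): always return a float.
        return float(sum(values))
    # Legacy path (default): return whatever sum() produces.
    return sum(values)


legacy = total([1, 2], environ={})
opted_in = total([1, 2], environ={"EXAMPLE_NEW_BEHAVIOR": "1"})
print(type(legacy).__name__, type(opted_in).__name__)
```

Passing the environment explicitly, as done here, keeps the two paths easy to test side by side without changing global state.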
Project selection process
To determine the scope of work for the NumPy 2.0 release, we suggest introducing three categories of projects/proposals:
- high: proposal requires high visibility or may be critical for the NumPy 2.0 release,
- normal,
- candidate: changes which are in an early planning stage.
High priority proposals will be listed explicitly in this NEP.
A project board will track all projects proposed for NumPy 2.0, distinguishing the category and progress.
Proposing a project for NumPy 2.0 release
To start a project, there is one important thing: Believe that your change makes NumPy better and commit to trying to make it happen.
To have a proposal listed on the NumPy 2.0 project board, we require the following:
- At least two champions for each proposal, one of whom must be a NumPy core developer or someone of similar standing.
- A brief assessment of the anticipated impact on downstream projects and end users. This means assessing how many users, or which groups of users, are affected and in what way.
- Support from the NumPy community or the Steering Council (ideally both). Positive feedback on your proposal on the NumPy mailing list is a strong indicator of community support.
If any of the above requirements are not met, proposals will be listed as “candidate”. NumPy maintainers will review “candidate” projects on a case by case basis.
We suggest including a brief header in every proposal (issue or PR):
* **Champions**:
* **Severity**: How does it affect users?
* **Affects**: Who/how many users does it affect?
Any further details or adjustments shall be added on request. Large changes may require their own NEP when requested by a maintainer.
As a suggestion, “affects” could be roughly guided by the number of users: rare, limited, common, or ubiquitous. Similarly, “severity” could be minor, typical (code update needed), severe (e.g. large change/difficult to find), or critical (incorrect results or no clear path for fixing things). The two together can then be used as a basis for decision making and discussion.
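The suggested header and its two scales can be pictured as a small data structure. This is only a sketch: the class and field names below are hypothetical, and the champion names are placeholders.

```python
from dataclasses import dataclass
from enum import Enum


class Affects(Enum):
    RARE = 1
    LIMITED = 2
    COMMON = 3
    UBIQUITOUS = 4


class Severity(Enum):
    MINOR = 1
    TYPICAL = 2   # code update needed
    SEVERE = 3    # e.g. large change / difficult to find
    CRITICAL = 4  # incorrect results or no clear path for fixing things


@dataclass
class ProposalHeader:
    champions: list
    severity: Severity
    affects: Affects

    def render(self):
        """Format the header in the style suggested in this section."""
        return "\n".join([
            "* **Champions**: " + ", ".join(self.champions),
            "* **Severity**: " + self.severity.name.lower(),
            "* **Affects**: " + self.affects.name.lower(),
        ])


header = ProposalHeader(
    champions=["Champion A", "Champion B"],  # placeholder names
    severity=Severity.TYPICAL,
    affects=Affects.LIMITED,
)
print(header.render())
```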
Scope of work
High priority projects
The projects in this section are considered high impact from the compatibility point of view.
Unless otherwise noted, these are currently proposals. Most of these changes have their own NEPs, which still need to be accepted.
Enable breaking the C-API
NumPy needs to define a process for breaking the C-API. This project does not define what is broken; that is decided separately on a case-by-case basis.
We simply assume that enough changes will be made to make this worthwhile.
- Status: Planning
- Champion: Matti Picus (?), Sebastian Berg (?)
- Severity: Severe (for maintainers without a plan), typical for users
- Affects: Library maintainers, some users
- Notes:
  - Many users may have issues if pip installing a very new NumPy version without updating other libraries. We assume that this isn’t a common scenario and will mostly result in clear errors.
  - All libraries will have to be recompiled. The transition plan will ensure that libraries adhering to best practices will have an easy transition.
Note: A full plan is still outstanding and may require its own NEP.
Adopt NEP 50
Adopting NEP 50 changes the promotion behavior of NumPy scalars by removing any value-based casting. Details for this change are discussed in NEP 50.
- Status: Largely implemented, but open for discussions and open questions to be addressed.
- Champion: Sebastian Berg, …
- Severity: High in rare cases: some results can change or memory use can grow.
- Affects: Many users, but hopefully not most as one needs to use smaller than default precision types to be affected.
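To make the effect of removing value-based casting more concrete, the following sketch prints the result dtypes of a few mixed operations. The results depend on the installed NumPy version and promotion settings, so the comments describe what NEP 50 specifies rather than a guaranteed output; treat this as an illustration, not as a compatibility test.

```python
import numpy as np

# Major version as an integer, e.g. 1 or 2.
major = int(np.__version__.split(".")[0])

# A uint8 array combined with a small Python int stays uint8 under both
# the legacy value-based rules and the NEP 50 rules.
arr_result = np.array([1, 2], dtype=np.uint8) + 2

# A NumPy scalar combined with a Python int is where the rules differ:
# legacy value-based casting gives the default integer type, while
# NEP 50 keeps uint8.
scalar_result = np.uint8(1) + 2

# Likewise for float32 with a Python float: float64 under the legacy
# rules, float32 under NEP 50.
float_result = np.float32(3) + 3.0

print("NumPy major version  :", major)
print("uint8 array + 2      :", arr_result.dtype)
print("np.uint8(1) + 2      :", scalar_result.dtype)
print("np.float32(3) + 3.0  :", float_result.dtype)
```

The scalar cases show why users of smaller-than-default precision types are the ones most likely to notice the change.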
A thorough cleanup of the Python API
The NumPy API is quite messy, with many functions and aliases that are not recommended for use, namespaces that are private but missing underscores, inconsistencies in argument names, and more. Changes will include removing aliases and outdated functionality (including many things that have been doc-deprecated already), making namespaces private, and making function signatures more consistent.
- Status: Needs a separate NEP, and deprecations in 1.25.0 for what can be deprecated in a sensible way.
- Champion: Ralf Gommers, Stefan van der Walt, …
- Severity: Medium. It is expected that a lot of projects and users will see some breakage, but also that code changes to more idiomatic usage will be straightforward and compatible with both numpy 1.X and 2.0
- Affects: Many users and downstream projects
Add array API standard support to the main namespace
The main reason NEP 47 aimed for a separate numpy.array_api submodule rather than the main namespace is that casting rules differed too much. With NEP 50 (see above), that will be resolved in NumPy 2.0. Having NumPy be a superset of the array API standard will be a significant improvement for code portability to other libraries (CuPy, JAX, PyTorch, etc.) and thereby address one of the top user requests from the 2020 NumPy user survey (GPU support). See the numpy.array_api API docs for an overview of differences between it and the main namespace (the “strictness” ones are not applicable).
- Status: separate NEP to be written.
- Champion: Aaron Meurer, Ralf Gommers
- Severity: Medium. Most impact of breaking changes is likely concentrated in a few widely used APIs (e.g., change semantics of the copy=False keyword to actually mean “don’t copy” rather than “copy if needed”).
- Affects: most users and downstream projects
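For code that relies on the “copy if needed” meaning, one option that does not depend on how the copy keyword ends up being defined is np.asarray, which only copies when a conversion is required. The sketch below is a minimal illustration of that distinction and assumes nothing about the final form of the proposed change.

```python
import numpy as np

a = np.arange(3)

# np.asarray returns its input unchanged when no conversion is needed ...
same = np.asarray(a)

# ... and creates a new array when one is required (here: a dtype change).
converted = np.asarray(a, dtype=np.float64)

# An explicit copy=True always produces a new array.
copied = np.array(a, copy=True)

print("asarray returns the same object:", same is a)
print("converted shares memory with a :", np.shares_memory(converted, a))
print("copied shares memory with a    :", np.shares_memory(copied, a))
```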
Other projects