Hi all,

Today we merged the PR that publicly exposed the formerly "experimental" DType API and ArrayMethod API. See https://github.com/numpy/numpy/pull/25754.

The docs for the new C API are here:

https://numpy.org/devdocs/reference/c-api/array.html#arraymethod-api
https://numpy.org/devdocs/reference/c-api/types-and-structures.html#arraymethod-structs
https://numpy.org/devdocs/reference/c-api/array.html#custom-data-types
https://numpy.org/devdocs/reference/c-api/types-and-structures.html#dtypemeta

The DType API publicly exposes the PyArray_DTypeMeta C struct, which represents DType metaclasses. It also exposes a function for registering user-defined DTypes and a set of slot IDs and function typedefs that users can implement in C to write new DTypes.

The ArrayMethod API allows defining cast and ufunc loops in terms of these new DTypes, in a manner that forbids value-based promotion and abstracts many of the internals of NumPy. We hope the ArrayMethod API enables sharing low-level loops that work out-of-the-box in NumPy in other projects.

I used the DType API to write the new StringDType that was recently added to NumPy. One of the goals of this API is to make it easier for the community to write new DTypes, and also for people to experiment with DTypes outside of NumPy, prove community need and viability, and then upstream them into NumPy as a self-contained artifact without needing deep knowledge of NumPy internals.

This is still a C API and it does require knowledge of the CPython and NumPy C APIs and the correct way to use them, but it is now substantially easier than it was before to write new DTypes.

It is our goal that new DTypes will be generally compatible with downstream users of NumPy. If your project uses the NumPy Python API, then it's likely this is already the case, although there may still be some wrinkles if you do introspection of DTypes. If you use the NumPy C API, and in particular, if you use type numbers to specify NumPy DTypes, it's likely that your C code will need some updating to work properly with new DTypes.
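
For Python-side introspection, here is a minimal sketch of one way to avoid leaning on type numbers. It is an illustration, not an official recommendation: it assumes NumPy 1.20 or later (where each dtype instance has its own DType class), and the helper names describe and is_float64 are made up for this example.

```python
import numpy as np


def describe(dtype):
    """Return (DType class name, type number) for a dtype instance."""
    return type(dtype).__name__, dtype.num


def is_float64(dtype):
    # Class-based check: compares the DType class of the instance rather
    # than its legacy type number. Hypothetical helper for illustration.
    return isinstance(dtype, type(np.dtype("float64")))


f8 = np.dtype("float64")
i4 = np.dtype("int32")

# The class name printed here depends on the installed NumPy version.
print(describe(f8))
print(is_float64(f8), is_float64(i4))
```

Type numbers such as f8.num were designed around the built-in types, so checking the DType class is likely to be the more robust choice once user-defined DTypes are in the mix.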

All that to say, the 2.0 release still isn't final and we'd love feedback on any part of this. This is all new API surface so we have a unique chance to fix mistakes before the API is fully public in the final NumPy 2.0 release. Things are still in a little flux - we still need to update the example user DType implementations in the numpy-user-dtypes repository to use the final public API - but now is probably as good a time as any to start writing a new DType if you've ever been interested. We have a #user-dtypes channel on the NumPy community Slack if you're interested in chatting about this in a low-latency context - contact me off-list if you want an invite.

This caps off several years of work from many people, including everyone involved in NEPs 40-44, which describe this new API. In particular I'd like to highlight Sebastian Berg, who led this whole effort; Matti Picus, Stéfan van der Walt, Ben Nathanson, and Marten van Kerkwijk, who co-authored the NEPs; Ralf Gommers, who helped get funding for my work and provided mentoring and coordination; Charles Harris, for providing leadership, context, and advice; and many others who have contributed in big and small ways.

-Nathan