Hi all,

I thought I would give a brief update on where we are with new DTypes. Partly for Matti, who is bearing the brunt of the review, but also for anyone else interested. Please don't hesitate to ask for clarification, raise any questions, or schedule a meeting to discuss!


Recap


The past year has seen most of the "big picture" changes merged into NumPy, a good chunk of which is already part of 1.20:

With the exception of universal functions, the above list covers all major areas of NumPy that need to change. It also implements many of the things that new user DTypes will need and currently cannot do. Previously, these were either unavailable or limited in various ways, especially when it comes to parametric DTypes such as units or strings.
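For anyone less familiar with the term, here is a minimal sketch of what "parametric" means, using only DTypes that NumPy already ships (byte strings and datetimes). The variable names are just for illustration; the point is that two instances of the same kind of DType can differ by a parameter (length or unit), and promotion has to pick a result instance rather than just a type:

```python
import numpy as np

# Two instances of the same kind of DType that differ only by a parameter
# (here: the string length).
s3 = np.dtype("S3")
s5 = np.dtype("S5")
same_kind = type(s3) is type(s5)       # same DType class (or plain np.dtype on old versions)
same_instance = s3 == s5               # different parameters, so not equal

# Promotion must choose a concrete instance: the longer string wins.
promoted_str = np.promote_types(s3, s5)

# Datetimes are parametric in their unit: the finer unit wins.
promoted_dt = np.promote_types(np.dtype("M8[s]"), np.dtype("M8[ms]"))
unit = np.datetime_data(promoted_dt)[0]

print(same_kind, same_instance, promoted_str, promoted_dt, unit)
```

A user-defined unit or string DType needs the same kind of hook to say which instance results from combining two others, which is what was previously hard or impossible to express.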


Currently in Progress


The main remaining item is the universal functions. Since a majority of NumPy features are organized as universal functions, and universal functions inherently did not support parametric user-defined DTypes, these need a major change. This change is proposed in NEP 43 (although that will need some smaller updates).

The work on implementing it is mostly settling in the following PR and the following branch (I hope these will go in very soon):
In parallel, since the new DType API is only useful for users once it is exposed, I have a branch here to experiment with that:
The exact way a new DType is written there will probably need an alternative, but note that this should largely be limited to the boilerplate code.


Future


The main steps still remaining are figuring out exactly how best to expose the DType API (ABI compatibility is the major concern) and finishing NEP 43 (or most of it) to close things out.

After that, there are still some things that need to be done (although this list is unlikely to be exhaustive):

But most importantly, whatever comes up when potential users start exploring the API, hopefully soon!

Otherwise, there are a couple of related improvements that I think would make sense, such as storing the actual power-of-two alignment in the array flags (they are getting a bit cramped if we assume int can be 16 bits, though). Also, the discussion about removing value-based casting/promotion is one that would help with DTypes, and pushing it forward probably makes sense as soon as the items that are "currently in progress" are largely settled and the next NumPy version is released.
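To make the alignment idea a bit more concrete, here is a rough sketch in Python of what "the actual power-of-two alignment" of an array's data pointer is. The helper names are hypothetical, it only looks at the data pointer (not the strides), and storing the exponent rather than the alignment itself is my assumption about how this could fit into a few flag bits, not a settled design:

```python
import numpy as np

def pointer_alignment(arr):
    """Largest power of two that divides the array's data address."""
    addr = arr.ctypes.data
    # The lowest set bit of the address is the largest power-of-two divisor.
    return addr & -addr

def alignment_log2(arr):
    """Exponent of the alignment; small enough to fit in a few flag bits."""
    return pointer_alignment(arr).bit_length() - 1

a = np.zeros(8, dtype=np.float64)
print(pointer_alignment(a), alignment_log2(a))
```

For a 64-bit address the exponent is at most 63, so it would need 6 bits, which is where the worry about the flags getting cramped comes from.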


Cheers,

Sebastian