Hey all, 

To the NumPy list only: here are the highlights of the surgical approach I would like to get someone to work on. I can help mentor and guide. These are only highlights, but they should give someone familiar with the code the general gist. There are some details to work out, of course, but it could be done.

It may be very similar to what Nathaniel is contemplating, except that I think breaking the ABI is the only way to really do this. I could be wrong, but I'm not willing to risk *not* breaking the ABI.

1) Create a new meta-type in C (call it dtype)
2) Create Python classes (in C) that are instances of this meta-type, one for each "kind" of data-type
3) Make PyArray_Descr * be a reference to one of these new objects (which can be built in either C or Python); these objects should be published outside NumPy as well.
4) Remove most of the "per-type function calls" in PyArray_ArrFuncs, replacing them with the Generalized Ufunc equivalents, and expand the capability of Generalized Ufuncs
5) Keep the Array Scalar Types but change them so that they also use the dtype meta-type as their foundation and mix in an array-methods type. Also, move them into a project separate from NumPy itself.
6) The current void* would be replaced with real Python classes, rather than having structured arrays shoved through a single data-type.
7) The documented ways to spell a dtype would be reduced, but backwards compatibility would be preserved.
8) Make sure Numba can create these Descriptor objects with ahead-of-time compilation, and start moving NumPy code to Numba
9) Ensure the Generalized Ufunc framework can take the data-type as an argument so that *all* data-types can participate in the general multi-method approach. 
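
To make items 1-3 and 9 a bit more concrete, here is a rough pure-Python sketch of the shape I have in mind. Everything in it is made up for illustration (the names dtype, Descr, Int, Float, register_loop and dispatch are hypothetical, not existing NumPy API), and the real thing would be written in C. It only shows the relationships: a meta-type, one class per "kind" that is an instance of that meta-type, descriptor objects that are instances of those classes, and function lookup that takes the data-type as an argument.

```python
# Hypothetical sketch only: none of these names are NumPy API.

class dtype(type):
    """Meta-type: each data-type "kind" is a class whose type is dtype."""

    registry = {}  # maps a kind character to its data-type class

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        kind = namespace.get("kind")
        if kind is not None:  # the abstract base sets kind = None
            mcls.registry[kind] = cls
        return cls


class Descr(metaclass=dtype):
    """Base descriptor; its instances stand in for what PyArray_Descr is today."""

    kind = None

    def __init__(self, itemsize):
        self.itemsize = itemsize

    def __repr__(self):
        return f"{type(self).__name__}({self.itemsize})"


class Int(Descr):
    kind = "i"


class Float(Descr):
    kind = "f"


# Multi-method table: (operation name, data-type classes...) -> function.
_loops = {}


def register_loop(name, *dtype_classes):
    """Register a function for an operation on the given data-type classes."""
    def decorator(func):
        _loops[(name, *dtype_classes)] = func
        return func
    return decorator


def dispatch(name, descrs, *values):
    """Pick the function from the classes of the descriptors, then call it."""
    key = (name, *(type(d) for d in descrs))
    return _loops[key](*values)


@register_loop("add", Int, Int)
def _add_int(a, b):
    return int(a) + int(b)


@register_loop("add", Float, Float)
def _add_float(a, b):
    return float(a) + float(b)
```

With this, isinstance(Int, dtype) holds, Int(8) is a descriptor object, and dispatch("add", (Int(8), Int(8)), 2, 3) looks up the function from the data-types rather than from a per-type slot table.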

There is more to it, but that is the basic idea. Please forgive me if I can't respond to feedback from the list in a timely way. I will respond as I can.

-Travis



--

Travis Oliphant
Co-founder and CEO


@teoliphant
512-222-5440