On Wed, Jul 6, 2011 at 8:34 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:
On Wed, Jul 6, 2011 at 8:09 PM, Nathaniel Smith <njs@pobox.com> wrote:
On Wed, Jul 6, 2011 at 7:01 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:
Numpy already has a general mechanism for defining new dtypes and slotting them in so that they're supported by ndarrays, by the casting machinery, by ufuncs, and so on. In principle, we could implement
Well, actually not in any useful sense; take a look at what Mark went through for the half floats. There is a reason the NEP went with parametrized dtypes and masks. But we would sure welcome a plan and code to make it true; it is one of the areas that could really use improvement.
Err, yes, that's basically what the next few sentences say?
This is basically a draft spec for implementing the parametrized dtypes idea.
And yet:
FIXME: this really needs attention from an expert on numpy's casting rules. But I can't seem to find the docs that explain how casting loops are looked up and decided between (e.g., if you're casting from dtype A to dtype B, which dtype's loops are used?), so I can't go into details. But those details are tricky and they matter...
There is also a reason that masks were chosen to be implemented first. The numpy code is freely available, and there is no reason not to make experiments or help Mark get some of the current problems solved. It doesn't need to be a one-man effort, and your feedback will have a lot more impact if you are in the trenches. In particular, I think there is a good deal of work that will need to be done for the sorts, argmax, and the other functions you mention; doing it would give you a good idea of what is involved and how to go about implementing your ideas.
Let me lay out a bit more how I see things developing at this point, and bear in mind that I am not a psychic so this is just a guess ;)

Mark is going to work at Enthought for maybe 3-4 more weeks and then return to school. Mark is very good, but that is still a very tough schedule and all the things in the NEP may not get finished, let alone all the supporting work that will be needed around the core implementation. After that, what Mark does in his spare time is up to him.

I expect there will be another numpy release sometime in the Fall, maybe around Nov/Dec, to get the new features, especially the datetime work, out there. At that point the interface is semi-fixed. I like to think that new features should be regarded as experimental for at least one release cycle, but that is certainly not official Numpy policy. In any case there is likely going to be a gap of several months where the rate of commits slows down and other folks, if they are interested, have a real opportunity to get involved.

After the projected Fall release I see maybe another six months to make changes/extensions to the interface, and this is where new ideas can get worked out, but there needs to be someone with the interest and skill to implement those ideas for that to happen. If no such person shows up, then the interface will be what it is until there is such a person with an interest in carrying things forward. But at that point they will need to take care to maintain backward compatibility unless pretty much everyone agrees that the then-current interface is a disaster.

Chuck