On Wed, Jul 6, 2011 at 7:34 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:
On Wed, Jul 6, 2011 at 8:09 PM, Nathaniel Smith <njs@pobox.com> wrote:
On Wed, Jul 6, 2011 at 7:01 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:
Numpy already has a general mechanism for defining new dtypes and slotting them in so that they're supported by ndarrays, by the casting machinery, by ufuncs, and so on. In principle, we could implement
Well, actually not in any useful sense; take a look at what Mark went through for the half floats. There is a reason the NEP went with parametrized dtypes and masks. But we would sure welcome a plan and code to make it true; it is one of the areas that could really use improvement.
Err, yes, that's basically what the next few sentences say?
This is basically a draft spec for implementing the parametrized dtypes idea.
And yet:
FIXME: this really needs attention from an expert on numpy's casting rules. But I can't seem to find the docs that explain how casting loops are looked up and decided between (e.g., if you're casting from dtype A to dtype B, which dtype's loops are used?), so I can't go into details. But those details are tricky and they matter...
There is also a reason that masks were chosen to be implemented first. The numpy code is freely available, and there is no reason not to make experiments or help Mark get some of the current problems solved. It doesn't need to be a one-man effort, and your feedback will have a lot more impact if you are in the trenches. In particular, I think there is a good deal of work that will need to be done for the sorts, argmax, and the other functions you mention that would give you a good idea of what was involved and how to go about implementing your ideas.
Hi Chuck,

My goal in posting this was to try to find a way for those of us who disagree to still be productive together. If you'd like to help with that in a constructive way, then please do, but otherwise, can I ask in a polite and well-meaning way that you butt out?

Scolding me for not getting "in the trenches" is not helpful. People like Wes and Matthew and I have been "in the trenches" for years building up numpy as a viable platform for statistical computing. (I can't claim that my efforts compare to theirs, but see for instance [1], which is an improved version of R's formula support, one of the other key advantages it has over Python. It works, so I'd have written some docs and released it by now, except I'm defending my PhD in 4 weeks, so, well, you know.)

Yes, there are some details missing from the spec I wrote up in a few hours this afternoon, but how about we solve them? There are plenty of people on this list who know more than me, or Mark, or any one of any of us. This problem is complicated, but not *that* complicated.

So, you know, let's do this. And maybe that way, in a month, we'll have something that we all actually like, even if it doesn't do everything that we want.

-- Nathaniel

[1] https://github.com/charlton/charlton