[Numpy-discussion] NA masks in the next numpy release?
Chris.Barker
Chris.Barker at noaa.gov
Fri Oct 28 19:05:43 EDT 2011
On 10/28/11 11:37 AM, Matthew Brett wrote:
> The main motivation for the alterNEP was our strong feeling that
> separating ABSENT and IGNORE was easier to comprehend and cleaner.
I don't know about easier to comprehend, or cleaner, but it is more
feature-full.
I see two issues here:
1) being able to distinguish between "ignore" and "not valid"
-- and being able to stop ignoring an ignored value.
This could quite easily be accomplished with a mask approach -- indeed
with 8 bits, you could have 8 different possible masked states (not that
I'm suggesting that, at least not in core numpy.)
However, with a bit-pattern approach, you simply can't implement
"ignore": the NA pattern overwrites the element itself, so once it's
been set, the previous value is lost.
2) data size: A full mask takes extra space, sometimes a substantial
amount -- so a bit-pattern approach would be nice.
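Both issues can be illustrated with a small sketch. It is not the NEP or alterNEP implementation: the separate boolean array `ignore` stands in for a mask, and NaN stands in for a reserved NA bit pattern. All variable names are made up for the example.

```python
import numpy as np

values = np.array([1.0, 2.0, 3.0, 4.0])

# Mask approach (assumption: one boolean flag per element, kept apart
# from the data). The payload is never touched.
ignore = np.zeros(values.shape, dtype=bool)
ignore[1] = True
sum_while_ignored = values[~ignore].sum()    # element 1 is skipped
ignore[1] = False                            # stop ignoring it
sum_after_unignore = values[~ignore].sum()   # the 2.0 is still there

# Bit-pattern approach (assumption: NaN plays the role of the NA pattern).
# Writing the pattern overwrites the payload.
bitpattern = values.copy()
bitpattern[1] = np.nan
sum_skipping_na = np.nansum(bitpattern)      # skips the NaN element
# There is no operation that brings the 2.0 back from `bitpattern`.

# Data size: one mask byte per element, whatever the element size is.
overhead_float64 = ignore.nbytes / values.nbytes
small = np.zeros(4, dtype=np.uint8)
overhead_uint8 = np.zeros(small.shape, dtype=bool).nbytes / small.nbytes
```

With these assumptions the mask adds 12.5% to a float64 array but doubles the storage of a uint8 array, which is where a bit pattern would pay off.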
I like the idea (that I think Mark attempted to implement) that the
implementation should be hidden from the user - not necessarily entirely
hidden, but subtle enough that the casual user wouldn't need to care
about it.
In that case, I think if we could decide that we want both "ignore" and
"not valid" (and it seems there is a fair bit of interest in that), then
we can proceed with a mask-based approach, and develop an API that makes
as little reference to the mask as possible.
Then a bit-pattern approach could be developed that uses the same API --
it would not have the "ignore" option at all, but would be the same for
the "not valid" option.
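A rough sketch of what "the same API" over two storage schemes might look like follows. Everything here is hypothetical: the class names, the method names (`mark_not_valid`, `ignore`, `unignore`, `usable`), and the state codes are invented for illustration, and NaN again stands in for a reserved bit pattern. Nothing in it is an existing or proposed numpy interface.

```python
import numpy as np

VALID, IGNORED, NOT_VALID = 0, 1, 2  # hypothetical codes in a uint8 mask


class MaskBacked:
    """Hypothetical mask-backed array: supports 'not valid' and 'ignore'."""

    def __init__(self, values):
        self.values = np.asarray(values, dtype=float).copy()
        self.state = np.full(self.values.shape, VALID, dtype=np.uint8)

    def mark_not_valid(self, index):
        self.state[index] = NOT_VALID

    def ignore(self, index):
        self.state[index] = IGNORED

    def unignore(self, index):
        # Single integer index only; a 'not valid' element stays not valid.
        if self.state[index] == IGNORED:
            self.state[index] = VALID

    def usable(self):
        return self.state == VALID

    def sum(self):
        return self.values[self.usable()].sum()


class BitPatternBacked:
    """Hypothetical bit-pattern array: 'not valid' only, no 'ignore'."""

    def __init__(self, values):
        self.values = np.asarray(values, dtype=float).copy()

    def mark_not_valid(self, index):
        self.values[index] = np.nan  # the original value is overwritten

    def usable(self):
        return ~np.isnan(self.values)

    def sum(self):
        return self.values[self.usable()].sum()


def mean_of_usable(arr):
    """Generic code that never looks at the mask or the bit pattern."""
    return arr.sum() / arr.usable().sum()


a = MaskBacked([1.0, 2.0, 3.0, 4.0])
b = BitPatternBacked([1.0, 2.0, 3.0, 4.0])
for arr in (a, b):
    arr.mark_not_valid(0)

mean_a = mean_of_usable(a)
mean_b = mean_of_usable(b)

a.ignore(1)
mean_a_ignoring = mean_of_usable(a)
a.unignore(1)
mean_a_restored = mean_of_usable(a)
```

The "not valid" part behaves identically for both classes, so `mean_of_usable` works on either; only the mask-backed class offers `ignore` and `unignore`.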
When I write this it seems entirely too complicated for both the
developers and users, but maybe it's not -- it could be analogous to
what we have now: arrays can be Fortran or C ordered, contiguous or not,
be views on other arrays or not. To really make numpy dance, you need to
understand all that, but you can also do a whole lot, and write a lot of
generic code, without digging into that.
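As a small illustration of that analogy, the function below is written without any reference to memory layout, yet it gives the same answer on a C-ordered array, a Fortran-ordered copy, and a non-contiguous view. The function name `column_means` is just an example.

```python
import numpy as np


def column_means(a):
    # Generic code: no mention of ordering, strides, or views.
    return a.mean(axis=0)


c_ordered = np.arange(12, dtype=float).reshape(3, 4)
f_ordered = np.asfortranarray(c_ordered)   # same values, column-major copy
strided_view = c_ordered[:, ::2]           # every other column, no copy

means_c = column_means(c_ordered)
means_f = column_means(f_ordered)
means_view = column_means(strided_view)

# The layout details are there if you want to dig into them.
layouts = (
    c_ordered.flags["C_CONTIGUOUS"],
    f_ordered.flags["F_CONTIGUOUS"],
    strided_view.flags["C_CONTIGUOUS"],
)
view_shares_data = np.shares_memory(strided_view, c_ordered)
```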
If we do all that, maybe there could be a sparse mask implementation,
etc. as well.
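One way to picture a sparse mask, purely as a guess at what such a thing could be: store only the indices of the masked elements instead of one flag per element. The names below are invented, and the reduction trick only works for sums.

```python
import numpy as np

values = np.ones(1_000_000, dtype=np.uint8)

# Hypothetical sparse mask: just the positions that are masked.
masked_indices = np.array([10, 500, 999_999], dtype=np.int64)

# Equivalent dense mask, built here only for comparison.
dense_mask = np.zeros(values.shape, dtype=bool)
dense_mask[masked_indices] = True

# Sum of unmasked elements without materialising a dense mask:
# total over everything minus the masked entries.
sparse_sum = (values.sum(dtype=np.int64)
              - values[masked_indices].sum(dtype=np.int64))
dense_sum = values[~dense_mask].sum(dtype=np.int64)

sparse_bytes = masked_indices.nbytes   # 3 indices * 8 bytes
dense_bytes = dense_mask.nbytes        # 1 byte per element
```

With only a handful of masked elements the index list is tiny compared with the dense mask; the advantage shrinks as more elements are masked.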
Maybe I'm dreaming, though...
-Chris
--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov