[Numpy-discussion] NA masks in the next numpy release?

Chris.Barker Chris.Barker at noaa.gov
Fri Oct 28 12:21:46 EDT 2011


On 10/27/11 7:51 PM, Travis Oliphant wrote:
> As I mentioned. I find the ability to separate an ABSENT idea from an
> IGNORED idea convincing. In other words, I think distinguishing between
> masks and bit-patterns is not just an implementation detail, but
> provides a useful concept for multiple use-cases.

Exactly -- while one can implement ABSENT with a mask, one cannot
implement IGNORE with a bit-pattern. So it is not an implementation detail.
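
A minimal sketch of that distinction, using the existing numpy.ma
module for the mask and NaN as the bit pattern. It illustrates the
concept only and is not the proposed NA-mask API; the variable names
are made up for the example.

```python
import numpy as np
import numpy.ma as ma

data = np.array([1.0, 2.0, 3.0, 4.0])

# IGNORED via a mask: element 1 is skipped, but its value stays in memory.
ignored = ma.masked_array(data.copy(), mask=[False, True, False, False])
ignored_sum = ignored.sum()      # element 1 is left out of the sum

# Dropping the mask brings the original value back.
ignored.mask = ma.nomask
restored_sum = ignored.sum()     # all four elements contribute again

# ABSENT via a bit pattern: NaN overwrites the payload, so the
# original 2.0 is gone and cannot be "un-ignored" later.
absent = data.copy()
absent[1] = np.nan
absent_sum = np.nansum(absent)   # skips the NaN, but 2.0 is lost
```

The mask can play either role; the bit pattern can only say "there is
no value here".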

I also think bit-patterns are a bit of a dead end:

- there is only a standard for one data type family: i.e. NaN for IEEE 
float types

- So we would be coming up with our own standard (or adopting an 
existing one, but I don't think there is one widely supported) for other 
types. This means:
   1) a lot of work to do
   2) a binary format incompatible with other code, compilers, etc. This 
is a BIG deal -- a major strength of numpy is that it serves as a 
wrapper for a data block that is compatible with C, Fortran or whatever 
code -- special bit patterns would make this a lot harder.
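
A rough illustration of the compatibility problem. The sentinel
NA_INT32 below is hypothetical, chosen only for this sketch; plain
ndarray.sum() stands in for external C or Fortran code that knows
nothing about the convention.

```python
import numpy as np

# Floats: NaN is part of IEEE 754, so unaware code still propagates it.
f = np.array([1.0, np.nan, 3.0])
float_total = f.sum()                      # NaN poisons the result visibly

# Integers: no reserved pattern exists, so we must invent one.
# NA_INT32 is a hypothetical sentinel, not a numpy convention.
NA_INT32 = np.iinfo(np.int32).min
i = np.array([1, NA_INT32, 3], dtype=np.int32)

# Code unaware of the convention treats the sentinel as a real number.
naive_total = int(i.sum(dtype=np.int64))   # silently wrong

# Only code that knows the convention gets the intended answer.
aware_total = int(i[i != NA_INT32].sum())
```

In the float case the unaware code at least signals that something is
off; in the integer case it returns a plausible-looking wrong number.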

We also talked about the fact that an 8-bit mask provides the ability to 
carry other information in the mask -- not just "missing" or "ignored", 
but a handful of other possible reasons for masking. I think that has a 
lot of possibilities.
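
As a sketch of what that could look like, here is a uint8 array kept
alongside the data, with one code per reason. The codes (VALID,
MISSING, IGNORED, SENSOR_FAULT) are invented for this example and are
not part of any numpy proposal.

```python
import numpy as np

# Hypothetical reason codes; one byte leaves room for up to 256 of them.
VALID, MISSING, IGNORED, SENSOR_FAULT = 0, 1, 2, 3

data = np.array([10, 20, 30, 40, 50], dtype=np.int16)
reason = np.array([VALID, MISSING, VALID, IGNORED, SENSOR_FAULT],
                  dtype=np.uint8)

# Strict view: use only elements with no masking reason at all.
valid_sum = int(data[reason == VALID].sum())

# Relaxed view: also accept elements that were merely IGNORED.
usable = (reason == VALID) | (reason == IGNORED)
sum_with_ignored = int(data[usable].sum())

# The mask can also be queried for a specific reason.
fault_count = int(np.count_nonzero(reason == SENSOR_FAULT))
```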

On 10/28/11 2:11 AM, Stéfan van der Walt wrote:
> Another data point:  I've been spending some time on scikits-image
> recently, and although masked values would be highly useful in that
> context, the cost of doubling memory use (for uint8 images, e.g.) is
> too high.

> 2) that we make a concerted effort to implement the bitmask mode of
> operation as soon as possible.

I wonder if that might be handled as a scikits-image extension, rather 
than core numpy?
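
For a sense of the memory cost under discussion, here is a small
sketch comparing a one-byte-per-pixel mask with a packed one-bit-per-
pixel mask. It uses np.packbits and np.unpackbits as a stand-in; how
a real bitmask mode would be implemented is not specified here, and
the image size is arbitrary.

```python
import numpy as np

image = np.zeros((512, 512), dtype=np.uint8)

# A boolean mask uses one byte per pixel: same size as the image itself.
byte_mask = np.zeros(image.shape, dtype=bool)
byte_mask[0, :10] = True

# Packing eight mask entries into each byte cuts that to 1/8.
bit_mask = np.packbits(byte_mask)          # flattens, then packs

# Round trip back to a boolean mask of the original shape.
restored = (np.unpackbits(bit_mask)[:image.size]
            .reshape(image.shape)
            .astype(bool))

image_bytes = image.nbytes
byte_mask_bytes = byte_mask.nbytes
bit_mask_bytes = bit_mask.nbytes
```

The packed form saves memory but needs an unpack step (or bit
arithmetic) before it can be used for indexing.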

Is there a standard bit pattern for missing data in images? -- it's 
presumably quite important to maintain binary compatibility with image 
formats, processing tools, etc.

I guess what I'm getting at is that special bit-pattern implementations 
may be domain specific.

-Chris




-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


