[Numpy-discussion] feedback request: proposal to add masks to the core ndarray

Robert Kern robert.kern at gmail.com
Fri Jun 24 10:35:53 EDT 2011


On Fri, Jun 24, 2011 at 09:24, Keith Goodman <kwgoodman at gmail.com> wrote:
> On Fri, Jun 24, 2011 at 7:06 AM, Robert Kern <robert.kern at gmail.com> wrote:
>
>> The alternative proposal would be to add a few new dtypes that are
>> NA-aware. E.g. an nafloat64 would reserve a particular NaN value
>> (there are lots of different NaN bit patterns, we'd just reserve one)
>> that would represent NA. An naint32 would probably reserve the most
>> negative int32 value (like R does). Using the NA-aware dtypes signals
>> that you are using NA values; there is no need for an additional flag.
>
> I don't understand the numpy design and maintainability issues, but from
> a user perspective (mine) nafloat64, etc. sounds nice.

It's worth noting that this is not a replacement for masked arrays,
nor is it intended to be the be-all, end-all solution to missing data
problems. It's mostly just intended to be a focused tool to fill in
the gaps where masked arrays are less convenient for whatever reason;
e.g. where you're tempted to (ab)use NaNs for the purpose and the
limitations on the range of values are acceptable. Not every dtype
would have an NA-aware counterpart. I would suggest just nabool,
nafloat64, naint32, nastring (a little tricky due to the flexible
size, but doable), and naobject. Maybe a couple more, if we get
requests, like naint64 and nacomplex128.
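
To make the reserved-value idea concrete, here is a rough sketch of how
NA detection could look for the float64 and int32 cases. Nothing here is
an existing numpy API: NA_BITS, NA_FLOAT, NA_INT32, is_na_float and
is_na_int32 are hypothetical names, and the particular NaN bit pattern
is an arbitrary choice for illustration. It only shows detection;
whether a NaN payload survives arithmetic is platform-dependent and is
not addressed.

```python
import numpy as np

# Hypothetical reserved bit pattern: a quiet NaN with a nonzero payload.
# The exact payload is an arbitrary choice for this sketch.
NA_BITS = np.uint64(0x7FF80000000007A2)
NA_FLOAT = np.array([0x7FF80000000007A2], dtype=np.uint64).view(np.float64)[0]

# Hypothetical reserved integer: the most negative int32 value.
NA_INT32 = np.iinfo(np.int32).min


def is_na_float(a):
    """True where a float64 array holds exactly the reserved NA pattern."""
    a = np.ascontiguousarray(a, dtype=np.float64)
    # Compare raw bits, since NaN != NaN under ordinary float comparison.
    return a.view(np.uint64) == NA_BITS


def is_na_int32(a):
    """True where an int32 array holds the reserved NA value."""
    return np.asarray(a, dtype=np.int32) == NA_INT32


x = np.array([1.0, np.nan, 2.0])
x[2] = NA_FLOAT                      # mark the last element as NA

i = np.array([0, NA_INT32, 7], dtype=np.int32)

print(np.isnan(x))      # NA is still a NaN as far as isnan is concerned
print(is_na_float(x))   # only the reserved pattern counts as NA
print(is_na_int32(i))
```

Note that the ordinary NaN in x is not reported as NA, which is the
distinction the NA-aware dtypes would make for you; the cost is that one
NaN pattern (or one integer value) is no longer available as data.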

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
