[Numpy-discussion] feedback request: proposal to add masks to the core ndarray

Fri Jun 24 11:07:55 EDT 2011

Hi,

On Fri, Jun 24, 2011 at 3:43 PM, Robert Kern <robert.kern at gmail.com> wrote:
> On Fri, Jun 24, 2011 at 09:33, Charles R Harris
> <charlesr.harris at gmail.com> wrote:
>>
>> On Fri, Jun 24, 2011 at 8:06 AM, Robert Kern <robert.kern at gmail.com> wrote:
>
>>> The alternative proposal would be to add a few new dtypes that are
>>> NA-aware. E.g. an nafloat64 would reserve a particular NaN value
>>> (there are lots of different NaN bit patterns, we'd just reserve one)
>>> that would represent NA. An naint32 would probably reserve the most
>>> negative int32 value (like R does). Using the NA-aware dtypes signals
>>> that you are using NA values; there is no need for an additional flag.
>>
>> Definitely better names than r-int32. Going this way has the advantage of
>> reducing the friction between R and numpy, and since R has pretty much
>> become the standard software for statistics that is an important
>> consideration.
>
> I would definitely steal their choices of NA value for naint32 and
> nafloat64. I have reservations about their string NA value (i.e. 'NA')
> as anyone doing business in North America and other continents may
> have issues with that....

It would certainly help me, at least, if someone (Mark? Sorry to
ask...) could set out the implementation and API differences that
would result from the two options:

1) array.mask option - a separate array of shape array.shape giving a
mask value (True or False) for each element
2) nafloat64 option - NA-aware dtypes, each reserving a dtype-specific
value to represent missing data
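
To make the question concrete, here is a rough sketch of how the two
options might look using only what numpy already offers. It is not
either proposal's actual API. The names NA_BITS and is_na are
hypothetical, and the NaN payload is an arbitrary choice for
illustration, not necessarily the value R or numpy would reserve.

```python
import numpy as np

# Option 1: a separate mask array with the same shape as the data.
data = np.array([1.0, 2.0, 3.0, 4.0])
mask = np.array([False, True, False, False])  # True marks a missing element
masked_mean = data[~mask].mean()  # the data buffer itself is left untouched

# Option 2: a reserved NaN bit pattern stored in the data itself.
# NA_BITS is a hypothetical payload, picked only for this sketch.
NA_BITS = np.uint64(0x7FF80000000007A2)

na_data = np.array([1.0, 2.0, 3.0, 4.0])
# Write the sentinel through an integer view so no float arithmetic
# gets a chance to alter the payload bits.
na_data.view(np.uint64)[1] = NA_BITS


def is_na(arr):
    """Return True where an element holds exactly the reserved bit pattern."""
    return arr.view(np.uint64) == NA_BITS


na_mean = na_data[~is_na(na_data)].mean()

# An ordinary NaN is NaN to np.isnan but is not NA under this scheme.
plain_nan = np.array([np.nan])
print(masked_mean, na_mean, is_na(na_data), is_na(plain_nan))
```

In the first case the original value under the mask survives and costs
extra memory for the mask; in the second the value is overwritten and
the missingness travels with the dtype.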

Best,

Matthew