[Numpy-discussion] feedback request: proposal to add masks to the core ndarray
Laurent Gautier
lgautier at gmail.com
Fri Jun 24 11:07:06 EDT 2011
On 2011-06-24 16:43, Robert Kern <robert.kern at gmail.com> wrote:
> On Fri, Jun 24, 2011 at 09:33, Charles R Harris
> <charlesr.harris at gmail.com> wrote:
>> >
>> > On Fri, Jun 24, 2011 at 8:06 AM, Robert Kern<robert.kern at gmail.com> wrote:
>>> >> The alternative proposal would be to add a few new dtypes that are
>>> >> NA-aware. E.g. an nafloat64 would reserve a particular NaN value
>>> >> (there are lots of different NaN bit patterns, we'd just reserve one)
>>> >> that would represent NA. An naint32 would probably reserve the most
>>> >> negative int32 value (like R does). Using the NA-aware dtypes signals
>>> >> that you are using NA values; there is no need for an additional flag.
>> >
>> > Definitely better names than r-int32. Going this way has the advantage of
>> > reducing the friction between R and numpy, and since R has pretty much
>> > become the standard software for statistics that is an important
>> > consideration.
> I would definitely steal their choices of NA value for naint32 and
> nafloat64. I have reservations about their string NA value (i.e. 'NA')
> as anyone doing business in North America and other continents may
> have issues with that....
Maybe there is not so much need for reservations about the string NA,
once we distinguish between:
a- the internal representation of a "missing string" (what is stored in
memory, and that C-level code would need to be aware of)
b- the 'external' representation of a missing string (in Python, what
would be returned by repr())
c- what is assumed to be a missing string value when reading from a file.
a/ is not 'NA', c/ should be a parameter of the relevant functions, and b/
can be configured as a module-level, class-level, or instance-level
variable.
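
To make the three levels concrete, here is a rough pure-Python sketch. It is not a proposal for the actual implementation: the names NAStringType, NA_STRING, display, parse_strings and na_values are all made up for illustration, and a real version would live at the C/dtype level.

```python
class NAStringType:
    """(a) Hypothetical internal marker for a missing string.

    It is a distinct object, not the text 'NA', so the real string
    'NA' (e.g. North America) never collides with it.
    """

    _instance = None
    # (b) External representation, configurable at class level
    # (an instance attribute of the same name would override it).
    display = "<NA>"

    def __new__(cls):
        # Keep a single shared instance so identity checks work.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __repr__(self):
        return self.display


NA_STRING = NAStringType()


def parse_strings(fields, na_values=("",)):
    """(c) The caller decides which input tokens mean 'missing'."""
    return [NA_STRING if field in na_values else field for field in fields]


row = "Namibia,NA,,Canada".split(",")

# By default only the empty field is missing; 'NA' stays a real string.
default = parse_strings(row)

# The caller can opt in to treating 'NA' as missing for a given file.
explicit = parse_strings(row, na_values=("", "NA"))
```

With this separation, whether 'NA' in a file means "missing" is a decision made per call when reading, and how a missing string is printed is a display setting. Neither affects what is stored.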