[Numpy-discussion] feedback request: proposal to add masks to the core ndarray

Fri Jun 24 20:02:22 EDT 2011

On Fri, Jun 24, 2011 at 5:22 PM, Wes McKinney <wesmckinn at gmail.com> wrote:

> On Fri, Jun 24, 2011 at 7:10 PM, Charles R Harris
> <charlesr.harris at gmail.com> wrote:
> >
> >
> > On Fri, Jun 24, 2011 at 4:21 PM, Matthew Brett <matthew.brett at gmail.com>
> > wrote:
> >>
> >> Hi,
> >>
> >> On Fri, Jun 24, 2011 at 10:09 PM, Benjamin Root <ben.root at ou.edu> wrote:
> >> ...
> >> > Again, there are pros and cons either way and I see them as very
> >> > orthogonal and complementary.
> >>
> >> That may be true, but I imagine only one of them will be implemented.
> >>
> >> @Mark - I don't have a clear idea whether you consider the nafloat64
> >> option to be still in play as the first thing to be implemented
> >> (before array.mask).   If it is, what kind of thing would persuade you
> >> either way?
> >>
> >
> > Mark can speak for himself, but I think things are tending towards masks.
> > They have the advantage of one implementation for all data types, current
> > and future, and they are more flexible since the masked data can be actual
> > valid data that you just choose to ignore for experimental reasons.
> >
> > What might be helpful is a routine to import/export R files, but that
> > shouldn't be too difficult to implement.
> >
> > Chuck
> >
>
> Perhaps we should make a wiki page someplace summarizing pros and cons
> of the various implementation approaches? I worry very seriously about
> adding API functions relating to masks rather than having special NA
> values which propagate in algorithms. The question is: will Joe Blow
> Former R user have to understand what is the mask and how to work with
> it? If the answer is yes we have a problem. If it can be completely
> hidden as an implementation detail, that's great. In R, NAs are just
> sort of inherent: they propagate, and you deal with them when you have to
> via the na.rm flag in functions or is.na.
>
>
Well, I think both of those can be pretty transparent. Could you illustrate
some typical R usage, to wit:

1) setting a value to NA
2) checking a value for NA
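
For comparison, here is roughly what those two operations look like on the
mask side today. This is only a sketch using the existing numpy.ma module as
a stand-in; the array.mask interface under discussion is not settled and may
look different.

```python
import numpy as np

# Existing numpy.ma API, used only as a stand-in for the proposed array.mask.
a = np.ma.array([1.0, 2.0, 3.0, 4.0])

# 1) setting a value to "NA": assign the np.ma.masked constant
a[1] = np.ma.masked

# 2) checking a value for "NA"
mask = np.ma.getmaskarray(a)         # boolean array, True where masked
second_is_na = a[1] is np.ma.masked  # indexing a masked element returns the constant

# Reductions skip masked entries, comparable to na.rm=TRUE in R
total = a.sum()

# The underlying value is still there, just ignored
hidden_value = a.data[1]
```

Note that the masked element keeps its original value in a.data, which is the
"valid data that you just choose to ignore" case mentioned above.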

Other things are problematic, like checking for integer overflow. For safety
that would be desirable; for speed, not. I think that is a separate question,
however. In any case, if we do check such things we should be able to set
the corresponding mask value in the loop, and I suppose that is the sort of
thing you want.
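
As a rough illustration of the overflow idea, the sketch below adds two int32
arrays and masks any element whose true sum does not fit. The helper name
checked_add_int32 is hypothetical, and the check is done with vectorized
NumPy calls rather than inside a ufunc inner loop, so it shows the intent
and not how it would actually be implemented.

```python
import numpy as np

def checked_add_int32(x, y):
    """Hypothetical helper: add int32 arrays, masking overflowed elements."""
    x = np.asarray(x, dtype=np.int32)
    y = np.asarray(y, dtype=np.int32)
    # Do the arithmetic in a wider type so the true sum is representable.
    wide = x.astype(np.int64) + y.astype(np.int64)
    info = np.iinfo(np.int32)
    overflow = (wide < info.min) | (wide > info.max)
    # Put a placeholder in overflowed slots so the narrowing cast is safe.
    safe = np.where(overflow, 0, wide).astype(np.int32)
    return np.ma.array(safe, mask=overflow)

result = checked_add_int32([1, 2**31 - 1], [1, 1])
overflow_mask = np.ma.getmaskarray(result)  # True where the sum overflowed
```

The widening step is what costs speed, which is the trade-off mentioned above.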

Chuck