[Numpy-discussion] feedback request: proposal to add masks to the core ndarray

Pierre GM pgmdevlist at gmail.com
Thu Jun 23 20:29:47 EDT 2011


On Jun 24, 2011, at 2:21 AM, Charles R Harris wrote:

> 
> 
> On Thu, Jun 23, 2011 at 6:00 PM, Nathaniel Smith <njs at pobox.com> wrote:
> On Thu, Jun 23, 2011 at 2:44 PM, Robert Kern <robert.kern at gmail.com> wrote:
> > On Thu, Jun 23, 2011 at 15:53, Mark Wiebe <mwwiebe at gmail.com> wrote:
> >> Enthought has asked me to look into the "missing data" problem and how NumPy
> >> could treat it better. I've considered the different ideas of adding dtype
> >> variants with a special signal value and masked arrays, and concluded that
> >> adding masks to the core ndarray appears to be the best way to deal with the
> >> problem in general.
> >> I've written a NEP that proposes a particular design, viewable here:
> >> https://github.com/m-paradox/numpy/blob/cmaskedarray/doc/neps/c-masked-array.rst
> >> There are some questions at the bottom of the NEP which definitely need
> >> discussion to find the best design choices. Please read, and let me know of
> >> all the errors and gaps you find in the document.
> >
> > One thing that could use more explanation is how your proposal
> > improves on the status quo, i.e. numpy.ma. As far as I can see, you
> > are mostly just shuffling around the functionality that already
> > exists. There has been a continual desire for something like R's NA
> > values by people who are very familiar with both R and numpy's masked
> > arrays. Both have their uses, and as Nathaniel points out, R's
> > approach seems to be very well-liked by a lot of users. In essence,
> > *that's* the "missing data problem" that you were charged with: making
> > happy the users who are currently dissatisfied with masked arrays. It
> > doesn't seem to me that moving the functionality from numpy.ma to
> > numpy.ndarray resolves any of their issues.
> 
> Speaking as a user who's avoided numpy.ma, it wasn't actually because
> of the behavior I pointed out (I never got far enough to notice it),
> but because I got the distinct impression that it was a "second-class
> citizen" in numpy-land. I don't know if that's true. But I wasn't sure
> how solidly things like interactions between numpy and masked arrays
> worked, or how , and it seemed like it had more niche uses. So it just
> seemed like more hassle than it was worth for my purposes. Moving it
> into the core and making it really solid *would* address these
> issues...
> 
> 
> There is some truth to that. The maintainer/creator of masked arrays was Pierre and he hasn't much time these days.

So true. I'd be delighted to work at least part time on it, though. Send me a job offer :)



