[Numpy-discussion] feedback request: proposal to add masks to the core ndarray

Robert Kern robert.kern at gmail.com
Thu Jun 23 17:44:10 EDT 2011


On Thu, Jun 23, 2011 at 15:53, Mark Wiebe <mwwiebe at gmail.com> wrote:
> Enthought has asked me to look into the "missing data" problem and how NumPy
> could treat it better. I've considered the different ideas of adding dtype
> variants with a special signal value and masked arrays, and concluded that
> adding masks to the core ndarray appears to be the best way to deal with the
> problem in general.
> I've written a NEP that proposes a particular design, viewable here:
> https://github.com/m-paradox/numpy/blob/cmaskedarray/doc/neps/c-masked-array.rst
> There are some questions at the bottom of the NEP which definitely need
> discussion to find the best design choices. Please read, and let me know of
> all the errors and gaps you find in the document.

One thing that could use more explanation is how your proposal
improves on the status quo, i.e. numpy.ma. As far as I can see, you
are mostly just shuffling around the functionality that already
exists. There has been a continual desire for something like R's NA
values by people who are very familiar with both R and numpy's masked
arrays. Both have their uses, and as Nathaniel points out, R's
approach seems to be very well-liked by a lot of users. In essence,
*that's* the "missing data problem" that you were charged with: making
happy the users who are currently dissatisfied with masked arrays. It
doesn't seem to me that moving the functionality from numpy.ma to
numpy.ndarray resolves any of their issues.
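
To make the contrast concrete, here is a minimal sketch of the two styles
being compared: a separate mask, as numpy.ma does today, versus a
sentinel stored in the data itself. The -999.0 placeholder and the variable
names are illustrative assumptions, and NaN is only a rough stand-in for
R's NA, not an equivalent.

```python
import numpy as np

# Hypothetical raw data where -999.0 marks a missing reading.
data = np.array([1.0, 2.0, -999.0, 4.0])

# Mask style (numpy.ma): a boolean mask rides alongside the data,
# and reductions skip masked elements by default.
masked = np.ma.masked_values(data, -999.0)
masked_mean = float(masked.mean())
mask_list = masked.mask.tolist()

# The original value is still stored underneath the mask.
hidden_value = float(masked.data[2])

# Sentinel style (NaN used here as an approximation of an NA value):
# the missing marker lives in the array itself and propagates
# through ordinary reductions unless explicitly skipped.
with_nan = np.where(data == -999.0, np.nan, data)
propagated = float(with_nan.mean())
skipped = float(np.nanmean(with_nan))

print(masked_mean, mask_list, hidden_value)
print(propagated, skipped)
```

In the mask style, missing elements are skipped by default and the underlying
value is kept; in the sentinel style, missingness propagates unless the caller
opts out, which is closer to what R users expect from NA.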

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco


