[Numpy-discussion] feedback request: proposal to add masks to the core ndarray

Mark Wiebe mwwiebe at gmail.com
Thu Jun 23 17:51:31 EDT 2011


On Thu, Jun 23, 2011 at 4:44 PM, Robert Kern <robert.kern at gmail.com> wrote:

> On Thu, Jun 23, 2011 at 15:53, Mark Wiebe <mwwiebe at gmail.com> wrote:
> > Enthought has asked me to look into the "missing data" problem and how
> > NumPy could treat it better. I've considered the different ideas of
> > adding dtype variants with a special signal value and masked arrays, and
> > concluded that adding masks to the core ndarray appears to be the best
> > way to deal with the problem in general.
> > I've written a NEP that proposes a particular design, viewable here:
> >
> > https://github.com/m-paradox/numpy/blob/cmaskedarray/doc/neps/c-masked-array.rst
> > There are some questions at the bottom of the NEP which definitely need
> > discussion to find the best design choices. Please read, and let me know
> > of all the errors and gaps you find in the document.
>
> One thing that could use more explanation is how your proposal
> improves on the status quo, i.e. numpy.ma. As far as I can see, you
> are mostly just shuffling around the functionality that already
> exists.


Please read my proposal in more detail. numpy.ma has many inconsistencies
with numpy, which by themselves make it difficult to work with, and while I
have read the ma documentation, it is not my starting point.
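
For readers less familiar with the two approaches being compared, here is a
minimal sketch. It uses the existing numpy.ma module and NaN as the special
signal value; it is not the API proposed in the NEP. The sample values and
variable names are made up for illustration, and it assumes a NumPy version
that provides np.nanmean.

```python
import numpy as np

# Hypothetical sample data; the third entry is treated as "missing".
data = np.array([1.0, 2.0, -999.0, 4.0])

# Mask approach: a separate boolean mask marks the missing element,
# while the underlying value stays in the data buffer.
masked = np.ma.masked_array(data, mask=[False, False, True, False])
masked_mean = masked.mean()      # averages only the unmasked values

# Signal-value approach: the missing element is overwritten with NaN.
signal = data.copy()
signal[2] = np.nan
plain_mean = signal.mean()       # NaN propagates through the reduction
nan_mean = np.nanmean(signal)    # NaN-aware reduction skips the NaN
```

With a mask the original value remains recoverable from masked.data, whereas
the signal value replaces it outright.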


> There has been a continual desire for something like R's NA
> values by people who are very familiar with both R and numpy's masked
> arrays. Both have their uses, and as Nathaniel points out, R's
> approach seems to be very well-liked by a lot of users. In essence,
> *that's* the "missing data problem" that you were charged with: making
> happy the users who are currently dissatisfied with masked arrays.


I'm not intimately familiar with R, so detailed feedback from anyone who is,
and who has the time to provide it, would be much appreciated.


> It doesn't seem to me that moving the functionality from numpy.ma to
> numpy.ndarray resolves any of their issues.
>

The design also needs to be clarified and made properly consistent, after
which I would be surprised if it doesn't resolve the issues.

-Mark


>
> --
> Robert Kern
>
> "I have come to believe that the whole world is an enigma, a harmless
> enigma that is made terrible by our own mad attempt to interpret it as
> though it had an underlying truth."
>   -- Umberto Eco