[Numpy-discussion] feedback request: proposal to add masks to the core ndarray

Pierre GM pgmdevlist at gmail.com
Thu Jun 23 18:24:10 EDT 2011


On Jun 23, 2011, at 11:55 PM, Mark Wiebe wrote:

> On Thu, Jun 23, 2011 at 4:46 PM, Charles R Harris <charlesr.harris at gmail.com> wrote:
> On Thu, Jun 23, 2011 at 2:53 PM, Mark Wiebe <mwwiebe at gmail.com> wrote:
> Enthought has asked me to look into the "missing data" problem and how NumPy could treat it better. I've considered the different ideas of adding dtype variants with a special signal value and masked arrays, and concluded that adding masks to the core ndarray appears to be the best way to deal with the problem in general.
> 
> I've written a NEP that proposes a particular design, viewable here:
> 
> https://github.com/m-paradox/numpy/blob/cmaskedarray/doc/neps/c-masked-array.rst

Mmh, after timeseries, now masked arrays... Mark, I'm starting to see a pattern here ;)


> There are some questions at the bottom of the NEP which definitely need discussion to find the best design choices. Please read, and let me know of all the errors and gaps you find in the document.
> 
> 
> I agree that low level support for masks is the way to go.

The objective was to have numpy.ma in C, yes. That has been clear since Numeric, but nobody had time to do it. And I still don't speak C, so Python it was.
Anyhow, yes, there should be some work to address some of numpy.ma's shortcomings. I may be a bit conservative, but I don't really see the reason to follow a radically different approach. Your idea of switching the current mask convention (so that a True means the data can be accessed) will lead to a lot of fun indeed. And sorry, what "general consensus about masks elsewhere" are you referring to?

> There is some consternation about the conventional True/False
> interpretation of the mask, centered around the name "mask". 

Don't call it "mask" at all then. "accessible"? "access"? Avoid "valid", it carries too many connotations.
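
For reference, here is a minimal sketch of the convention at issue, as it stands in the current numpy.ma: a True in the mask hides the element, so an "accessible"-style flag would simply be its logical negation. The names `hidden`, `accessible` and `visible_values` are illustrative only; they are not part of numpy.ma or of the NEP.

```python
import numpy as np

data = np.array([1.0, 2.0, 3.0, 4.0])

# Current numpy.ma convention: True in the mask HIDES the element.
hidden = np.array([False, True, False, False])
x = np.ma.masked_array(data, mask=hidden)

# Reductions skip the hidden element: (1 + 3 + 4) / 3
mean_visible = x.mean()

# The inverted reading (True = the element can be accessed) is just
# the logical negation of the numpy.ma mask.
accessible = ~np.ma.getmaskarray(x)
visible_values = data[accessible]
```

Whatever the flag ends up being called, the two readings carry the same information; the only thing that changes is which value of the boolean means "use this element".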




