[Numpy-discussion] A reimplementation of MaskedArray

Tue Nov 21 21:11:15 EST 2006

I think that the new implementation makes a copy of the data when
indexing an MA. This differs from both ndarray and the existing
numpy ma version, where the indexed result shares its data with the
parent.

e.g.
testma = ma.array([[1,2,3,4,5],[1,2,3,4,5],[1,2,3,4,5]], mask=ma.nomask)
testma2 = testma[1]
testma2[1] = 20
print testma
print testma2

--output--
[[1 2 3 4 5]
 [1 2 3 4 5]
 [1 2 3 4 5]]
[ 1 20  3  4  5]
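
Note that the parent array is unchanged above, which is what points to
a copy. For comparison, here is a minimal sketch of the plain ndarray
behaviour. It assumes a current NumPy and Python 3 syntax (the session
above is Python 2), and the names a and row are only for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3, 4, 5]] * 3)
row = a[1]          # basic indexing returns a view, not a copy
row[1] = 20         # write through the view

print(a)                          # the parent's second row now holds the 20
print(np.shares_memory(a, row))   # checks whether both use the same buffer
```

With a view, the write to row is visible through a, which is what I
would expect the masked version to do as well.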

Having subviews of the mask seems complicated when the mask is
nomask. What happens if the view sets a new masked value and its mask
therefore changes from nomask to a boolean array? How does the parent
mask get updated? I think the numpy implementation gets away with this
by returning a view of only the _data part if the ma mask is nomask. I
don't like this solution, as I would expect an ma to be returned. Also,
I suspect that if the ma is to be a view of another ma, then __new__
cannot convert a mask that is a boolean array of all False to nomask,
since that would break the link to the parent's mask.

I like the new implementation of maskedarray, especially the focus on
simplicity. The only simple solution I see is to have the mask be a
boolean array at all times. Or, at the least, as soon as a view of an
ma is required and its mask is nomask, the mask should be converted to
a boolean array, so that the returned ma holds a view of both the
boolean mask and the data.
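
To make the suggestion concrete, here is a rough sketch of the
"always a boolean mask" idea. ToyMasked and mask_at are hypothetical
names made up for this example. They are not part of numpy.ma or of
the new maskedarray code, and the class ignores everything a real
implementation has to handle (fill values, subclassing ndarray, fancy
indexing, and so on).

```python
import numpy as np


class ToyMasked:
    """Hypothetical toy container: data plus an always-boolean mask."""

    def __init__(self, data, mask=None):
        self.data = np.asarray(data)
        if mask is None:
            # Never store a nomask sentinel; allocate a full boolean mask.
            mask = np.zeros(self.data.shape, dtype=bool)
        # asarray does not copy an existing boolean array, so views survive.
        self.mask = np.asarray(mask, dtype=bool)

    def __getitem__(self, index):
        # Basic indexing on ndarrays returns views, so the child shares
        # both the data buffer and the mask buffer with its parent.
        return ToyMasked(self.data[index], self.mask[index])

    def mask_at(self, index):
        # Mark an element as masked; visible to every view of this mask.
        self.mask[index] = True


parent = ToyMasked([[1, 2, 3, 4, 5]] * 3)
child = parent[1]

child.data[1] = 20   # write through the data view
child.mask_at(2)     # mask an element through the mask view

print(parent.data)
print(parent.mask)
```

Because the mask is a real array from the start, masking an element in
the child needs no special handling to reach the parent. The cost is
the memory for a full boolean mask even when nothing is masked.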