[Numpy-discussion] Trouble With MaskedArray and Shared Masks

Wed Feb 27 10:54:35 EST 2008

Alexander,

> create the MaskedArray to:
> >>> a = numpy.ma.MaskedArray(
> ...     data=numpy.zeros((4,5), dtype=float),
> ...     mask=True,
> ...     fill_value=0.0
> ... )

By far the easiest indeed.
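
For the record, a quick check of what that constructor call gives you. This is only a sketch: the variable names are mine, and I use the public ma.getmaskarray and count() rather than poking at private attributes.

```python
import numpy as np
import numpy.ma as ma

# mask=True is expanded to a full boolean mask with the shape of the data.
a = ma.MaskedArray(
    data=np.zeros((4, 5), dtype=float),
    mask=True,
    fill_value=0.0,
)

all_masked = bool(ma.getmaskarray(a).all())  # every entry starts out masked
n_valid_before = int(a.count())              # number of unmasked entries

# With the default (soft) mask, assigning a value unmasks that entry.
a[1, 2] = 3.5
n_valid_after = int(a.count())
filled = a.filled()                          # masked entries replaced by fill_value
```

Entries that are never assigned stay masked and come out as fill_value (0.0 here) in filled().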

> >  So: should we introduce this extra parameter ?
>
> The propagation semantics and mechanics are definitely tricky,
> especially considering that it seems that the "right behavior" is
> context dependent. Are the mask propagation rules spelled out anywhere
> (aside from the code! :-))? 

Mmh, no: we tried to avoid mask propagation as much as possible, as it can 
have some fairly disastrous side effects. In other words: by default, there is 
no propagation when a mask is shared, and propagation when it is not shared.

> I could see some potential value to an 
> additional argument, but the constructor is already quite complicated
> so I'm reluctant to say "Yes" outright, especially with my current
> level of understanding. 

Yes, there are already a lot of parameters, some more useful than others:
hard_mask: if True, prevents a masked value from being accidentally unmasked.
shrink: if True, forces a mask full of False to nomask.
keep_mask: when creating a new masked_array from an existing one, specifies 
whether the old mask should be taken into account or not. By default, 
keep_mask is True.
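
A minimal sketch of what hard_mask does in practice, assuming a reasonably recent numpy (the names soft and hard are just for illustration):

```python
import numpy.ma as ma

# Same data and mask, with and without a hard mask.
soft = ma.array([1, 2, 3], mask=[1, 0, 0])
hard = ma.array([1, 2, 3], mask=[1, 0, 0], hard_mask=True)

# Assigning to a masked entry unmasks it when the mask is soft...
soft[0] = 10
# ...but the assignment is ignored when the mask is hard.
hard[0] = 10

soft_mask_after = ma.getmaskarray(soft).tolist()
hard_mask_after = ma.getmaskarray(hard).tolist()
hard_data_after = hard.data.tolist()  # underlying data left untouched
```

With the hard mask, the first entry stays masked and the underlying data is left alone; with the soft mask, it becomes a regular value of 10. keep_mask is the subtler of the three.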

For example:
>>>import numpy.ma as ma
>>>x=ma.array([1,2,3,4,5],mask=[1,0,0,1,0])
>>>y=ma.array(x)
>>>y
masked_array(data = [-- 2 3 -- 5],
      mask = [ True False False  True False],
      fill_value=999999)

We just inherited the mask from x: y._mask and x._mask are the same object, 
and y._sharedmask is True. Now, let's change keep_mask to False:

>>>y=ma.array(x,keep_mask=False)
>>>y
masked_array(data = [1 2 3 4 5],
      mask = False,
      fill_value=999999)
We keep the data from x, but we force the mask to the default (viz., nomask).
Now for some more fun: remember that we keep the mask by default.

>>>y=ma.array(x,mask=[0,0,0,0,1])
>>>y
masked_array(data = [-- 2 3 -- --],
      mask = [ True False False  True  True],
      fill_value=999999)

We kept the mask of x ([1,0,0,1,0]) and combined it with our new mask 
([0,0,0,0,1]), so y._mask=[1,0,0,1,1].
If you really want [0,0,0,0,1] as a mask, just drop the initial mask:
>>>y=ma.array(x,mask=[0,0,0,0,1], keep_mask=False)
>>>y
masked_array(data = [1 2 3 4 --],
      mask = [False False False False  True],
      fill_value=999999)
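
The three cases above can be put side by side as a small script. This is a sketch that sticks to the public ma.getmaskarray instead of the private _mask attribute; the names combined, replaced and dropped are mine.

```python
import numpy as np
import numpy.ma as ma

x = ma.array([1, 2, 3, 4, 5], mask=[1, 0, 0, 1, 0])
new_mask = np.array([0, 0, 0, 0, 1], dtype=bool)

# keep_mask=True (the default): the old and new masks are OR-ed together.
combined = ma.array(x, mask=new_mask)
# keep_mask=False with a new mask: only the new mask is used.
replaced = ma.array(x, mask=new_mask, keep_mask=False)
# keep_mask=False without a new mask: nothing is masked.
dropped = ma.array(x, keep_mask=False)

combined_mask = ma.getmaskarray(combined).tolist()
replaced_mask = ma.getmaskarray(replaced).tolist()
dropped_mask = ma.getmaskarray(dropped).tolist()

# The combined mask is the element-wise OR of the two inputs.
expected_or = np.logical_or(ma.getmaskarray(x), new_mask).tolist()
```

In all three cases the mask of x itself is left as it was.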

> At the very least, perhaps the doc-string 
> should be amended to include the note that if a mask is provided, it
> is assumed to be shared and a copy of it will be made when/if it is
> modified.
Sounds like a good idea. Is there a wiki page for MaskedArrays somewhere? If 
not, Alexander, feel free to start one from your experience; I'll update it if 
needed.