[Numpy-discussion] read-only or immutable masked array

Pierre Gerard-Marchant pgmdevlist at gmail.com
Mon Jul 15 11:25:17 EDT 2013


On Jul 15, 2013, at 14:40 , Gregorio Bastardo <gregorio.bastardo at gmail.com> wrote:

> Hi Pierre,
> 
>> Note as well that hardening the mask only prevents unmasking: you can still grow the mask, which may not be what you want. Use `x.mask.flags.writeable=False` to make the mask really read-only.
> 
> I ran into an unmasking problem with the suggested approach:
> 
>>>> np.version.version
> '1.7.0'
>>>> x = np.ma.masked_array(xrange(4), [0,1,0,1])
>>>> x
> masked_array(data = [0 -- 2 --],
>             mask = [False  True False  True],
>       fill_value = 999999)
>>>> x.flags.writeable = False
>>>> x.mask.flags.writeable = False
>>>> x.mask[1] = 0 # ok
> Traceback (most recent call last):
>  ...
> ValueError: assignment destination is read-only
>>>> x[1] = 0 # ok
> Traceback (most recent call last):
>  ...
> ValueError: assignment destination is read-only
>>>> x.mask[1] = 0 # ??
>>>> x
> masked_array(data = [0 1 2 --],
>             mask = [False False False  True],
>       fill_value = 999999)

Ouch…
Quick workaround: use `x.harden_mask()` *then* `x.mask.flags.writeable=False`
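
A minimal sketch of that ordering, for illustration only. One assumption to flag loudly: depending on the NumPy version, `x.mask` may hand back a view, in which case the `writeable` flag set through it may not stick. The sketch therefore only relies on the hard mask, and tolerates a `ValueError` in case the mask really is read-only.

```python
import numpy as np

x = np.ma.masked_array(np.arange(4), mask=[0, 1, 0, 1])

# Step 1: harden the mask, so that setting an item cannot unmask an entry.
x.harden_mask()

# Step 2: only then flag the mask as read-only. Depending on the NumPy
# version, `x.mask` may be a view and the flag may not persist.
x.mask.flags.writeable = False

try:
    x[1] = 0  # attempt to overwrite (and thereby unmask) a masked entry
except ValueError:
    # Some versions may raise here because the mask is read-only.
    pass

still_masked = bool(x.mask[1])
print(still_masked)
```

Whether the assignment is silently ignored or raises is version-dependent; in both cases the entry should stay masked.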

[Longer explanation]
> I noticed that "sharedmask" attribute changes (from True to False)
> after "x[1] = 0".

Indeed, indeed… When setting items, the mask is unshared to limit some issues (like propagation to the other masked_arrays sharing the mask). Unsharing the mask involves a copy, which unfortunately doesn't copy the flags. In other words, when you try `x[1]=0`, the mask becomes writeable again. That hurts…
But! This call to `unshare_mask` is performed only when the mask is 'soft', hence the quick workaround…
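
The "copy doesn't copy the flags" part can be seen on a plain boolean array, without np.ma at all. This is only a sketch of the underlying ndarray behaviour, not of what `unshare_mask` does internally:

```python
import numpy as np

mask = np.array([False, True, False, True])
mask.flags.writeable = False

# A copy owns fresh memory and comes back writeable: the read-only
# flag of the original is not carried over.
mask_copy = mask.copy()

print(mask.flags.writeable)       # False
print(mask_copy.flags.writeable)  # True
```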

Note to self (or whoever fixes the issue before I can do it):
* We could make sure that copying a mask copies some of its flags too (like the `writeable` one, which other ones?)
* The call to `unshare_mask` is made *before* we try to call `__setitem__` on the `_data` part. That's silly: if we called `__setitem__(_data,index,dval)` first, the `ValueError: assignment destination is read-only` would be raised before the mask could get unshared… TL;DR: move L3073 of np.ma.core to L3068
* There should be a simpler way to make a masked_array read-only; this little dance gets tiring quickly.
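
Until something like that exists, the dance can at least be wrapped in one place. A rough sketch: `freeze_masked_array` and `assignment_fails` are hypothetical helpers, not part of np.ma, and the sketch assumes that `np.ma.getmask` returns the underlying mask object (so that the flag is set on the mask itself rather than on a temporary view).

```python
import numpy as np

def freeze_masked_array(x):
    """Hypothetical helper: make data and mask of `x` read-only, in place."""
    x.harden_mask()
    x.flags.writeable = False
    mask = np.ma.getmask(x)  # assumed to return the underlying mask object
    if mask is not np.ma.nomask:
        mask.flags.writeable = False
    return x

def assignment_fails(target, index, value):
    """Return True if `target[index] = value` raises ValueError."""
    try:
        target[index] = value
    except ValueError:
        return True
    return False

x = freeze_masked_array(np.ma.masked_array(np.arange(4), mask=[0, 1, 0, 1]))
print(assignment_fails(x, 0, 10))          # setting an item of the array
print(assignment_fails(x.mask, 1, False))  # unmasking through the mask
```

Note that arrays derived from `x` (copies, results of operations) are not covered by this and may well be writeable.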





> Also, some of the ma operations return a new ma whose mask is
> identical to the original one, which causes a ValueError when the
> new ma's mask is modified:
> 
>>>> x = np.ma.masked_array(xrange(4), [0,1,0,1])
>>>> x.flags.writeable = False
>>>> x.mask.flags.writeable = False
>>>> x1 = x > 0
>>>> x1.mask is x.mask # ok
> False
>>>> x2 = x != 0
>>>> x2.mask is x.mask # ??
> True
>>>> x2.mask[1] = 0
> Traceback (most recent call last):
>  ...
> ValueError: assignment destination is read-only
> 
> which is a bit confusing.

Ouch again. 
[TL;DR] No workaround, sorry
[Long version]
The inconsistency comes from the fact that '!=' or '==' call the `__ne__` or `__eq__` methods while other comparison operators call their own function. In the first case, because we're comparing with a non-masked scalar, no copy of the mask is made; in the second case, a copy is systematically made. As pointed out earlier, copies of a mask don't preserve its flags…
[Note to self]
* Define a factory for __lt__/__le__/__gt__/__ge__ based on __eq__: MaskedArray.__eq__ and __ne__ already have almost the same code… (but what about filling? Is it an issue?)
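
To check which comparisons hand back the input's mask on a given NumPy version, something along these lines may help. `shares_mask` is a hypothetical helper; it assumes `np.ma.getmask` returns the underlying mask object, which makes an identity test meaningful even if the `.mask` attribute returns a fresh view on each access. The outcome is version-dependent, so no expected output is given.

```python
import numpy as np

def shares_mask(result, source):
    """Return True if `result` uses the very same mask object as `source`."""
    # np.ma.getmask is assumed to return the underlying mask itself,
    # whereas the `.mask` attribute may return a fresh view on each access.
    result_mask = np.ma.getmask(result)
    return result_mask is not np.ma.nomask and result_mask is np.ma.getmask(source)

x = np.ma.masked_array(np.arange(4), mask=[0, 1, 0, 1])

report = {
    "x > 0": shares_mask(x > 0, x),
    "x != 0": shares_mask(x != 0, x),
}
for expression, shared in report.items():
    print(expression, "shares its mask with x:", shared)
```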



> And I experienced that *_like operations
> give mask identity too:
> 
>>>> y = np.ones_like(x)
>>>> y.mask is x.mask
> True

This may change in the future, depending on a yet-to-be-achieved consensus on the definition of 'least-surprising behaviour'. Right now, the *_like functions return an array that shares the mask with the input, as you've noticed. Some people have complained about it; what's your take on that?
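
In the meantime, if a shared mask is not what you want, one option is to build the result by hand with an explicit copy of the mask. A small sketch of that idea, not a statement about what the *_like functions should do:

```python
import numpy as np

x = np.ma.masked_array(np.arange(4), mask=[0, 1, 0, 1])

# Build the "ones" array from the plain data, and give it its own copy
# of the mask so that nothing is shared with x.
y = np.ma.masked_array(np.ones_like(x.data), mask=np.ma.getmaskarray(x).copy())

y[1] = 5  # unmasks entry 1 of y (soft mask) ...
print(y.mask.tolist())
print(x.mask.tolist())  # ... while the mask of x is left untouched
```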

> I might be missing something but could you clarify these issues?

You were not missing anything. np.ma isn't the most straightforward module: plenty of corner cases, and the implementation is pretty naive at times (but hey, it works). My only advice is to never lose hope.




