[Numpy-discussion] Masked arrays: Rationale for "False convention"
ondrej.certik at gmail.com
Tue Oct 1 13:23:43 EDT 2013
On Tue, Oct 1, 2013 at 4:29 AM, Robert Kern <robert.kern at gmail.com> wrote:
> On Tue, Oct 1, 2013 at 3:57 AM, Ondřej Čertík <ondrej.certik at gmail.com> wrote:
>> I see, that makes sense. So to remember this, the rule is:
>> "Specify elements that you want to get masked using True in 'mask'".
> Yes. This convention dates back at least to the original MA package in
> Numeric; I don't know if Paul Dubois stole it from any previous software.
I see, thanks.
> One way to motivate the convention is to think about doing a binary
> operation on masked arrays, which is really the most common kind of thing
> one does with masked arrays. The mask of the result is the logical OR of the
> two operand masks (barring additional masked elements from domain
> violations, 0/0, etc.).
In the other convention, you just use logical AND, so that seems equally
simple, unless I am missing something.
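To make the comparison concrete, here is a minimal sketch. The arrays a and b
and the names valid_a, valid_b, valid_c are made up for illustration; they are
not from anyone's message. It checks that the mask of a sum is the OR of the
operand masks, and that the same information in the inverted ("True means
valid") convention is the AND of the validity flags, which is just De Morgan's
law:

```python
import numpy as np
from numpy import ma

# Two illustrative masked arrays; True in `mask` means "this element is masked".
a = ma.array([1, 2, 3, 4], mask=[False, False, True, False])
b = ma.array([10, 20, 30, 40], mask=[True, False, False, False])

c = a + b

# getmaskarray always returns a full boolean array, even if no mask was set.
result_mask = ma.getmaskarray(c)
expected_or = ma.getmaskarray(a) | ma.getmaskarray(b)

# The inverted convention: True means "this element is valid".
valid_a = ~ma.getmaskarray(a)
valid_b = ~ma.getmaskarray(b)
valid_c = valid_a & valid_b

print(result_mask.tolist())  # [True, False, True, False]
print(valid_c.tolist())      # [False, True, False, True]
```

So for plain binary operations the two conventions cost the same: one
elementwise logical operation either way.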
> I assume that the convention was decided mostly on
> what was most convenient and efficient for the common internal operations
> for *implementing* the masked arrays and not necessarily matching any
> particular intuitions when putting data *into* the masked arrays.
That makes sense.
On Mon, Sep 30, 2013 at 9:05 PM, Eric Firing <efiring at hawaii.edu> wrote:
> On 2013/09/30 4:57 PM, Ondřej Čertík wrote:
>> But why do I need to invert the mask when I want to see the valid elements:
>> In : from numpy import ma
>> In : a = ma.array([1, 2, 3, 4], mask=[False, False, True, False])
>> In : a
>> masked_array(data = [1 2 -- 4],
>> mask = [False False True False],
>> fill_value = 999999)
>> In : a[~a.mask]
>> masked_array(data = [1 2 4],
>> mask = [False False False],
>> fill_value = 999999)
>> I would find it natural to write this as a[a.mask]. This is when it gets confusing.
> There is no getting around it; each of the two possible conventions has
> its advantages. But try this instead:
> In : a = ma.array([1, 2, 3, 4], mask=[False, False, True, False])
> In : a.compressed()
> Out: array([1, 2, 4])
> I do occasionally need a "goodmask" which is the inverse of a.mask, but
> not very often; and when I do, needing to invert a.mask doesn't bother me.
a.compressed() works for getting data out, but I also use indexing with the
inverted mask to assign data in:
a[~a.mask] = 1
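A small sketch of what I mean, reusing the array from the example above. The
name good is only an illustrative "goodmask"; I use ma.getmaskarray here on the
assumption that a full boolean array is wanted even when no mask was set:

```python
import numpy as np
from numpy import ma

a = ma.array([1, 2, 3, 4], mask=[False, False, True, False])

# Hypothetical "goodmask": True where the element is valid (not masked).
good = ~ma.getmaskarray(a)

# Assign only into the valid elements; the masked element is left alone.
a[good] = 1

print(a.data.tolist())              # [1, 1, 3, 1]
print(ma.getmaskarray(a).tolist())  # [False, False, True, False]
print(a.compressed().tolist())      # [1, 1, 1]
```

The underlying value at the masked position (3) is untouched and stays masked,
while a.compressed() only ever gives a copy of the valid data, so it cannot be
used for assignment.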
Thanks everybody for the discussion. It sheds some light onto the current