Re: [Numpy-discussion] new MaskedArray class

June 24, 2019

Hi Allan,
...
The alternative solution in my model would be to replace `np.dot` with a
masked-specific implementation of what `np.dot` is supposed to stand for
(in your simple example, `np.add.reduce(np.multiply(m, m))` - more
generally, add relevant `outer` and `axes`). This would be similar to
what I think all implementations do for `.mean()`: we cannot calculate
that from the data using any fill value or skipping, so instead we use the
more easily handled `.sum()` and divide by the number of unmasked
elements. But in both examples the disadvantage is that we take away the
option to use the underlying class's `.dot()` or `.mean()`
implementations.
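
To make the `.mean()` case concrete, here is a minimal sketch. It assumes a masked array can be modelled as a plain data array plus a boolean mask (True meaning masked); `masked_sum` and `masked_mean` are hypothetical helper names, not part of any proposed class.

```python
import numpy as np

# Hypothetical stand-in for a masked array: data plus a boolean mask
# (True marks a masked element).
data = np.array([1.0, 2.0, 3.0, 4.0])
mask = np.array([False, True, False, False])

def masked_sum(data, mask, axis=None):
    # Skip masked elements: only unmasked ones contribute to the reduction.
    return np.add.reduce(data, axis=axis, where=~mask)

def masked_mean(data, mask, axis=None):
    # No single fill value makes a plain mean correct, so divide the
    # masked sum by the number of unmasked elements instead.
    count = np.sum(~mask, axis=axis)
    return masked_sum(data, mask, axis=axis) / count

mean_skipping = masked_mean(data, mask)              # (1 + 3 + 4) / 3
mean_zero_filled = np.where(mask, 0.0, data).mean()  # (1 + 0 + 3 + 4) / 4
```

Filling with 0 is harmless for the sum but gives the wrong count for the mean, which is why the mean is built from the sum rather than from the underlying class's `.mean()`.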

Just to note, my current implementation uses the IGNORE style of mask,
so does not propagate the mask in np.dot:

    >>> a = MaskedArray([[1,1,1], [1,X,1], [1,1,1]])
    >>> np.dot(a, a)
    MaskedArray([[3, 2, 3],
                 [2, 2, 2],
                 [3, 2, 3]])

I'm not at all set on that behavior and we can do something else. For
now, I chose this way since it seemed to best match the "IGNORE" mask
behavior.

It is a nice example, I think. In terms of action on the data, one would
get this result as well in my pseudo-representation of
`np.add.reduce(np.multiply(m, m))`, as long as the multiply is taken as an
outer product along the relevant axes (which does not ignore the mask,
i.e., if either element is masked, the product is too), followed by a
sum that works like other reductions and skips masked elements.
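
The multiply-then-reduce reading of the 3x3 example can be sketched as follows. This is only an illustration with plain arrays and an explicit boolean mask; `masked_matmul` is a hypothetical helper, and the value stored under the mask (here 0) is an arbitrary placeholder that never enters the sum.

```python
import numpy as np

X = 0  # placeholder stored under the mask; it never contributes
data = np.array([[1, 1, 1], [1, X, 1], [1, 1, 1]])
mask = np.zeros((3, 3), dtype=bool)
mask[1, 1] = True  # the single masked element

def masked_matmul(a, mask_a, b, mask_b):
    # Outer-style multiply over the shared axis k:
    # prod[i, k, j] = a[i, k] * b[k, j]
    prod = a[:, :, np.newaxis] * b[np.newaxis, :, :]
    # The multiply propagates the mask: masked if either factor is.
    prod_mask = mask_a[:, :, np.newaxis] | mask_b[np.newaxis, :, :]
    # The sum over k skips masked products, like other reductions.
    result = np.add.reduce(prod, axis=1, where=~prod_mask)
    # An output element is masked only if every product feeding it was.
    result_mask = prod_mask.all(axis=1)
    return result, result_mask

result, result_mask = masked_matmul(data, mask, data, mask)
```

For this input, `result` matches the matrix quoted above and no output element ends up masked, which is the same data one gets from `np.dot` on the zero-filled array.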
...
From the FFT array multiplication analogy, though, it is not clear that
effectively replacing masked elements by 0 is reasonable.
Equivalently, thinking of `np.dot` in its 1-D form as giving the length
of the projection of one vector along another, it is not clear what a
single masked element is supposed to mean. In a way, masking just one
element of a vector or of a matrix makes vector or matrix operations
meaningless.

I thought fitting data with a mask might give a counterexample, but there
one usually calculates at some point r = y - A x, so there is no masking of
the matrix; the subtraction y - A x passes on the mask, and summing r while
ignoring masked elements does just the right thing.
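
A small sketch of the fitting case, using the existing `numpy.ma` as a stand-in for whatever masked class is eventually adopted. The straight-line model, the parameter values, and the use of squared residuals are assumptions made for illustration.

```python
import numpy as np

# Design matrix for a straight line y = x0 + x1 * t; A itself is never masked.
t = np.array([0.0, 1.0, 2.0, 3.0])
A = np.stack([np.ones_like(t), t], axis=1)
x = np.array([1.0, 2.0])  # illustrative parameters, so A @ x = [1, 3, 5, 7]

# Measurements with one bad point, masked in y only.
y = np.ma.masked_array([1.5, 3.0, 99.0, 6.0],
                       mask=[False, False, True, False])

r = y - A @ x          # the subtraction passes the mask of y on to r
chi2 = (r ** 2).sum()  # the reduction skips the masked residual
```

Here the bad measurement never enters `chi2`, and no matrix operation ever sees a masked element.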

All the best,

Marten

Marten van Kerkwijk