
Hi Tom,
I think a sensible alternative mental model for the MaskedArray class is that all it does is forward any operations to the data it holds and separately propagate a mask.
I'm generally on board with that mental picture, and agree that the use case described by Ben (different layers of satellite imagery) is important. The same thing happens with astronomy data, e.g. you have a CCD image of the sky and there are cosmic rays that contaminate the image. Those are not garbage data, just pixels that one wants to ignore in some, but not all, contexts.
However, it's worth noting that one cannot blindly forward any operations to the data it holds since the operation may be illegal on that data. The simplest example is dividing `a / b` where `b` has data values of 0 but they are masked. That operation should succeed with no exception, and here the resultant value under the mask is genuinely garbage.
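To make the division case concrete, here is a minimal sketch using `numpy.ma`. The array values are made up for illustration, and what ends up under the mask depends on the NumPy version, so the sketch prints it rather than asserting it.

```python
import numpy as np

# Illustrative values only: b holds a 0 that is hidden by the mask.
a = np.ma.MaskedArray([1.0, 2.0, 3.0], mask=[False, True, False])
b = np.ma.MaskedArray([2.0, 0.0, 4.0], mask=[False, True, False])

# The division goes through without an exception.
result = a / b

print(result)       # masked view of the result
print(result.mask)  # the mask is propagated to the output
print(result.data)  # the entry under the mask is whatever the implementation left there
```

Only the unmasked entries of `result.data` carry a meaningful value here.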
Even in the present implementation, the operation is just forwarded, with numpy errstate set to ignore all errors. And then after the fact some half-hearted remediation is done.
The current MaskedArray seems a bit inconsistent in dealing with invalid calculations. Dividing by 0 (if masked) is no problem and returns the numerator. Taking the log of a masked 0 gives the usual divide by zero RuntimeWarning and puts a 1.0 under the mask of the output.
Perhaps the expression should not even be evaluated on elements where the output mask is True, and all the masked output data values should be set to a predictable value (e.g. zero for numerical, zero-length string for string, or maybe a default fill value). That at least provides consistent and predictable behavior that is simple to explain. Otherwise the story is that the data under the mask *might* be OK, unless for a particular element the computation was invalid in which case it is filled with some arbitrary value. I think that is actually an error-prone behavior that should be avoided.
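One way to sketch the "do not evaluate under the mask" idea is with the `where=` and `out=` arguments that NumPy ufuncs accept. The helper name `masked_divide` and the choice of 0.0 as the fill are assumptions for illustration; this is not how `numpy.ma` is implemented.

```python
import numpy as np

def masked_divide(a_data, b_data, mask, fill=0.0):
    """Hypothetical helper: divide only where `mask` is False.

    Masked positions are never computed and keep the predictable `fill` value.
    """
    shape = np.broadcast(a_data, b_data).shape
    out = np.full(shape, fill, dtype=float)
    # `where=~mask` skips the masked elements; `out` keeps `fill` there.
    np.divide(a_data, b_data, out=out, where=~mask)
    return out

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 0.0, 4.0])
mask = np.array([False, True, False])

result = masked_divide(a, b, mask)
print(result)
```

With this approach the division by the masked 0 is never carried out, so the data under the mask is always the fill value rather than the leftover of an invalid computation.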
I think I agree with Allan here, that after a computation, one generally simply cannot safely assume anything for masked elements. But it is reasonable for subclasses to define what they want to do "post-operation"; e.g., for numerical arrays, it might generally make sense to do
```
notok = ~np.isfinite(result)
mask |= notok
```
and one could then also do
```
result[notok] = fill_value
```
But I think one might want to leave that to the user.
All the best,
Marten