[Numpy-discussion] MaskedArray.__array__ bug? (was 'A reimplementation of MaskedArray')

Pierre GM pgmdevlist at gmail.com
Mon Nov 20 00:31:47 EST 2006


> Is there consensus that in __array__ it is incorrect to return the
> _data component of a MaskedArray when there are masked values? It
> certainly worries me that what is underneath the mask is returned.

When is __array__ really used in practice? Most of the time, isn't the
conversion to ndarray done with numpy.core.numeric.asarray? Which doesn't
seem to call the __array__ method...
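
One way to check this is to record whether asarray reaches __array__ on a plain object. This is a minimal sketch against a current NumPy install; the 2006 code under discussion may behave differently, and an ndarray subclass may take another path than the plain object shown here. Wrapper and calls are hypothetical names used only for illustration.

```python
import numpy as np


class Wrapper:
    """Hypothetical container (not an ndarray subclass) that defines __array__."""

    def __init__(self, values):
        self.values = np.asarray(values)
        self.calls = 0  # how many times __array__ has been invoked

    def __array__(self, dtype=None, copy=None):
        # Record the conversion request, then hand back the stored data.
        self.calls += 1
        if dtype is None:
            return self.values
        return self.values.astype(dtype)


w = Wrapper([1, 2, 3])
out = np.asarray(w)
print(type(out).__name__, out, "__array__ calls:", w.calls)
```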

> Alternatively, there could 
> be subclass of masked array for uses in which the masked data
> represents missing data (hence data under the mask is spurious and
> should not be exposed). In this subclass the __array__ method could be
> redefined to impose stricter control of the masked data.

Sounds like a good idea, but we're back to question #1.

I think that one issue is that N.asarray, N.array, etc. access the _data part
directly, when in fact they should ideally access the filled version. A
workaround is to ensure that your _data has some 'well-defined' missing data,
by filling it beforehand with something like nan (which won't work if you
work with ints, though...). But that gets cumbersome.
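
The difference between the raw data and the filled version, and the pre-filling workaround, can be sketched as follows. This uses the present-day numpy.ma module as a stand-in, which is an assumption: it is not necessarily the implementation discussed in this thread. The value 999.0 is an arbitrary placeholder for whatever happens to sit under the mask.

```python
import numpy as np
import numpy.ma as ma

# Float data with one masked entry; 999.0 is the spurious value under the mask.
x = ma.array([1.0, 999.0, 3.0], mask=[False, True, False])

# Plain conversion to ndarray exposes the data under the mask.
raw = np.asarray(x)

# filled() returns an ndarray in which masked entries are replaced.
safe = x.filled(np.nan)

# Workaround: rebuild the masked array on pre-filled data, so that even a
# conversion that reads the underlying data only ever sees nan.
y = ma.array(x.filled(np.nan), mask=ma.getmaskarray(x))
prefilled = np.asarray(y)

print(raw)        # underlying data, mask ignored
print(safe)       # masked entry replaced by nan
print(prefilled)  # underlying data after pre-filling

# Note: nan is a floating-point value, so this does not carry over to
# integer dtypes; a sentinel integer would have to be chosen instead.
```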



