Thank you for provoking me to think about these issues in MA. Here is the conclusion I have reached. Please let me know what you think of it.

Background:

Michael wanted a way to use a masked array as a Numeric array, but with assurance that no element was in fact masked, and without obscure tests such as count(x) == product(x.shape).

The method __array__(self, typecode=None) is a special (existing) hook for conversion to a Numeric array. Many operations in Numeric, when presented with an object x to be operated upon, such as Numeric.sqrt(x), will call x.__array__ as a final act of desperation in an attempt to convert their argument to a Numeric array. Heretofore it was essentially returning x.filled(). This bothered me, because it was a silent conversion that replaced masked values with the fill value.

Solution:

a. Add a method 'unmask()' which will replace the mask by None if possible. It will not fail.
b. Change MaskedArray.__array__ to work as follows:
   1. Call self.unmask(), and then
   2. Return the raw data if the mask is now None. Otherwise, raise an MAError.

Example usage:

    >>> from MA import *
    >>> x=arange(10)
    >>> Numeric.array(x)
    [0,1,2,3,4,5,6,7,8,9,]
    >>> x[3]=masked
    >>> Numeric.array(x)
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
      File "/pcmdi/dubois/linux/lib/python2.1/site-packages/MA/MA.py", line 578, in __array__
        raise MAError, \
    MA.MA.MAError: Cannot convert masked array to Numeric because data is masked in one or more locations.

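For anyone who wants to see the two-step logic spelled out, here is a rough sketch of the proposed behavior. It is not the MA source: ToyMaskedArray and MaskedConversionError are hypothetical names invented for illustration, plain Python lists stand in for Numeric arrays, and typecode handling is left out. Because no Numeric is involved, the __array__ hook is called by hand here, where Numeric.array(x) would normally call it.

```python
class MaskedConversionError(Exception):
    """Hypothetical stand-in for MAError."""


class ToyMaskedArray:
    """Toy model: a list of raw values plus an optional list of mask flags.

    mask is None means nothing is masked; otherwise True marks a masked element.
    """

    def __init__(self, data, mask=None):
        self.data = list(data)
        self.mask = None if mask is None else [bool(m) for m in mask]

    def unmask(self):
        """Replace the mask by None if no element is masked. Never fails."""
        if self.mask is not None and not any(self.mask):
            self.mask = None

    def __array__(self, typecode=None):
        """Conversion hook: return the raw data, or raise if anything is masked."""
        self.unmask()
        if self.mask is not None:
            raise MaskedConversionError(
                "Cannot convert masked array because data is masked "
                "in one or more locations."
            )
        # The raw data is returned without copying; typecode is ignored in this toy.
        return self.data


# A mask that is present but all False: conversion succeeds and drops the mask.
x = ToyMaskedArray(range(10), mask=[False] * 10)
raw = x.__array__()

# Mask element 3: conversion is now refused instead of silently filling.
x.mask = [i == 3 for i in range(10)]
try:
    x.__array__()
except MaskedConversionError as err:
    print("conversion refused:", err)
```

Note that the first call leaves x.mask set to None, so a later unmask() on the same object has nothing to scan. That is the internal side effect discussed under the deficiency.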
Merits of this solution:

a. It reads like what it is -- there is no doubt you are converting to a Numeric array when you see Numeric.array.
b. It gives you the full range of options in Numeric.array, such as messing with the typecode.
c. It allows Numeric operations for speed on masked arrays that you know to be masked in name only. No copy of data occurs here unless the typecode needs to be changed.
d. It removes the possibility of a 'dishonest' conversion.
e. No new method or function is required, other than the otherwise-useful unmask().
f. Successive conversions are optimized because once the mask is None, unmask is cheap.

Deficiency:

__array__ becomes a query with an internal, albeit safe, side-effect. Mitigating this is that __array__ is not a "public" method and would not normally be used in assertions.