
Hi,

I've spent several days using the masked arrays that have been added
to NumPy recently. They're a great feature and they were just what I
needed for the little project I was working on (aside from a few bugs
that I found). However, there were a few things about MA that I found
inconvenient and/or counterintuitive, so I thought I'd post them to
the list while they're fresh in my mind. I'm using Numeric-20.0.0b2.

1. I couldn't find a simple way to tell whether all of the cells of a
masked array are unmasked. There are times when you fill an array
incrementally and want to convert it to a Numeric array, but first
want to make sure that all of the elements have been set.
"m.filled()" is a bit dangerous (in my opinion) because it silently
fills. The shortest idiom I could think of is

    >>> assert not logical_or.reduce(ravel(MA.getmaskarray(m)))

which isn't very short :-) and is also awkward because it creates a
mask array even if m.mask() is None. How about an m.is_unmasked()
method, or even giving a special meaning to "m.filled(masked)",
namely that it raises an exception if any cells are still masked?
(As an optimization, this method could set m.__mask = None to speed
up future checks.)

2. I can't reproduce this problem now, but I could swear that the
MaskedArray.__str__() method sometimes printed "typecode='O'" if
masked.enabled() is true. This would be a byproduct of using
Numeric's __str__() method to print the array, at least under the
unknown circumstances in which Numeric.__str__() prints the typecode.
This confused me for a while.

3. I found the semantics of MA.compress(condition,a,axis=0) to be
inconvenient and inconsistent with those of Numeric.compress.
MA.compress() squeezes out not only those elements for which
condition is false, but also those elements that are masked. This
differs from the behavior of Numeric.compress, which always returns
an array whose "axis" dimension equals the number of nonzero elements
of "condition".
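
To make the difference in item 3 concrete, here is a rough 1-d sketch in plain Python. It does not call MA or Numeric at all: lists stand in for the data and the mask (1 meaning "masked"), and the function names plain_compress and mask_squeezing_compress are made up for illustration. It only models the two behaviours as described above.

```python
# Library-independent sketch: plain lists stand in for array data and mask.
# plain_compress and mask_squeezing_compress are hypothetical names, not MA API.

def plain_compress(condition, values):
    """Keep values[i] wherever condition[i] is true (Numeric.compress style, 1-d)."""
    return [v for c, v in zip(condition, values) if c]


def mask_squeezing_compress(condition, values, mask):
    """Also drop entries whose mask flag is set (the MA.compress behaviour described above)."""
    return [v for c, v, m in zip(condition, values, mask) if c and not m]


condition = [1, 0, 1, 1]
values = [10, 20, 30, 40]
mask = [0, 0, 1, 0]  # 1 means "masked"

kept_plain = plain_compress(condition, values)
kept_squeezed = mask_squeezing_compress(condition, values, mask)

# The plain version keeps one element per nonzero entry of condition;
# the mask-squeezing version may keep fewer, so its length depends on the mask.
print(len(kept_plain) == sum(condition))
print(len(kept_squeezed) == sum(condition))
```

With the plain version the caller can predict the result length from "condition" alone; with the mask-squeezing version the length also depends on where the masked cells happen to fall.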

The real problem, though, is that MA.compress can't be used on a
multidimensional array with a nontrivial mask, because squeezing out
the masked values is highly unlikely to result in a rectangular
matrix. It is nice to be able to squeeze masked values out of a 1-d
array, but not at the price of not being able to use compress on a
multidimensional array.

I suggest giving MA.compress() semantics closer to those of
Numeric.compress(), and adding an optional argument or a separate
method to cause masked elements to be omitted.

Thanks for a great package!

Yours,
Michael

--
Michael Haggerty
mhagger@alum.mit.edu