[Numpy-discussion] Thoughts on masked arrays
Paul F. Dubois
paul at pfdubois.com
Wed May 9 22:56:16 EDT 2001
Michael Haggerty wrote:
1. I couldn't find a simple way to tell if all of the cells of a
masked array are unmasked. There are times when you fill an array
incrementally and you want to convert it to a Numeric array but
first make sure that all of the elements have been set.
"m.filled()" is a bit dangerous (in my opinion) because it silently
fills. The shortest idiom I could think of is
>>> assert not logical_or.reduce(ravel(MA.getmaskarray(m)))
which isn't very short :-) and is also awkward because it creates a
mask array even if m.mask() is None. How about an m.is_unmasked()
method, or even giving a special meaning to "m.filled(masked)",
namely that it raises an exception if any cells are still masked?
(As an optimization, this method could set m.__mask = None to speed
up future checks.)
>>> from MA import *
So your test could be
if count(x) < product(x.shape): error...
make_mask(m, flag=1) will make a mask and return None if possible, that is,
when no element is masked. It also accepts an argument of None correctly.
So your test could be
if make_mask(m.mask(),flag=1) is not None:
You could also consider
if not Numeric.allclose(m.filled(0), m.filled(1)):
or
if m.mask() is not None and not Numeric.alltrue(Numeric.ravel(m.mask())):
Is that enough ways to do it? (TM) (:->
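For readers without the old MA module at hand, here is a minimal sketch of the count-based test written against numpy.ma, the later descendant of MA. The use of numpy.ma and the helper name require_unmasked are assumptions made for illustration; neither appears in this thread.

```python
import numpy as np


def require_unmasked(m):
    """Return the data of m as a plain ndarray, or raise if any cell is masked.

    Hypothetical helper: the name is illustrative, not part of MA or numpy.ma.
    """
    # count() is the number of unmasked cells; size is the total number of cells.
    missing = m.size - m.count()
    if missing:
        raise ValueError("%d cell(s) still masked" % missing)
    return np.ma.getdata(m)


full = np.ma.masked_array([1.0, 2.0, 3.0])
partial = np.ma.masked_array([1.0, 2.0, 3.0], mask=[False, True, False])

plain = require_unmasked(full)      # plain ndarray, nothing was filled in
try:
    require_unmasked(partial)
    partial_ok = True
except ValueError:
    partial_ok = False              # one cell is still masked
```

Unlike filled(), this never substitutes a fill value silently: it either hands back the data or fails loudly.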
I don't recommend using assert if the test is data-driven, since it won't
get executed with python -O. Instead use if...: raise ....
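The effect of -O can be seen without leaving the interpreter. The sketch below uses the built-in compile() with its optimize argument to mimic running with and without -O; the helper name run_snippet and the two snippet strings are made up for the demonstration.

```python
def run_snippet(source, optimize):
    """Compile and run source at the given optimization level.

    Returns the name of the exception raised, or None if it ran cleanly.
    """
    code = compile(source, "<demo>", "exec", optimize=optimize)
    try:
        exec(code, {})
    except Exception as exc:
        return type(exc).__name__
    return None


ASSERT_CHECK = "assert False, 'cell still masked'"
RAISE_CHECK = "if True:\n    raise ValueError('cell still masked')"

assert_normal = run_snippet(ASSERT_CHECK, optimize=0)    # 'AssertionError'
assert_optimized = run_snippet(ASSERT_CHECK, optimize=1) # None: the assert is stripped
raise_normal = run_snippet(RAISE_CHECK, optimize=0)      # 'ValueError'
raise_optimized = run_snippet(RAISE_CHECK, optimize=1)   # 'ValueError'
```

The explicit raise survives optimization; the assert silently disappears, which is why it is a poor fit for checks that depend on the data.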
I'm not against is_unmasked but I'm not sure how much it would get used and
I don't like the name. I hate query methods with side effects (if you use
them in an assert you change the program).
A method that replaces the mask with None if possible might make sense.
m.unmask()? m.demask()? m.debride() ?
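As a point of comparison, and not something that existed in MA at the time of this message, numpy.ma offers a method of this kind called shrink_mask(). The sketch below assumes numpy.ma, where the "no mask" marker is np.ma.nomask instead of None.

```python
import numpy as np

# An explicit all-False mask is stored as a full boolean array.
m = np.ma.masked_array([1.0, 2.0, 3.0], mask=[False, False, False])
had_full_mask = np.ma.getmask(m) is not np.ma.nomask

# shrink_mask() replaces the mask with nomask when no cell is masked.
m.shrink_mask()
now_nomask = np.ma.getmask(m) is np.ma.nomask

# With a cell actually masked, the mask has to stay.
p = np.ma.masked_array([1.0, 2.0, 3.0], mask=[False, True, False])
p.shrink_mask()
still_masked = np.ma.getmask(p) is not np.ma.nomask
```

Keeping the mutation in a separately named method means a pure query such as "is anything masked?" can stay free of side effects.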
2. I can't reproduce this problem now, but I could swear that the
MaskedArray.__str__() method sometimes printed "typecode='O'" if
masked.enabled() is true. This would be a byproduct of using
Numeric's __str__() method to print the array, at least under the
unknown circumstances in which Numeric.__str__() prints the
typecode. This confused me for a while.
Short of writing my own print routine, I basically have to create something
filled with '--', which is of course of type object. That's why you can
disable it. (:->
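The mechanism can be sketched with plain NumPy arrays, used here as a stand-in for Numeric (an assumption; the variable names are illustrative). Putting the string '--' into the masked slots forces the whole array to object type, whose typecode is 'O'.

```python
import numpy as np

data = np.array([1, 2, 3])
mask = np.array([False, True, False])

# Integers and the string '--' cannot share a numeric array,
# so the printable copy has to be an object array.
printable = data.astype(object)
printable[mask] = "--"

typecode = printable.dtype.char     # 'O'
cells = printable.tolist()          # [1, '--', 3]
```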
3. I found the semantics of MA.compress(condition,a,axis=0) to be
inconvenient and inconsistent with those of Numeric.compress.
MA.compress() squeezes out not only those elements for which
condition is false, but also those elements that are masked. This
differs from the behavior of Numeric.compress, which always returns
an array with the "axis" dimension equal to the number of nonzero
elements of "condition". The real problem, though, is that
MA.compress can't be used on a multidimensional array with a
nontrivial mask, because squeezing out the masked values is highly
unlikely to result in a rectangular matrix. It is nice to be able
to squeeze masked values out of a 1-d array, but not at the price
of not being able to use compress on a multidimensional array. I
suggest giving MA.compress() semantics closer to those of
Numeric.compress(), and adding an optional argument or a separate
method to cause masked elements to be omitted.
It has been an interesting project in that there are hundreds of these
individual little design questions.
Can you propose the semantics you would like in a precise way? Include the
case where the condition has masked values.
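One candidate set of semantics, sketched with numpy.ma as a stand-in for MA (an assumption; this is how the later library behaves, not the MA module under discussion): compress() follows the condition only and carries the mask along, while a separate compressed() method drops masked cells and returns a 1-d result. The condition here is a plain boolean list, so the case of masked values in the condition is not covered.

```python
import numpy as np

a = np.ma.masked_array([[1, 2, 3],
                        [4, 5, 6]],
                       mask=[[False, True, False],
                             [False, False, False]])
cond = [True, True, False]

# Like Numeric.compress: the length along axis 1 equals the number of
# true entries in cond, and the masked cell keeps its mask.
kept = a.compress(cond, axis=1)
kept_shape = kept.shape                            # (2, 2)
kept_mask = np.ma.getmaskarray(kept).tolist()      # [[False, True], [False, False]]

# Squeezing out masked cells is a separate, always 1-d operation.
unmasked_values = a.compressed().tolist()          # [1, 3, 4, 5, 6]
```

Splitting the two operations keeps the multidimensional result rectangular, which is the objection raised above against combining them.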
Thanks for a great package!
I appreciate the encouragement. -- Paul