[Numpy-discussion] Thoughts on masked arrays

Paul F. Dubois paul at pfdubois.com
Wed May 9 22:56:16 EDT 2001


Michael Haggerty wrote:
1. I couldn't find a simple way to tell if all of the cells of a
   masked array are unmasked.  There are times when you fill an array
   incrementally and you want to convert it to a Numeric array but
   first make sure that all of the elements have been set.
   "m.filled()" is a bit dangerous (in my opinion) because it silently
   fills.  The shortest idiom I could think of is

    >>> assert not logical_or.reduce(ravel(MA.getmaskarray(m)))

   which isn't very short :-) and is also awkward because it creates a
   mask array even if m.mask() is None.  How about a m.is_unmasked()
   method, or even giving a special meaning to "m.filled(masked)",
   namely that it raises an exception if any cells are still masked.
   (As an optimization, this method could set m.__mask = None to speed
   up future checks.)
======
>>> from MA import *
>>> x=array([[1,2],[3,4]],mask=[[0,0],[0,0]])
>>> count(x)
4
>>> product(x.shape)
4
>>>
So your test could be if count(x) < product(x.shape): error...

make_mask(m, flag=1) will make a mask and have it be None if possible. It
also accepts an argument of None correctly.
So your test could be
   if make_mask(m.mask(),flag=1) is not None:
       error...

You could also consider
   if not Numeric.allclose(m.filled(0), m.filled(1)):
       error...
or
   if m.mask() is not None and Numeric.sometrue(Numeric.ravel(m.mask())):
       error...

Is that enough ways to do it? (TM) (:->

I don't recommend using assert if the test is data-driven, since it won't
get executed with python -O. Instead use if...: raise ....
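
The check-then-convert pattern discussed above could be sketched as follows.
This is only an illustration: it uses numpy.ma, the later successor to MA,
rather than the MA module in this thread, and to_plain_array is a hypothetical
helper name that is not part of either package.

```python
import numpy.ma as ma


def to_plain_array(m):
    """Return the data of m as an ordinary array, refusing to fill silently."""
    # count() is the number of unmasked elements, as in the count(x) test above.
    n_masked = m.size - m.count()
    if n_masked:
        # An explicit raise still runs under "python -O", unlike assert.
        raise ValueError(f"{n_masked} element(s) are still masked")
    # Nothing is masked here, so the fill value is never used.
    return m.filled()
```

Because the test is an if/raise rather than an assert, it behaves the same
with or without optimization enabled.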

I'm not against is_unmasked but I'm not sure how much it would get used and
I don't like the name. I hate query methods with side effects (if you use
them in an assert you change the program).

A method that replaces the mask with None if possible might make sense.
m.unmask()? m.demask()? m.debride() ?
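
For comparison, the later numpy.ma package (not the MA module discussed here)
provides an operation of this kind under the name shrink_mask(). A minimal
sketch, assuming numpy is installed:

```python
import numpy.ma as ma

# Every cell is unmasked, but the mask is still stored as a full array.
x = ma.array([[1, 2], [3, 4]], mask=[[0, 0], [0, 0]])
x.shrink_mask()  # an all-False mask is replaced by the "no mask" sentinel
no_mask_left = ma.getmask(x) is ma.nomask

# One cell is masked, so there is nothing to shrink.
y = ma.array([[1, 2], [3, 4]], mask=[[0, 1], [0, 0]])
y.shrink_mask()  # a nontrivial mask is left untouched
still_masked = ma.is_masked(y)
```

Here the mutation is an explicit, separately named call, so a query such as
is_masked() stays free of side effects.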


=============

2. I can't reproduce this problem now, but I could swear that the
   MaskedArray.__str__() method sometimes printed "typecode='O'" if
   masked.enabled() is true.  This would be a byproduct of using
   Numeric's __str__() method to print the array, at least under the
   unknown circumstances in which Numeric.__str__() prints the
   typecode.  This confused me for a while.

=========
Short of writing my own print routine, I basically have to create something
filled with '--', which is of course of type object. That's why you can
disable it. (:->
=========

3. I found the semantics of MA.compress(condition,a,axis=0) to be
   inconvenient and inconsistent with those of Numeric.compress.
   MA.compress() squeezes out not only those elements for which
   condition is false, but also those elements that are masked.  This
   differs from the behavior of Numeric.compress, which always returns
   an array with the "axis" dimension equal to the number of nonzero
   elements of "condition".  The real problem, though, is that
   MA.compress can't be used on a multidimensional array with a
   nontrivial mask, because squeezing out the masked values is highly
   unlikely to result in a rectangular matrix.  It is nice to be able
   to squeeze masked values out of a 1-d array, but not at the price
   of not being able to use compress on a multidimensional array.  I
   suggest giving MA.compress() semantics closer to those of
   Numeric.compress(), and adding an optional argument or a separate
   method to cause masked elements to be omitted.
======
It has been an interesting project in that there are hundreds of these
individual little design questions.
Can you propose the semantics you would like in a precise way? Include the
case where the condition has masked values.
======
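
One possible way to state such semantics precisely, sketched with the later
numpy.ma package rather than the MA module under discussion: compress() selects
along an axis and carries the mask with it, while a separate compressed() call
drops masked elements and returns a 1-d result. Treating masked entries of the
condition as false is an assumption made here for illustration, not a decision
taken in this thread.

```python
import numpy.ma as ma

a = ma.array([[1, 2, 3], [4, 5, 6]], mask=[[0, 0, 1], [0, 0, 0]])
cond = [True, False, True]

# Numeric-style compress: keep columns 0 and 2; the result stays rectangular
# and the masked cell (row 0, last column) stays masked.
kept = a.compress(cond, axis=1)

# Separate operation: drop masked elements, always giving a flat array.
flat = a.compressed()

# A condition with masked entries: treat "unknown" as false (assumed choice).
cond_m = ma.array([True, True, True], mask=[0, 1, 0])
kept_m = a.compress(cond_m.filled(False), axis=1)
```

With this split, the length of the result along the chosen axis always equals
the number of true entries in the condition, which is the property asked for
in item 3.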

Thanks for a great package!

Yours,
Michael

===
I appreciate the encouragement. -- Paul




