[Numpy-discussion] alterNEP - was: missing data discussion round 2

Nathaniel Smith njs at pobox.com
Sat Jul 2 00:03:48 EDT 2011


On Fri, Jul 1, 2011 at 9:29 AM, Benjamin Root <ben.root at ou.edu> wrote:
> On Fri, Jul 1, 2011 at 11:20 AM, Matthew Brett <matthew.brett at gmail.com> wrote:
>> On Fri, Jul 1, 2011 at 5:17 PM, Benjamin Root <ben.root at ou.edu> wrote:
>> > For more complicated functions like pcolor() and contour(), each array
>> > needs to know the status of its own neighboring points and of the
>> > corresponding points in the other arrays.  Again, either we use numpy.ma
>> > to share a common mask across the data arrays, or we implement our own
>> > semantics to deal with this.  And again, we cannot change any of the
>> > original data.
>> >
>> > This is not an obscure case.  This is existing code in matplotlib.  I
>> > will be evaluating the current missingdata branch later today to assess
>> > its suitability for use in matplotlib.
>>
>> I think I missed why your case needs NA and IGNORE to use the same
>> API.  Why can't you just use masks and IGNORE here?
>
> The point is that matplotlib cannot make assumptions about the nature of
> the input data.  From matplotlib's perspective, NAs and IGNOREs are the
> same thing and should be treated the same way (i.e., skipped).  Right now,
> matplotlib's code is messy and inconsistent in its treatment of masked
> arrays and NaNs (some functions treat them the same, some handle only NaNs,
> and vice versa).  This is because of code cruft over the years.  If we had
> one interface to rule them all, we could bring *all* plotting functions to
> have similar handling code and be more consistent across the board.

Maybe I'm missing something, but it seems like no matter how the NA
handling thing plays out, what you need is something like

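# For current numpy:
import numpy as np

def usable_points(a):
    a = np.asanyarray(a)
    # Test the raw data, so the result is a plain boolean array
    # rather than a masked one.
    data = np.ma.getdata(a)
    usable = ~np.isnan(data)
    usable &= ~np.isinf(data)
    if isinstance(a, np.ma.MaskedArray):
        # getmaskarray() returns a full boolean array even when mask is nomask
        usable &= ~np.ma.getmaskarray(a)
    return usable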

def all_usable(a, *rest):
    usable = usable_points(a)
    for other in rest:
        usable &= usable_points(other)
    return usable

And then you need to call all_usable from each of your plotting
functions and away you go, yes?
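To make that concrete, here is a minimal, self-contained sketch of the two helpers being called on a pair of inputs. The sample arrays x and y are made up for illustration, and np.isfinite stands in for the separate isnan/isinf checks (an assumption of this sketch, not something matplotlib does today):

```python
import numpy as np

def usable_points(a):
    a = np.asanyarray(a)
    # Work on the underlying data so the result is a plain boolean ndarray.
    usable = np.isfinite(np.ma.getdata(a))
    if isinstance(a, np.ma.MaskedArray):
        # getmaskarray() always returns a full-size boolean array.
        usable &= ~np.ma.getmaskarray(a)
    return usable

def all_usable(a, *rest):
    usable = usable_points(a)
    for other in rest:
        usable &= usable_points(other)
    return usable

# Hypothetical inputs: one plain array with a NaN, one masked array with an inf.
x = np.array([0.0, 1.0, np.nan, 3.0])
y = np.ma.masked_array([10.0, np.inf, 12.0, 13.0],
                       mask=[False, False, False, True])

ok = all_usable(x, y)
print(ok)  # True only where both x and y hold a finite, unmasked value
```

A point survives only if it is finite and unmasked in every input, which is the "skip it everywhere" behavior the plotting functions want.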

AFAICT, under the NEP proposal, in usable_points() you need to add a line like:
  usable &= ~np.isna(a)  # NEP

Under the alterNEP proposal, you need to add two lines, like
  usable &= ~np.isna(a)  # alterNEP
  usable &= a.visible    # alterNEP

And either way, once you get your mask, you pretty much do the same
thing: either use it directly, or use it to set up a masked array (of
whatever flavor, and they all seem to work the same as far as this is
concerned).
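As a rough sketch of those two options (the arrays here are invented, and the usable mask is computed inline with np.isfinite purely to keep the example self-contained):

```python
import numpy as np

# Hypothetical data with one bad point in each array.
x = np.array([0.0, 1.0, np.nan, 3.0])
y = np.array([10.0, np.inf, 12.0, 13.0])

# Boolean mask of points that are usable in both arrays.
usable = np.isfinite(x) & np.isfinite(y)

# Option 1: use the mask directly to select the good points.
x_good = x[usable]
y_good = y[usable]

# Option 2: wrap the data in masked arrays that hide the same points.
x_ma = np.ma.masked_array(x, mask=~usable)
y_ma = np.ma.masked_array(y, mask=~usable)

print(x_good, y_good)
print(x_ma.count(), y_ma.count())  # number of unmasked entries in each
```

Neither option writes to x or y, so the original data stays untouched.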

You seem to see some way in which the alterNEP's separation of masks
and NA handling makes a big difference to your architecture, but I'm
not getting it :-(.

-- Nathaniel


