[Numpy-discussion] alterNEP - was: missing data discussion round 2

Matthew Brett matthew.brett at gmail.com
Fri Jul 1 12:20:45 EDT 2011


Hi,

On Fri, Jul 1, 2011 at 5:17 PM, Benjamin Root <ben.root at ou.edu> wrote:
>
>
> On Fri, Jul 1, 2011 at 11:00 AM, Matthew Brett <matthew.brett at gmail.com>
> wrote:
>>
>> > You can't switch between the two approaches without big changes in your
>> > code.
>>
>> >
>> Lluis provided a case, and it was obscure.  That switch seems like a
>> rare or non-existent use-case that should not guide the API.
>>
>
> Just to respond to this specific issue.
>
> In matplotlib, there are often constructs like the following:
>
> plot_something(X, Y, V)
>
> From a module perspective, we have no clue about the nature of the input
> data.  We often have to do things like np.asanyarray, np.atleast_2d and such
> to establish some base-level assumptions about the input data.  Numpy
> currently makes this fairly cheap by not performing a copy if it is not
> needed.  So far, so good.
>
> Next, some plotting functions need to broadcast the arrays together (again,
> numpy makes that fairly cheap).
>
> Then, we need to figure out the common elements to plot.  With something
> simple like plot(), this is a straightforward OR-ing of any masks.  Of
> course, right now, this is not cheap because we can't assume that the array
> supports masking semantics.  This is where we either cast the arrays as
> masked arrays, or implement our own masking semantics.  But, essentially, a
> point that was masked in X may not be masked in Y and/or V, and we cannot
> change the original data (or else we would be a bad tool).
>
> For more complicated functions like pcolor() and contour(), each array needs
> to know the status of neighboring points, both in itself and in the
> other arrays.  Again, either we use numpy.ma to share a common mask across
> the data arrays, or we implement our own semantics to deal with this.  And
> again, we cannot change any of the original data.
>
> This is not an obscure case.  This is existing code in matplotlib.  I will
> be evaluating the current missingdata branch later today to assess its
> suitability for use in matplotlib.

I think I missed why your case needs NA and IGNORE to use the same
API.  Why can't you just use masks and IGNORE here?
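
To make the question concrete, here is a rough sketch of the masks-only
approach using the existing numpy.ma module.  The helper name common_masked
is made up for illustration; it is not matplotlib code and not part of the
missingdata branch.  It assumes a NumPy recent enough to provide
np.broadcast_shapes, and it copies the data so that the caller's arrays are
never modified:

```python
import numpy as np
import numpy.ma as ma


def common_masked(*arrays):
    """Return masked copies of the inputs that all share the union mask.

    Hypothetical helper: broadcasts the inputs to a common shape, ORs
    their masks together, and leaves the original arrays untouched.
    """
    # asanyarray preserves MaskedArray inputs and avoids a copy if it can.
    arrays = [np.asanyarray(a) for a in arrays]
    shape = np.broadcast_shapes(*(a.shape for a in arrays))

    # getmaskarray gives an all-False mask for inputs that have no mask.
    masks = [np.broadcast_to(ma.getmaskarray(a), shape) for a in arrays]
    combined = np.logical_or.reduce(masks)

    # copy=True gives each result its own data and mask, so nothing the
    # caller passed in is changed.
    return [
        ma.array(np.broadcast_to(ma.getdata(a), shape), mask=combined, copy=True)
        for a in arrays
    ]


if __name__ == "__main__":
    X = np.array([1.0, 2.0, 3.0])                             # plain ndarray
    Y = ma.array([4.0, 5.0, 6.0], mask=[False, True, False])  # one point masked
    V = ma.array([7.0, 8.0, 9.0], mask=[False, False, True])  # another masked
    Xm, Ym, Vm = common_masked(X, Y, V)
    print(Xm.mask, Y.mask)
```

A point masked in any one of X, Y or V ends up masked in all three results,
while X, Y and V themselves keep whatever masks (or lack of masks) they
came in with.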

Best,

Matthew


