On Fri, Jul 1, 2011 at 11:20 AM, Matthew Brett <matthew.brett@gmail.com> wrote:
Hi,
On Fri, Jul 1, 2011 at 5:17 PM, Benjamin Root <ben.root@ou.edu> wrote:
On Fri, Jul 1, 2011 at 11:00 AM, Matthew Brett <matthew.brett@gmail.com> wrote:
You can't switch between the two approaches without big changes in your code.
Lluis provided a case, and it was obscure. That switch seems like a rare or non-existent use-case that should not guide the API.
Just to respond to this specific issue.
In matplotlib, there are often constructs like the following:
plot_something(X, Y, V)
From a module perspective, we have no clue about the nature of the input data. We often have to do things like np.asanyarray, np.atleast_2d and such to establish some base-level assumptions about the input data. Numpy currently makes this fairly cheap by not performing a copy if it is not needed. So far, so good.
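As a minimal sketch of the kind of input normalization described here (the helper name `normalize_input` is hypothetical and is not matplotlib's actual code), `np.asanyarray` and `np.atleast_2d` pass existing arrays through without copying and keep subclasses such as masked arrays intact:

```python
import numpy as np

def normalize_input(a):
    """Coerce `a` to an ndarray (or subclass) with at least 2 dimensions.

    Hypothetical helper for illustration; no copy is made when `a`
    is already an array.
    """
    return np.atleast_2d(np.asanyarray(a))

# A plain 2-D array passes through without copying its data.
x = np.arange(6.0).reshape(2, 3)
x2 = normalize_input(x)

# A masked array stays a masked array (asanyarray keeps subclasses).
m = np.ma.masked_invalid([1.0, np.nan, 3.0])
m2 = normalize_input(m)
```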
Next, some plotting functions need to broadcast the arrays together (again, numpy makes that fairly cheap).
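To illustrate why broadcasting is cheap, here is a small sketch using `np.broadcast_arrays`; the array names and shapes are made up for the example. The results are views onto the original buffers, not copies:

```python
import numpy as np

# Example inputs with different but compatible shapes (assumed for illustration).
x = np.arange(3.0)                  # shape (3,)
y = np.arange(2.0).reshape(2, 1)    # shape (2, 1)

# Both results have shape (2, 3) but still refer to the original memory.
xb, yb = np.broadcast_arrays(x, y)
```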
Then, we need to figure out the common elements to plot. With something simple like plot(), this is a straightforward OR-ing of any masks. Of course, right now, this is not cheap because we can't assume that the array supports masking semantics. This is where we either cast the arrays as masked arrays, or perform our own masking semantics. But, essentially, a point that was masked in X may not be masked in Y and/or V, and we cannot change the original data (or else we would be a bad tool).
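The mask-combining step might look roughly like the following sketch. The helpers `invalid_mask` and `common_mask` are hypothetical names, and treating NaN the same as a masked point is an assumption made for the example, not a description of what every matplotlib function does:

```python
import numpy as np

def invalid_mask(a):
    """Boolean array, True where `a` is masked or (for float data) NaN."""
    a = np.ma.asanyarray(a)            # no copy if already a masked array
    mask = np.ma.getmaskarray(a)       # always a full boolean array
    if a.dtype.kind == "f":
        mask = mask | np.isnan(a.data)
    return mask

def common_mask(*arrays):
    """OR together the invalid points of all inputs after broadcasting."""
    masks = np.broadcast_arrays(*[invalid_mask(a) for a in arrays])
    return np.logical_or.reduce(masks)

x = np.ma.masked_array([1.0, 2.0, 3.0, 4.0], mask=[False, True, False, False])
y = np.array([10.0, 20.0, np.nan, 40.0])
v = np.array([0.0, 1.0, 2.0, 3.0])

mask = common_mask(x, y, v)

# New masked views over the raw data; the caller's x, y and v are untouched.
x_plot = np.ma.masked_array(np.ma.getdata(x), mask=mask)
y_plot = np.ma.masked_array(y, mask=mask)
```

Because the combined mask lives in new wrapper objects, the original mask on `x` and the NaN in `y` remain exactly as the caller supplied them.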
For more complicated functions like pcolor() and contour(), each array needs to know the status of its own neighboring points, as well as the status of the corresponding points in the other arrays. Again, either we use numpy.ma to share a common mask across the data arrays, or we implement our own semantics to deal with this. And again, we cannot change any of the original data.
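As one illustration of neighbor-aware masking, consider a pcolor-style grid where `X` and `Y` hold cell corner coordinates and `C` holds one value per cell. The sketch below assumes a cell can only be drawn when all four of its corners and its own value are valid; the function name `cell_mask` and the example data are hypothetical, and this is not presented as matplotlib's actual implementation:

```python
import numpy as np

def cell_mask(X, Y, C):
    """Return an (M, N) mask that is True where a cell cannot be drawn.

    X, Y: (M+1, N+1) corner coordinates; C: (M, N) cell values.
    Plain ndarrays are accepted and treated as fully unmasked.
    """
    corner_bad = np.ma.getmaskarray(X) | np.ma.getmaskarray(Y)
    # A cell is bad if any of its four surrounding corners is bad.
    cell_bad = (corner_bad[:-1, :-1] | corner_bad[1:, :-1]
                | corner_bad[:-1, 1:] | corner_bad[1:, 1:])
    return cell_bad | np.ma.getmaskarray(C)

# 3 x 4 grid of corners, hence 2 x 3 cells.
X, Y = np.meshgrid(np.arange(4.0), np.arange(3.0))
X = np.ma.masked_array(X)
X[1, 1] = np.ma.masked            # one interior corner is missing

C = np.ma.masked_array(np.arange(6.0).reshape(2, 3))
C[0, 2] = np.ma.masked            # one cell value is missing

mask = cell_mask(X, Y, C)
```

The single masked corner knocks out all four cells that touch it, which is why the data arrays need a shared view of each other's missing points.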
This is not an obscure case. This is existing code in matplotlib. I will be evaluating the current missingdata branch later today to assess its suitability for use in matplotlib.
I think I missed why your case needs NA and IGNORE to use the same API. Why can't you just use masks and IGNORE here?
Best,
Matthew
The point is that matplotlib cannot make assumptions about the nature of the input data. From matplotlib's perspective, NA's and IGNORE's are the same thing and should be treated the same way (i.e., skipped). Right now, matplotlib's code is messy and inconsistent in its treatment of masked arrays and NaNs (some functions treat them the same, some only handle NaNs, and some only handle masked arrays). This is because of code cruft accumulated over the years. If we had one interface to rule them all, we could bring *all* plotting functions to have similar handling code and be more consistent across the board. However, I think Mark's NEP provides a good way to distinguish between the cases when needed (but I have not examined it from that perspective yet).

Ben Root