On Fri, Jun 14, 2013 at 1:22 PM, Nathaniel Smith <njs@pobox.com> wrote:
On Wed, Jun 12, 2013 at 7:43 PM, Eric Firing <efiring@hawaii.edu> wrote:
> On 2013/06/12 2:10 AM, Nathaniel Smith wrote:
>> Personally I think that overloading np.empty is horribly ugly, will
>> continue confusing newbies and everyone else indefinitely, and I'm
>> 100% convinced that we'll regret implementing such a warty interface
>> for something that should be so idiomatic. (Unfortunately I got busy
>> and didn't actually say this in the previous thread though.) So I
>> think we should just merge the PR as is. The only downside is the
>> np.ma inconsistency, but, np.ma is already inconsistent (cf.
>> masked_array.fill versus masked_array.filled!), somewhat deprecated,
>
> "somewhat deprecated"?  Really?  Since when?  By whom?  Replaced by what?

Sorry, not trying to start a fight, just trying to summarize the
situation. As far as I can tell:


Oh... (puts away iron knuckles)
 
Despite heroic efforts on the part of its authors, numpy.ma has a
number of weird quirks (masked data can still trigger invalid value
errors), misfeatures (hard versus soft masks), and just plain old pain
points (ongoing issues with whether any given operation will respect
or preserve the mask).
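
For readers who have not run into the hard versus soft mask distinction, here is a minimal sketch. The array values and variable names are made up for illustration; only np.ma.array and its hard_mask argument are real API.

```python
import numpy as np

# Soft mask (the default): assigning to a masked element unmasks it.
soft = np.ma.array([1, 2, 3], mask=[True, False, False])
soft[0] = 10

# Hard mask: the same assignment leaves the element masked.
hard = np.ma.array([1, 2, 3], mask=[True, False, False], hard_mask=True)
hard[0] = 10

print(np.ma.getmaskarray(soft))  # first element is no longer masked
print(np.ma.getmaskarray(hard))  # first element is still masked
```

The same line of code behaves differently depending on a flag carried by the array, which is the kind of surprise being described.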

Actually, now that we have a context manager for warning capture, we could fix that.
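
A rough sketch of what that could look like, assuming the context manager meant here is np.errstate (warnings.catch_warnings would be the other candidate). The sample data are hypothetical, and this only shows the idea at the call site, not how np.ma itself would be patched.

```python
import numpy as np

# The masked element holds a value that is invalid for sqrt.
x = np.ma.array([-1.0, 4.0, 9.0], mask=[True, False, False])

# Silence floating-point "invalid value" reports only for this block,
# so the masked entry cannot trigger one.
with np.errstate(invalid="ignore"):
    result = np.sqrt(x)

print(result.filled(0.0))  # unmasked entries carry the real square roots
```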
 

It's been in deep maintenance mode for some time; we merge the
occasional bug fix that people send in, and that's it. (To be fair,
numpy as a whole is fairly slow-moving, but numpy.ma still gets much
less attention.)

Even if there were active maintainers, no-one really has any idea how
to fix any of the problems above; they're not so much bugs as
intrinsic limitations of the design.
 
Therefore, my impression is that a majority (not all, but a majority)
of numpy developers strongly recommend against the use of numpy.ma in
new projects.


Such a recommendation should be in writing in the documentation and elsewhere.  Furthermore, a proper replacement would also be needed.  Simply deprecating it without some sort of decent alternative leaves everybody in the lurch.  I have high hopes for NA to be that replacement, and the sooner, the better.
 
I could be wrong! And I know there's nothing to really replace it. I'd
like to fix that. But I think "semi-deprecated" is not an unfair
shorthand for the above.


You will have to pry np.ma from my cold, dead hands!  (or distract me with a sufficiently shiny alternative)

 
(I'll even admit that I'd *like* to actually deprecate it. But what I
mean by that is, I don't think it's possible to fix it to the point
where it's actually a solid/clean/robust library, so I'd like to reach
a point where everyone who's currently using it is happier switching
to something else and is happy to sign off on deprecating it.)

As far as many people are concerned, it is a solid, clean, robust library.
 

>> and AFAICT there are far more people who will benefit from a clean
>> np.filled idiom than who actually use np.ma (and in particular its
>> fill-value functionality). So there would be two
>
> I think there are more np.ma users than you realize.  Everyone who uses
> matplotlib is using np.ma at least implicitly, if not explicitly.  Many
> of the matplotlib examples put np.ma to good use.  np.ma.filled is an
> essential long-standing part of the np.ma API.  I don't see any good
> rationale for generating a conflict with it, when an adequate
> non-conflicting alternative ('np.initialized', maybe others) exists.

I'm aware of that. If I didn't care about the opinions of numpy.ma
users, I wouldn't go starting long and annoying mailing list threads
about features that are only problematic because of their effect on
numpy.ma :-).

But, IMHO given the issues with numpy.ma, our #1 priority ought
to be making numpy proper as clean and beautiful as possible; my
position that started this thread is basically just that we shouldn't
make numpy proper worse just for numpy.ma's sake. That's the tail
wagging the dog. And this 'conflict' seems a bit overstated given that
(1) np.ma.filled already has multiple names (and 3/4 of the uses in
matplotlib use the method version, not the function version), (2) even
if we give it a non-conflicting name, np.ma's lack of maintenance
means that it'd probably be years before someone got around to
actually adding a parallel function to np.ma. [Unless this thread
spurs someone into submitting one just to prove me wrong ;-).]
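
To make point (1) concrete, here is a small sketch of the function and method spellings of the existing np.ma fill-value operation. The array contents are invented for the example.

```python
import numpy as np

x = np.ma.array([1.0, 2.0, 3.0], mask=[False, True, False])

# Function form: np.ma.filled(array, fill_value)
as_function = np.ma.filled(x, 0.0)

# Method form: array.filled(fill_value)
as_method = x.filled(0.0)

# Both return a plain ndarray with masked entries replaced by the fill value.
print(as_function)
print(as_method)
```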


Actually, IIRC, np.ma does some sort of auto-wrapping of numpy functions.  This is why adding np.filled() would cause a namespace clobbering, I think.

Cheers!
Ben Root