[Numpy-discussion] Missing data wrap-up and request for comments

Travis Oliphant travis at continuum.io
Wed May 9 15:35:26 EDT 2012


> My three proposals: 
> 
> 	* do nothing and leave things as is 
> 
> 	* add a global flag that turns off masked array support by default but otherwise leaves things unchanged (I'm still unclear how this would work exactly)
> 
> 	* move Mark's "masked ndarray objects" into a new fundamental type (ndmasked), leaving the actual ndarray type unchanged. The array_interface keeps the masked array notions and the ufuncs keep the ability to handle arrays like ndmasked. Ideally, numpy.ma would be changed to use ndmasked objects as its core.
> 
> 
> The numpy.ma is unmaintained and I don't see that changing anytime soon. As you know, I would prefer 1), but 2) is a good compromise and the infrastructure for such a flag could be useful for other things, although like yourself I'm not sure how it would be implemented. I don't understand your proposal for 3), but from the description I don't see that it buys anything.

It is a bit strong to call numpy.ma unmaintained. I don't consider it that way. Are there a lot of unaddressed tickets for it? Is it broken? I know it gets a lot of use in the wild, so I don't think NumPy users would be happy to hear that NumPy developers consider it unmaintained.

I'm looking forward to more details of Mark's proposal for #2. 

The proposal for #3 is quite simple, and I think it is also a good compromise between removing masked arrays entirely from the core NumPy object and leaving things as they are in master. It keeps the functionality, but in a separate object, much like numpy.ma is a separate object. Basically, what it buys is not forcing *all* NumPy users (at the C-API level) to now deal with a masked array. I know this push is part of Mark's intention (it pushes downstream libraries to think about missing data at a fundamental level). But I think this is too big a change to put in a 1.X release. The internal array model used by NumPy is used quite extensively in downstream libraries as a *concept*. Many people have enhanced this model with a separate mask array for various reasons, and Mark's current use of a mask does not satisfy all of those use cases. I don't see how we can justify changing the NumPy 1.X memory model under these circumstances.

This is the sort of change that, in my mind, is a NumPy 2.0 kind of change, where downstream users will be looking for possible array-model changes.
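
The "separate mask array" pattern mentioned above can be sketched roughly as follows. This is only an illustration: the sample values and variable names are made up, and it uses the existing numpy.ma API rather than the proposed ndmasked type (which is only a proposal here). The point it tries to show is that the plain ndarray carries no mask at all, while the mask lives either next to it or inside a separate object.

```python
import numpy as np

# Hypothetical sample data; True in the mask marks a missing value.
data = np.array([1.0, 2.0, 3.0, 4.0])
mask = np.array([False, True, False, False])

# The plain ndarray knows nothing about the mask.
plain_mean = data.mean()

# Downstream pattern: keep the mask as a separate boolean array
# and apply it by hand wherever it matters.
manual_mean = data[~mask].mean()

# numpy.ma packages data and mask together in a separate object,
# so code that only handles plain ndarrays never has to see a mask.
marr = np.ma.masked_array(data, mask=mask)
ma_mean = marr.mean()

print(plain_mean, manual_mean, ma_mean)
```

With this layout the masked and unmasked reductions differ only where the caller opted in to a mask, which is the property that keeping masks out of the core ndarray is meant to preserve.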

-Travis





>  
> For the record, I'm currently in favor of the third proposal.   Feel free to comment on these proposals (or provide your own). 
> 
> 
> Chuck 