[Numpy-discussion] Missing data wrap-up and request for comments
Travis Oliphant
travis at continuum.io
Wed May 9 15:35:26 EDT 2012
> My three proposals:
>
> * do nothing and leave things as is
>
> * add a global flag that turns off masked array support by default but otherwise leaves things unchanged (I'm still unclear how this would work exactly)
>
> * move Mark's "masked ndarray objects" into a new fundamental type (ndmasked), leaving the actual ndarray type unchanged. The array_interface keeps the masked array notions and the ufuncs keep the ability to handle arrays like ndmasked. Ideally, numpy.ma would be changed to use ndmasked objects as their core.
>
>
> The numpy.ma module is unmaintained and I don't see that changing anytime soon. As you know, I would prefer 1), but 2) is a good compromise and the infrastructure for such a flag could be useful for other things, although like yourself I'm not sure how it would be implemented. I don't understand your proposal for 3), but from the description I don't see that it buys anything.
It is a bit strong to call numpy.ma unmaintained. I don't consider it that way. Are there a lot of unaddressed tickets for it? Is it broken? I know it gets a lot of use in the wild, so I don't think NumPy users would be happy to hear that NumPy developers consider it unmaintained.
I'm looking forward to more details of Mark's proposal for #2.
The proposal for #3 is quite simple, and I think it is also a good compromise between removing the masked array entirely from the core NumPy object and leaving things as is in master. It keeps the functionality, but in a separate object, much like numpy.ma is a separate object. Basically, what it buys is not forcing *all* NumPy users (at the C-API level) to deal with a masked array now. I know this push is part of Mark's intention, since it pushes downstream libraries to think about missing data at a fundamental level. But I think this is too big a change to put in a 1.X release. The internal array model used by NumPy is used quite extensively in downstream libraries as a *concept*. Many people have enhanced this model with a separate mask array for various reasons, and Mark's current use of the mask does not satisfy all those use cases. I don't see how we can justify changing the NumPy 1.X memory model under these circumstances.
To my mind, this is a NumPy 2.0 kind of change, a release where downstream users will be looking for possible array-model changes.
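As a rough illustration of the distinction (a minimal sketch, not code from this thread), the example below contrasts a plain ndarray, a mask kept alongside it as a separate array (the pattern many downstream libraries use), and the existing numpy.ma.MaskedArray type. It uses only numpy and numpy.ma; the proposed ndmasked type is not shown because it is only a proposal here, and the variable names are made up for the example.

```python
import numpy as np
import numpy.ma as ma

# A plain ndarray: just data, no notion of missing values.
data = np.array([1.0, 2.0, 3.0, 4.0])

# Downstream pattern: keep a separate boolean mask next to the data.
# Convention assumed here (same as numpy.ma): True means "missing".
mask = np.array([False, True, False, False])

# The plain array ignores the mask entirely.
total_plain = data.sum()

# With a side-by-side mask, the caller applies it explicitly.
total_side_mask = data[~mask].sum()

# numpy.ma bundles data and mask in a distinct type, MaskedArray,
# so code that only handles plain ndarrays is not forced to change.
masked = ma.masked_array(data, mask=mask)
total_masked = masked.sum()

# Converting back to a plain ndarray requires choosing a fill value.
filled = masked.filled(0.0)

print(type(data).__name__, type(masked).__name__)
print(total_plain, total_side_mask, total_masked)
```

The point of the sketch is only that the mask lives in a separate type or a separate array, so consumers of the plain ndarray memory model see no change.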
-Travis
>
> For the record, I'm currently in favor of the third proposal. Feel free to comment on these proposals (or provide your own).
>
>
> Chuck