[Numpy-discussion] Missing data wrap-up and request for comments
Charles R Harris
charlesr.harris at gmail.com
Wed May 9 16:06:27 EDT 2012
On Wed, May 9, 2012 at 1:35 PM, Travis Oliphant <travis at continuum.io> wrote:
> My three proposals:
>> * do nothing and leave things as is
>> * add a global flag that turns off masked array support by default but
>> otherwise leaves things unchanged (I'm still unclear how this would work)
>> * move Mark's "masked ndarray objects" into a new fundamental type
>> (ndmasked), leaving the actual ndarray type unchanged. The array_interface
>> keeps the masked array notions and the ufuncs keep the ability to handle
>> arrays like ndmasked. Ideally, numpy.ma would be changed to use
>> ndmasked objects as their core.
> The numpy.ma is unmaintained and I don't see that changing anytime soon.
> As you know, I would prefer 1), but 2) is a good compromise and the
> infrastructure for such a flag could be useful for other things, although like
> yourself I'm not sure how it would be implemented. I don't understand your
> proposal for 3), but from the description I don't see that it buys anything.
> That is a bit strong to call numpy.ma unmaintained. I don't consider
> it that way. Are there a lot of tickets for it that are unaddressed?
> Is it broken? I know it gets a lot of use in the wild and so I don't
> think NumPy users would be happy to hear it is considered unmaintained by
> NumPy developers.
> I'm looking forward to more details of Mark's proposal for #2.
> The proposal for #3 is quite simple and I think it is also a good
> compromise between removing the masked array entirely from the core NumPy
> object and leaving things as is in master. It keeps the functionality (but
> in a separate object) much like numpy.ma is a separate object.
> Basically it buys not forcing *all* NumPy users (on the C-API level) to
> now deal with a masked array.
To me, it looks like we will get stuck with a more complicated
implementation without changing the API, something that 2) achieves more
easily while providing a feature likely to be useful as we head towards 2.0.
> I know this push is a feature that is part of Mark's intention (as it
> pushes downstream libraries to think about missing data at a fundamental
> level). But, I think this is too big of a change to put in a 1.X
> release. The internal array-model used by NumPy is used quite extensively
> in downstream libraries as a *concept*. Many people have enhanced this
> model with a separate mask array for various reasons, and Mark's current
> use of mask does not satisfy all those use-cases. I don't see how we can
> justify changing the NumPy 1.X memory model under these circumstances.
You keep referring to these ghostly people and their unspecified uses, no
doubt to protect the guilty. You don't have to name names, but a little
detail on what they have done and how they use things would be *very*
helpful.
> This is the sort of change that in my mind is a NumPy 2.0 kind of change
> where downstream users will be looking for possible array-model changes.
We tried the flag day approach to 2.0 already and it failed. I think it
better to have a long term release and a series of releases thereafter
moving step by step with incremental changes towards a 2.0. Mark's 2) would
support that approach.