[Numpy-discussion] Missing data wrap-up and request for comments

Charles R Harris charlesr.harris at gmail.com
Wed May 9 16:06:27 EDT 2012


On Wed, May 9, 2012 at 1:35 PM, Travis Oliphant <travis at continuum.io> wrote:

> My three proposals:
>>
>> * do nothing and leave things as is
>>
>> * add a global flag that turns off masked array support by default but
>> otherwise leaves things unchanged (I'm still unclear how this would work
>> exactly)
>>
>> * move Mark's "masked ndarray objects" into a new fundamental type
>> (ndmasked), leaving the actual ndarray type unchanged.  The array_interface
>> keeps the masked array notions and the ufuncs keep the ability to handle
>> arrays like ndmasked. Ideally, numpy.ma would be changed to use
>> ndmasked objects as its core.
>>
>>
> The numpy.ma is unmaintained and I don't see that changing anytime soon.
> As you know, I would prefer 1), but 2) is a good compromise and the
> infrastructure for such a flag could be useful for other things, although like
> yourself I'm not sure how it would be implemented. I don't understand your
> proposal for 3), but from the description I don't see that it buys anything.
>
>
> That is a bit strong to call numpy.ma unmaintained. I don't consider
> it that way. Are there a lot of tickets for it that are unaddressed?
> Is it broken? I know it gets a lot of use in the wild and so I don't
> think NumPy users would be happy to hear it is considered unmaintained by
> NumPy developers.
>
> I'm looking forward to more details of Mark's proposal for #2.
>
> The proposal for #3 is quite simple and I think it is also a good
> compromise between removing the masked array entirely from the core NumPy
> object and leaving things as is in master.  It keeps the functionality (but
> in a separate object) much like numpy.ma is a separate object. Basically it
> buys not forcing *all* NumPy users (on the C-API level) to now deal with a
> masked array.
>

To me, it looks like we will get stuck with a more complicated
implementation without changing the API, something that 2) achieves more
easily while providing a feature likely to be useful as we head towards 2.0.


> I know this push is a feature that is part of Mark's intention (as it
> pushes downstream libraries to think about missing data at a fundamental
> level). But, I think this is too big of a change to put in a 1.X
> release. The internal array-model used by NumPy is used quite extensively
> in downstream libraries as a *concept*. Many people have enhanced this
> model with a separate mask array for various reasons, and Mark's current
> use of mask does not satisfy all those use-cases. I don't see how we can
> justify changing the NumPy 1.X memory model under these circumstances.
>
>
You keep referring to these ghostly people and their unspecified uses, no
doubt to protect the guilty. You don't have to name names, but a little
detail on what they have done and how they use things would be *very*
helpful.


> This is the sort of change that in my mind is a NumPy 2.0 kind of change
> where downstream users will be looking for possible array-model changes.
>
>
We tried the flag day approach to 2.0 already and it failed. I think it is
better to have a long-term release and a series of releases thereafter
moving step by step with incremental changes towards a 2.0. Mark's 2) would
support that approach.

<snip>

Chuck