[Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)

Matthew Brett matthew.brett at gmail.com
Mon Apr 16 23:08:03 EDT 2012


Hi,

On Mon, Apr 16, 2012 at 7:46 PM, Travis Oliphant <travis at continuum.io> wrote:
>
> On Apr 16, 2012, at 8:03 PM, Matthew Brett wrote:
>
>> Hi,
>>
>> On Mon, Apr 16, 2012 at 3:06 PM, Travis Oliphant <travis at continuum.io> wrote:
>>
>>> I have heard from a few people that they are not excited by the growth of
>>> the NumPy data-structure by the 3 pointers needed to hold the masked-array
>>> storage. This is especially true when there is talk of potentially adding
>>> additional attributes to the NumPy array (for labels and other
>>> meta-information). If you are willing to let us know how you feel about
>>> this, please speak up.
>>
>> I guess there are two questions here
>>
>> 1) Will something like the current version of masked arrays have a
>> long term future in numpy, regardless of eventual API? Most likely
>> answer - yes?
>
> I think the answer to this is yes, but it could be as a feature-filled sub-class (like the current numpy.ma, except in C).

I'd love to hear that argument fleshed out in more detail - do you have time?

>> 2) Will likely changes to the masked array API make any difference to
>> the number of extra pointers?  Likely answer no?
>>
>> Is that right?
>
> The answer to this is very likely no on the Python side. But on the C side, there could be some differences (i.e. are masked arrays a sub-class of the ndarray or not).
>
>>
>> I have the impression that the masked array API discussion still has
>> not come out fully into the unforgiving light of discussion day, but
>> if the answer to 2) is No, then I suppose the API discussion is not
>> relevant to the 3 pointers change.
>
> You are correct that the API discussion is separate from this one. Overall, I was surprised at how fervently people would oppose ABI changes. As has been pointed out, NumPy and Numeric before it were not really designed to prevent having to recompile when changes were made. I'm still not sure that a better overall solution is not to promote better availability of downstream binary packages than to worry excessively about ABI changes in NumPy. But that is the current climate.

So the objectors object to any ABI change at all, rather than
specifically to three pointers as opposed to two or one?

Is their point then about ABI breakage?  Because that seems like a
different point again.
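
To make the distinction concrete, here is a rough sketch of why the
count of pointers and the ABI question are separate. It uses ctypes to
model a made-up array struct; the field names are purely illustrative
and are not the real ndarray layout. Appending any number of pointers
leaves the existing field offsets alone but changes the struct size,
and the size is what compiled code that embeds or sub-classes the
struct at the C level depends on.

```python
import ctypes


class ArrayBefore(ctypes.Structure):
    # Hypothetical, simplified stand-in for an array struct.
    _fields_ = [
        ("data", ctypes.c_void_p),
        ("nd", ctypes.c_int),
        ("dimensions", ctypes.c_void_p),
        ("strides", ctypes.c_void_p),
    ]


class ArrayAfter(ctypes.Structure):
    # Same struct with three extra pointers appended at the end.
    # The names extra_a/b/c are placeholders, not real NumPy fields.
    _fields_ = ArrayBefore._fields_ + [
        ("extra_a", ctypes.c_void_p),
        ("extra_b", ctypes.c_void_p),
        ("extra_c", ctypes.c_void_p),
    ]


pointer_size = ctypes.sizeof(ctypes.c_void_p)
growth = ctypes.sizeof(ArrayAfter) - ctypes.sizeof(ArrayBefore)

# Existing fields keep their offsets when new fields are appended.
same_offsets = all(
    getattr(ArrayBefore, name).offset == getattr(ArrayAfter, name).offset
    for name, _ in ArrayBefore._fields_
)

print("growth in bytes:", growth)
print("growth in pointers:", growth // pointer_size)
print("old field offsets unchanged:", same_offsets)
```

In this toy model one extra pointer would change the size just as
surely as three, which is why I read "no ABI change" and "not three
pointers" as different objections.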

Or is it possible that they are in fact worried about the masked array API?

> Mark and I will talk about this long and hard. Mark has ideas about where he wants to see NumPy go, but I don't think we have fully accounted for where NumPy and its user base *are*, and there may be better ways to approach this evolution. If others are interested in the outcome of the discussion, please speak up (either on the list or privately) and we will make sure your views get heard and accounted for.

I started writing something about this but I guess you'd know what I'd
write, so I only humbly ask that you consider whether it might be
doing real damage to allow substantial discussion that is not
documented or argued out in public.

See you,

Matthew


