[Numpy-discussion] Making numpy sensible: backward compatibility please

Travis Oliphant travis at continuum.io
Fri Sep 28 18:15:36 EDT 2012

On Sep 28, 2012, at 4:53 PM, Henry Gomersall wrote:

> On Fri, 2012-09-28 at 16:43 -0500, Travis Oliphant wrote:
>> I agree that we should be much more cautious about semantic changes in
>> the 1.X series of NumPy.    How we handle situations where 1.6 changed
>> things from 1.5 and wasn't reported until now is an open question and
>> depends on the particular problem in question.    I agree that we
>> should be much more cautious about changes (particularly semantic
>> changes that will break existing code). 
> One thing I noticed in my (short and shallow) foray into numpy
> development was the rather limited scope of the tests in the area I
> touched (fft). I know not the extent to which this is true across the
> code base, but I know from experience the value of a truly exhaustive
> test set (every line tested for every condition). Perhaps someone with a
> deeper knowledge could comment on this?

Thank you for bringing this up.  It is definitely a huge flaw of NumPy that it does not have more extensive testing.  It is a result of the limited resources under which NumPy has been developed.  We are trying to correct this problem, but it takes time.  In the meantime, there is a huge installed base of code out there which acts as a de facto test suite for NumPy.  We just need to make sure those tests actually get run on new versions of NumPy and that failures get reported back to us, especially when subtle changes have taken place in the way things work (iteration in ufuncs and coercion rules being the most obvious).  The consequence is longer release cycles whenever a release contains code that significantly changes the way things work (removed APIs, altered coercion rules, etc.).

The alteration of the semantics of how the base attribute works is a good example.  Everyone felt it was a good idea to have the .base attribute point to the actual object holding the memory (and it fixed a well-known example of how you could crash Python by building up a stack of array-object references). However, our fix created a problem for code that uses memmap objects and relied on the fact that the .base attribute would hold a reference to the most recent *memmap* object.   This was an unforeseen problem with our change.   
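
To make the .base change concrete, here is a minimal sketch using plain ndarrays only. The variable names are just for illustration, and the behaviour shown is what I would expect on a NumPy release that includes the change; under the older semantics the last check would instead find that c.base is b. It deliberately does not cover memmap objects, since how they are handled is exactly the open issue.

```python
import numpy as np

a = np.arange(10)   # a owns its memory, so it has no base
b = a[2:]           # b is a view of a
c = b[1:]           # c is a view of a view

# An array that owns its data has no base object.
print(a.base is None)

# A direct view points back at the array it was sliced from.
print(b.base is a)

# With the new semantics the chain is collapsed: c.base is the object
# that actually holds the memory (a), not the intermediate view b.
print(c.base is a)
```

Code that walked .base expecting to find the intermediate object (as the memmap users did) is the kind of thing this change can break.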

On the other hand, change is a good thing and we don't want NumPy to stop getting improvements.  We just have to be careful that we don't allow our enthusiasm for new features and changes to override our responsibility to end users.  I appreciate the efforts of all the NumPy developers in working through the inevitable debates that differences in perspective on that fundamental trade-off will bring.


