[Numpy-discussion] The future of ndarray.diagonal()

Sat Jan 3 14:49:44 EST 2015

On 1 Jan 2015 21:35, "Alexander Belopolsky" <ndarray at mac.com> wrote:
>
> A discussion [1] is currently underway at GitHub which will benefit from
> a larger forum.
>
> In version 1.9, the diagonal() method was changed to return a read-only
> (non-contiguous) view into the original array instead of a plain copy.
> Also, it has been announced [2] that in 1.10 the view will become
> read/write.
>
> A concern has now been raised [3] that this change breaks backward
> compatibility too much.
>
> Consider the following code:
>
> x = numpy.eye(2)
> d = x.diagonal()
> d[0] = 2
>
> In 1.8, this code runs without errors and results in [2, 1] stored in
> array d.  In 1.9, this is an error.  With the current plan, in 1.10 this
> will become valid again, but the result will be different: x[0,0] will be 2
> while it is 1 in 1.8.
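
Which of these behaviours a given installation actually exhibits can be
checked at runtime instead of being inferred from the version number. The
following is only a rough sketch: the helper name diagonal_semantics is
made up for illustration, and it assumes nothing beyond the standard
ndarray.flags.writeable attribute and np.may_share_memory.

```python
import numpy as np

def diagonal_semantics(x):
    """Hypothetical helper: report how x.diagonal() behaves on this NumPy."""
    d = x.diagonal()
    return {
        # False means assigning to d raises an error (the read-only view).
        "writeable": bool(d.flags.writeable),
        # False means d is an independent copy (the 1.8-style behaviour).
        "may_share_memory": bool(np.may_share_memory(x, d)),
    }

info = diagonal_semantics(np.eye(2))
print(np.__version__, info)
```

Read the result as: writeable and not sharing memory is the old copy
behaviour; not writeable and sharing memory is the read-only view;
writeable and sharing memory is the read/write view discussed for a later
release. No expected output is shown because it depends on the installed
version.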

Further context:

In 1.7 and 1.8, the code above works as described, but also issues a
visible-by-default warning:

>>> np.__version__
'1.7.2'
>>> x = np.eye(2)
>>> x.diagonal()[0] = 2
__main__:1: FutureWarning: Numpy has detected that you (may be) writing to an array returned
by numpy.diagonal or by selecting multiple fields in a record
array. This code will likely break in the next numpy release --
see numpy.diagonal or arrays.indexing reference docs for details.
The quick fix is to make an explicit copy (e.g., do
arr.diagonal().copy() or arr[['f0','f1']].copy()).
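
The quick fix named in the warning can be spelled out so that the code
means the same thing regardless of whether diagonal() returns a copy or a
view. This is a minimal sketch, not a recommendation from the warning text
itself: the second half (using np.fill_diagonal for a deliberate in-place
write) is my own addition for the case where modifying the original array
is what the author actually wanted.

```python
import numpy as np

# Intent 1: a private, modifiable copy of the diagonal.
x = np.eye(2)
d = x.diagonal().copy()   # explicit copy: writable, independent of x
d[0] = 2                  # changes d only; x keeps its original diagonal

# Intent 2: deliberately change the diagonal of the original array.
y = np.eye(2)
np.fill_diagonal(y, [2, 1])   # writes 2 and 1 along the diagonal of y

print(d)
print(x)
print(y)
```

Either spelling states the intent explicitly, so it does not depend on
what diagonal() happens to return in a given release.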

1.7 was released in Feb. 2013, ~22 months ago. (I'm not implying this
number is particularly large or small, it's just something that I find
useful to calculate when thinking about things like this.)

The choice of "1.10" as the target for completing this change is
more-or-less a strawman and we shouldn't feel bound by it. The schedule was
originally written in between the 1.6 and 1.7 releases, when our release
process was kinda broken and we had no idea what the future release
schedule would look like (1.6 -> 1.7 ultimately ended up being a ~21 month
gap). We've already adjusted the schedule for this deprecation once before
(see issue #596: The original schedule called for the change to returning a
ro-view to happen in 1.8, rather than 1.9 as it actually did). Now that our
release frequency is higher, 1.11 might well be a more reasonable target
than 1.10.

As for the overall question, this is really a bigger question about what
strategy we should use in general to balance between conservatism (which is
a Good Thing) and making improvements (which is also a Good Thing). The
post you cite brings this up explicitly:

> [3] http://khinsen.wordpress.com/2014/09/12/the-state-of-numpy/

I have huge respect for the problems and pain that Konrad describes in this
blog post, but I really can't agree with the argument or the conclusions.
His conclusion is that when it comes to compatibility breaks,
slow-incremental-change is bad, and that we should instead prefer big
all-at-once compatibility breaks like the Numeric->Numpy or Py2->Py3
transitions. But when describing his own experiences that he uses to
motivate this, he says:

*"The two main dependencies of my code, NumPy and Python itself, did
sometimes introduce incompatible changes (by design or as consequences of
bug fixes) that required changes on my own code base, but they were
surprisingly minor and never required more than about a day of work."*

i.e., slow-incremental-change has actually worked well in his experience.
(And in particular, the np.diagonal issue only comes in as an example to
illustrate what he means by the phrase "slow continuous change" -- this
particular change hasn't actually broken anything in his code.) OTOH the
big problem that motivated his post was that his code is all written
against the APIs of the ancient and long-abandoned Numeric project, and he
finds the costs of transitioning them to the "new" numpy APIs to be
prohibitively expensive, i.e. this big-bang transition broke his code. (It
did manage to limp on for some years b/c numpy used to contain some
compatibility code to emulate the Numeric API, but this doesn't really
change the basic situation: there were two implementations of the API he
needed -- numpy.oldnumeric and Numeric itself -- and both implementations
still exist in the sense that you can download them, but neither is usable
because no-one's willing to maintain them anymore.) Maybe I'm missing
something, but his data seems to be pi radians off from his conclusion.

-n