[Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

Wed May 23 16:01:36 EDT 2012

On 05/23/2012 10:00 PM, Dag Sverre Seljebotn wrote:
> On 05/23/2012 07:29 PM, Travis Oliphant wrote:
>>
>> On May 23, 2012, at 8:02 AM, Olivier Delalleau wrote:
>>
>>> 2012/5/23 Nathaniel Smith <njs at pobox.com>
>>>
>>>      On Wed, May 23, 2012 at 6:06 AM, Travis Oliphant
>>>      <travis at continuum.io> wrote:
>>>      >  I just realized that the pull request doesn't do what I thought it
>>>      >  did which is just add the flag to warn users who are writing to an
>>>      >  array that is a view when it used to be a copy. It's more cautious
>>>      >  and also "copies" the data for 1.7.
>>>      >
>>>      >  Is this really a necessary step? I guess it depends on how many
>>>      >  use-cases there are where people are relying on .diagonal() being a
>>>      >  copy. Given that this is such an easy thing for people who encounter
>>>      >  the warning to fix their code, it seems overly cautious to *also*
>>>      >  make a copy (especially for a rare code-path like this --- although
>>>      >  I admit that I don't have any reproducible data to support that
>>>      >  assertion that it's a rare code-path).
>>>      >
>>>      >  I think we have a mixed record of being cautious (not cautious
>>>      >  enough in some changes), but this seems like swinging in the other
>>>      >  direction of being overly cautious on a minor point.
>>>
>>>      The reason this isn't a "minor point" is that if we just switched it
>>>      then it's possible that existing, working code would start returning
>>>      incorrect answers, and the only indication would be some console spew.
>>>      I think that such changes should be absolutely verboten for a library
>>>      like numpy. I'm already paranoid enough about my own code!
>>>
>>>      That's why people up-thread were arguing that we just shouldn't risk
>>>      the change at all, ever.
>>>
>>>      I admit to some ulterior motive here: I'd like to see numpy be able to
>>>      continue to evolve, but I am also, like I said, completely paranoid
>>>      about fundamental libraries changing under me. So this is partly my
>>>      attempt to find a way to make a potentially "dangerous" change in a
>>>      responsible way. If we can't learn to do this, then honestly I think
>>>      the only responsible alternative going forward would be to never
>>>      change any existing API except in trivial ways (like removing
>>>      deprecated functions).
>>>
>>>      Basically my suggestion is that every time we alter the behaviour of
>>>      existing, working code, there should be (a) a period when that
>>>      existing code produces a warning, and (b) a period when that existing
>>>      code produces an error. For a change like removing a function, this is
>>>      easy. For something like this diagonal change, it's trickier, but
>>>      still doable.
>>>
>>>
>>> /agree with Nathaniel. Overly cautious is good!
>>>
>>
>> Then are you suggesting that we need to back out the changes to the
>> casting rules as well, because those will also cause code to stop
>> working? This is part of my point. We are not being consistently cautious.
>
> Two wrongs don't make a right?
>
> I'd think the inconvenience to users is mostly "per unwarned breakage",
> so that even one unwarned breakage less translates into fewer minutes
> wasted for users scratching their heads.
>
> In the end it's a tradeoff between inconvenience to NumPy developers and
> inconvenience to NumPy users -- not inconveniencing the developers
> further is an argument for not being consistent; but for diagonal() the
> work is already done.

...and, I missed the point about a future-compatible fix implying 
double-copy.

Dag
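
For readers following the thread, here is a minimal sketch of what a
"future-compatible fix" for code that writes to the result of diagonal()
might look like. The helper name diagonal_copy is hypothetical and is not
part of NumPy; the sketch only assumes that arr.diagonal() returns either a
copy or a view, depending on the NumPy release. On a release where
diagonal() already copies, the explicit .copy() copies the data a second
time, which appears to be the "double-copy" cost mentioned above.

```python
import numpy as np


def diagonal_copy(arr):
    """Hypothetical helper: return a writable diagonal that never aliases arr."""
    # The explicit .copy() is redundant on releases where diagonal() already
    # returns a copy (the "double copy"), but it is what keeps the caller's
    # writes away from arr once diagonal() returns a view.
    return arr.diagonal().copy()


a = np.arange(9).reshape(3, 3)
d = diagonal_copy(a)
d[0] = 100  # modifies only the private copy

print(a[0, 0])                    # the original array is unchanged
print(np.may_share_memory(a, d))  # the copy does not alias the original
```

Code written this way behaves the same whether diagonal() returns a copy or
a view, at the price of one extra allocation on releases that still copy.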