[Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)
ben.root at ou.edu
Wed May 16 10:04:50 EDT 2012
On Wed, May 16, 2012 at 9:55 AM, Nathaniel Smith <njs at pobox.com> wrote:
> On Tue, May 15, 2012 at 2:49 PM, Frédéric Bastien <nouiz at nouiz.org> wrote:
> > Hi,
> > In fact, I would arg to never change the current behavior, but add the
> > flag for people that want to use it.
> > Why?
> > 1) There is probably >10k script that use it that will need to be
> > checked for correctness. There won't be easy to see crash or error
> > that allow user to see it.
> My suggestion is that we follow the scheme, which I think gives ample
> opportunity for people to notice problems:
> 1.7: works like 1.6, except that a DeprecationWarning is produced if
> (and only if) someone writes to an array returned by np.diagonal (or
> friends). This gives a pleasant heads-up for those who pay attention
> to DeprecationWarnings.
> 1.8: return a view, but mark this view read-only. This causes crashes
> for anyone who ignored the DeprecationWarnings, guaranteeing that
> they'll notice the issue.
> 1.9: return a writeable view, transition complete.
> I've written a pull request implementing the first part of this; I
> hope everyone interested will take a look:
> > 2) This is a globally not significant speed up by this change. Due to
> > 1), i think it is not work it. Why this is not a significant speed up?
> > First, the user already create and use the original tensor. Suppose a
> > matrix of size n x n. If it don't fit in the cache, creating it will
> > cost n * n. But coping it will cost cst * n. The cst is the price of
> > loading a full cache line. But if you return a view, you will pay this
> > cst price later when you do the computation. But it all case, this is
> > cheap compared to the cost of creating the matrix. Also, you will do
> > work on the matrix and this work will be much more costly then the
> > price of the copy.
> > In the case the matrix fix in the cache, the price of the copy is even
> > So in conclusion, optimizing the diagonal won't give speed up in the
> > global user script, but will break many of them.
> I agree that the speed difference is small. I'm more worried about the
> cost to users of having to remember odd inconsistencies like this, and
> to think about whether there actually is a speed difference or not,
> etc. (If we do add a copy=False option, then I guarantee many people
> will use it religiously "just in case" the speed difference is enough
> to matter! And that would suck for them.)
> Returning a view makes the API slightly nicer, cleaner, more
> consistent, more useful. (I believe the reason this was implemented in
> the first place was that providing a convenient way to *write* to the
> diagonal of an arbitrary array made it easier to implement numpy.eye
> for masked arrays.) And the whole point of numpy is to trade off a
> little speed in favor of having a simple, easy-to-work with high-level
> API :-).
> -- Nathaniel
Just as a sanity check, do the scipy tests run without producing any such
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the NumPy-Discussion