[Numpy-discussion] add .H attribute?

Nathaniel Smith njs at pobox.com
Wed Jul 24 06:54:28 EDT 2013


On Wed, Jul 24, 2013 at 9:23 AM, Dave Hirschfeld
<dave.hirschfeld at gmail.com> wrote:
> If we're voting my vote goes to add the .H attribute for all the reasons
> Alan has specified. Document that it returns a copy but that it may in
> future return a view, so it is not future-proof to operate on the result
> inplace.

As soon as you talk about attributes "returning" things you've already
broken Python's mental model... attributes are things that sit there,
not things that execute arbitrary code. Of course this is not how the
actual implementation works; attribute access *can* in fact execute
arbitrary code. But the mental model is important, so we should
preserve it wherever we can. Just mentioning an attribute should not
cause unbounded memory allocations.
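
To make that concrete, here is a minimal sketch of an attribute that
does real work on every access. The HermitianDemo class and its T and H
properties are hypothetical, made up for illustration; they are not
numpy API. Complex input is assumed, so that conj() really has to
allocate a new array.

```python
import numpy as np


class HermitianDemo:
    """Hypothetical wrapper used only for illustration; not part of numpy."""

    def __init__(self, data):
        self.data = np.asarray(data)

    @property
    def T(self):
        # A transpose is a view: no new data buffer is allocated.
        return self.data.T

    @property
    def H(self):
        # For complex data, conj() allocates a full new array on every access.
        return self.data.conj().T


m = HermitianDemo(np.ones((3, 3), dtype=complex))

t_shares = np.shares_memory(m.T, m.data)            # view onto the same buffer
h_shares = np.shares_memory(m.H, m.data)            # separate buffer
h_twice_distinct = not np.shares_memory(m.H, m.H)   # two accesses, two buffers
```

Both m.T and m.H look the same at the call site, but only the second
one costs memory proportional to the size of the array each time it is
mentioned.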

Consider these two expressions:
  x = solve(dot(arr, arr.T), arr.T)
  x = solve(dot(arr, arr.H), arr.H)

Mathematically, they're very similar, and the mathematics-like
notation does a good job of expressing that similarity while hiding
mathematically irrelevant details. Which is what mathematical notation
is for.

But numpy isn't a toolkit for writing mathematical formulas; it's a
toolkit for writing computational algorithms that implement
mathematical formulas, and algorithmically, those two expressions are
radically different. The first one allocates one temporary (the result
from 'dot'); the second one allocates three temporaries (one copy for
each mention of arr.H, assuming .H returns a copy, plus the result from
'dot'). The second one is gratuitously inefficient, since two of those
temporaries are identical, but they're being computed twice anyway.
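
As a rough sketch of the difference, here are both computations written
out against current numpy. The conjugate transpose is spelled
arr.conj().T, since the .H attribute under discussion does not exist on
ndarray, and the 2x2 matrix is an arbitrary invertible example chosen
for illustration.

```python
import numpy as np
from numpy.linalg import solve

# Arbitrary invertible complex matrix (upper triangular, nonzero diagonal).
arr = np.array([[2 + 1j, 1], [0, 3 - 1j]])

# Transpose version: arr.T is a view, so the only temporary passed to
# solve is the result of dot.
x_t = solve(np.dot(arr, arr.T), arr.T)

# Conjugate-transpose version, written so the copy is made exactly once
# and then reused, instead of being recomputed at each mention.
arr_h = arr.conj().T
x_h = solve(np.dot(arr, arr_h), arr_h)
```

Naming the copy as arr_h makes the allocation visible in the code,
which is the information a copying attribute would hide.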

> I'm sceptical that there's much code out there actually relying on the fact
> that a transpose is a view with the specified intention of altering the
> original array inplace.
>
> I work with a lot of beginners and whenever I've seen them operate inplace
> on a transpose it has been a bug in the code, leading to a discussion of
> how, for performance reasons, numpy will return a view where possible,
> leading to yet further discussion of when it is and isn't possible to return
> a view.

The point isn't that there's code that relies specifically on .T
returning a view. It's that to be a good programmer, you need to *know
whether* it returns a view -- exactly as you say in the second
paragraph. And a library should not hide these kinds of details.
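
For what it's worth, here is a small sketch of why the distinction
matters in practice, using np.shares_memory to check whether two arrays
share a buffer. The array contents are arbitrary.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

# .T is a view: writing through it changes the original array.
t = a.T
t[0, 1] = 99          # same element as a[1, 0]

# An explicit copy is independent of the original.
c = a.T.copy()
c[0, 0] = -1          # a[0, 0] is left alone

view_shares = np.shares_memory(a, t)
copy_shares = np.shares_memory(a, c)
```

After this runs, a[1, 0] has been overwritten through the view while
a[0, 0] is untouched by the write to the copy.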

-n


