[Numpy-discussion] Copy vs View for array[array] (was Histograms via indirect index arrays)

Rick White rlw at stsci.edu
Fri Mar 17 14:57:10 EST 2006


On Fri, 17 Mar 2006, Tim Hochberg wrote:

> In theory I'm all for view semantics for an array indexed by an array
> (I'm sure we have a good name for that, but it's escaping me). Indexing
> in numpy can be confusing enough without some indexing operations
> returning views and others copies. This is orthogonal to any issues of
> performance.
>
> In practice, I'm a bit skeptical. The result would need to be some sort
> of pseudo array object (similar to array.flat). Operations on this
> object would necessarily have worse performance than operations on a
> normal array due to the added level of indirection. In some
> circumstances it would also hold onto a lot of memory that might
> otherwise be freed since it holds a reference to the data for both the
> original array and the index array.

Actually I think it is worse than that -- it seems to me that it
actually has to make a *copy* of the index array.  I don't think
that we would want to keep only a reference to the index array,
since if it changed then the view could respond by changing in very
unexpected ways.  That sounds like a nightmare side-effect to me.
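
A minimal sketch of the side effect described above. LazyIndexedView is a
hypothetical class invented for illustration (nothing like it exists in
numpy); it defers a[idx] and either keeps a reference to the index array
or copies it:

```python
import numpy as np


class LazyIndexedView:
    """Hypothetical deferred a[idx]; an illustration, not a numpy API."""

    def __init__(self, base, index, copy_index):
        self.base = base
        # Either share the caller's index array or take a private copy.
        self.index = index.copy() if copy_index else index

    def evaluate(self):
        # The indexing only happens here, when the values are needed.
        return self.base[self.index]


a = np.array([10, 20, 30, 40])
idx = np.array([0, 1])

by_reference = LazyIndexedView(a, idx, copy_index=False)
by_copy = LazyIndexedView(a, idx, copy_index=True)

# The caller reuses the index array for something unrelated.
idx[:] = [2, 3]

ref_result = by_reference.evaluate()   # silently follows the changed idx
copy_result = by_copy.evaluate()       # still refers to elements 0 and 1
```

Only the copying variant keeps selecting the elements the caller
originally asked for, which is why a view would seem to need its own copy
of the index array.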

That's what has always made me think that this is not a good idea,
even if the bookkeeping of carrying around an unevaluated array+indices
could be worked out efficiently.  In my applications I sometimes
use very large index arrays, and I don't want to have to copy them
unnecessarily.  Generally I much prefer instant evaluation as in
the current implementation, since that uses the minimum of memory.
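
For comparison, a short sketch of the instant evaluation described above,
assuming a NumPy recent enough to provide np.shares_memory. The result of
a[idx] is an independent array, so neither the original array nor the
index array has to be kept alive or copied on its behalf:

```python
import numpy as np

a = np.arange(10)
idx = np.array([1, 3, 5])

b = a[idx]       # evaluated immediately into a new array

b[0] = -1        # modifying the result does not touch a
idx[1] = 9       # modifying the index array does not touch b
a[5] = 100       # modifying the original does not touch b

shares = np.shares_memory(a, b)   # False: b owns its own data
```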

For what it's worth, IDL behaves exactly like the current numpy:
a[idx] += 1 increments each indexed element by 1 exactly once, regardless
of how many times that element's index appears in idx.
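
A small sketch of that behaviour with repeated indices. The np.add.at and
np.bincount calls are additions for contrast, not part of the original
discussion; they are available in later NumPy releases and accumulate
once per occurrence, which is what a histogram needs:

```python
import numpy as np

idx = np.array([0, 0, 2, 2, 2])

# Buffered in-place add: each distinct index is incremented only once.
a = np.zeros(4, dtype=int)
a[idx] += 1

# Unbuffered add: every occurrence of an index contributes.
counts = np.zeros(4, dtype=int)
np.add.at(counts, idx, 1)

# Equivalent histogram of the index values.
hist = np.bincount(idx, minlength=4)
```

Here a ends up with a single increment at positions 0 and 2, while counts
and hist record how often each index occurred.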



