[Numpy-discussion] Advanced indexing: "fancy" vs. orthogonal

Matthew Brett matthew.brett at gmail.com
Thu Apr 2 22:30:52 EDT 2015


Hi,

On Thu, Apr 2, 2015 at 6:09 PM,  <josef.pktd at gmail.com> wrote:
> On Thu, Apr 2, 2015 at 8:02 PM, Eric Firing <efiring at hawaii.edu> wrote:
>> On 2015/04/02 1:14 PM, Hanno Klemm wrote:
>>> Well, I have written quite a bit of code that relies on fancy
>>> indexing, and I think the ship has sailed on the question of whether
>>> the behaviour of the [] operator should be changed, with numpy now at
>>> version 1.9. Given the number of packages that rely on numpy, changing
>>> this fundamental behaviour would not be a clever move.
>>
>> Are you *positive* that there is no clever way to make a transition?
>> It's not worth any further thought?
>
> I guess it would be similar to python 3 string versus bytes, but
> without the overwhelming benefits.
>
> I don't think I would be in favor of deprecating fancy indexing even
> if it were possible. In general, my impression is that if there is a
> trade-off in numpy between powerful machinery versus easy to learn and
> teach, then the design philosophy went in favor of power.
>
> I think numpy indexing is not too difficult and follows a consistent
> pattern, and I completely avoid mixing slices and index arrays with
> ndim > 2.

I'm sure y'all are totally on top of this, but for myself, I would
like to distinguish:

* fancy indexing with boolean arrays - I use it all the time and don't
get confused;
* fancy indexing with non-boolean arrays - horrendously confusing,
almost never use it, except on a single axis when I can't confuse it
with orthogonal indexing:

In [3]: a = np.arange(24).reshape(6, 4)

In [4]: a
Out[4]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23]])

In [5]: a[[1, 2, 4]]
Out[5]:
array([[ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [16, 17, 18, 19]])
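
To make the two-axis case concrete, here is a minimal sketch of where the confusion comes in. It uses the same array as above; the index lists `rows` and `cols` are made up for illustration, and `np.ix_` is the existing numpy helper that builds an open mesh for orthogonal-style selection:

```python
import numpy as np

a = np.arange(24).reshape(6, 4)
rows = [1, 2, 4]   # illustrative row indices
cols = [0, 1, 3]   # illustrative column indices

# Current non-boolean fancy indexing: the index lists are broadcast
# against each other, so this picks a[1, 0], a[2, 1] and a[4, 3].
fancy = a[rows, cols]            # shape (3,), values 4, 9, 19

# Orthogonal ("outer product") selection: every listed row crossed
# with every listed column, spelled today with np.ix_.
orthogonal = a[np.ix_(rows, cols)]   # shape (3, 3)

# Boolean indexing on a single axis: keep rows whose first entry is > 8.
mask = a[:, 0] > 8
selected = a[mask]               # rows starting with 12, 16 and 20

print(fancy)
print(orthogonal)
print(selected)
```

With a single index list, as in In [5], both readings give the same answer, which is why that case is hard to get wrong.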

I also remember a discussion with Travis O where he was also saying
that this indexing was confusing and that it would be good if there
was some way to transition to what he called outer product indexing (I
think that's the same as 'orthogonal' indexing).

> I think it should be DOA, except as a discussion topic for numpy 3000.

I think there are two proposals here:

1) Add some syntactic sugar to allow orthogonal indexing of numpy
arrays, no backward compatibility break.

That seems like a very good idea to me - were there any big objections to that?

2) Over some long time period, move the default behavior of np.array
non-boolean indexing from the current behavior to the orthogonal
behavior.

That is going to be very tough, because it will cause very confusing
breakage of legacy code.

On the other hand, maybe it is worth going some way towards that, like this:

* implement orthogonal indexing as a method arr.sensible_index[...]
* implement the current non-boolean fancy indexing behavior as a
method - arr.crazy_index[...]
* deprecate non-boolean fancy indexing as standard arr[...] indexing;
* wait a long time;
* remove non-boolean fancy indexing as standard arr[...] (errors are
preferable to a change in behavior).
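
As a rough idea of what the orthogonal accessor in the first step might do, here is a sketch written as a standalone wrapper. The class name `OrthogonalIndexer` is hypothetical and nothing like it exists in numpy; it is only an assumption about the semantics, namely that each index is applied to its own axis independently. It handles integers, slices and 1-D integer or boolean sequences, and deliberately rejects Ellipsis and None:

```python
import numpy as np


class OrthogonalIndexer:
    """Hypothetical helper (not a numpy API): orthogonal indexing sketch."""

    def __init__(self, arr):
        self._arr = np.asarray(arr)

    def __getitem__(self, key):
        if not isinstance(key, tuple):
            key = (key,)
        if len(key) > self._arr.ndim:
            raise IndexError("too many indices for array")
        result = self._arr
        axis = 0  # axis of `result` that the next index applies to
        for k in key:
            prefix = (slice(None),) * axis
            if isinstance(k, (int, np.integer)):
                # An integer selects one item and drops the axis,
                # so the axis counter does not advance.
                result = result[prefix + (k,)]
                continue
            if not isinstance(k, slice):
                k = np.asarray(k)
                if k.ndim != 1:
                    raise IndexError("only 1-D index sequences are supported")
            # Apply this index along its own axis only, so index
            # arrays on different axes are never broadcast together.
            result = result[prefix + (k,)]
            axis += 1
        return result


a = np.arange(24).reshape(6, 4)
oi = OrthogonalIndexer(a)
block = oi[[1, 2, 4], [0, 1, 3]]      # same as a[np.ix_([1, 2, 4], [0, 1, 3])]

b = np.arange(60).reshape(3, 4, 5)
ortho_3d = OrthogonalIndexer(b)[[0, 2], :, [1, 3]]   # shape (2, 4, 2)
fancy_3d = b[[0, 2], :, [1, 3]]                      # shape (2, 4)
print(block.shape, ortho_3d.shape, fancy_3d.shape)
```

The 3-D case at the end shows the kind of difference that mixing slices and index arrays produces under the current rules: the orthogonal result keeps one axis per index, while the current behavior broadcasts the two index lists into a single leading axis.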

Then if we are brave we could:

* wait a very long time;
* make orthogonal indexing the default.

But the not-brave steps above seem less controversial, and fairly reasonable.

What about that as an approach?

Cheers,

Matthew


