[Numpy-discussion] Advanced indexing: "fancy" vs. orthogonal

Matthew Brett matthew.brett at gmail.com
Thu Apr 2 23:29:11 EDT 2015


Hi,

On Thu, Apr 2, 2015 at 8:20 PM, Jaime Fernández del Río
<jaime.frio at gmail.com> wrote:
> On Thu, Apr 2, 2015 at 7:30 PM, Matthew Brett <matthew.brett at gmail.com>
> wrote:
>>
>> Hi,
>>
>> On Thu, Apr 2, 2015 at 6:09 PM,  <josef.pktd at gmail.com> wrote:
>> > On Thu, Apr 2, 2015 at 8:02 PM, Eric Firing <efiring at hawaii.edu> wrote:
>> >> On 2015/04/02 1:14 PM, Hanno Klemm wrote:
>> >>> Well, I have written quite a bit of code that relies on fancy
>> >>> indexing, and I think the ship has sailed on the question of whether
>> >>> the behaviour of the [] operator should be changed, with numpy now
>> >>> at version 1.9. Given the number of packages that rely on numpy,
>> >>> changing this fundamental behaviour would not be a clever move.
>> >>
>> >> Are you *positive* that there is no clever way to make a transition?
>> >> It's not worth any further thought?
>> >
>> > I guess it would be similar to python 3 string versus bytes, but
>> > without the overwhelming benefits.
>> >
>> > I don't think I would be in favor of deprecating fancy indexing even
>> > if it were possible. In general, my impression is that if there is a
>> > trade-off in numpy between powerful machinery versus easy to learn and
>> > teach, then the design philosophy went in favor of power.
>> >
>> > I think numpy indexing is not too difficult and follows a consistent
>> > pattern, and I completely avoid mixing slices and index arrays with
>> > ndim > 2.
>>
>> I'm sure y'all are totally on top of this, but for myself, I would
>> like to distinguish:
>>
>> * fancy indexing with boolean arrays - I use it all the time and don't
>> get confused;
>> * fancy indexing with non-boolean arrays - horrendously confusing,
>> almost never use it, except on a single axis when I can't confuse it
>> with orthogonal indexing:
>>
>> In [3]: a = np.arange(24).reshape(6, 4)
>>
>> In [4]: a
>> Out[4]:
>> array([[ 0,  1,  2,  3],
>>        [ 4,  5,  6,  7],
>>        [ 8,  9, 10, 11],
>>        [12, 13, 14, 15],
>>        [16, 17, 18, 19],
>>        [20, 21, 22, 23]])
>>
>> In [5]: a[[1, 2, 4]]
>> Out[5]:
>> array([[ 4,  5,  6,  7],
>>        [ 8,  9, 10, 11],
>>        [16, 17, 18, 19]])
>>
>> I also remember a discussion with Travis O where he was also saying
>> that this indexing was confusing and that it would be good if there
>> was some way to transition to what he called outer product indexing (I
>> think that's the same as 'orthogonal' indexing).
>>
>> > I think it should be DOA, except as a discussion topic for numpy 3000.
>>
>> I think there are two proposals here:
>>
>> 1) Add some syntactic sugar to allow orthogonal indexing of numpy
>> arrays, no backward compatibility break.
>>
>> That seems like a very good idea to me - were there any big objections to
>> that?
>>
>> 2) Over some long time period, move the default behavior of np.array
>> non-boolean indexing from the current behavior to the orthogonal
>> behavior.
>>
>> That is going to be very tough, because it will cause very confusing
>> breakage of legacy code.
>>
>> On the other hand, maybe it is worth going some way towards that, like
>> this:
>>
>> * implement orthogonal indexing as a method arr.sensible_index[...]
>> * implement the current non-boolean fancy indexing behavior as a
>> method - arr.crazy_index[...]
>> * deprecate non-boolean fancy indexing as standard arr[...] indexing;
>> * wait a long time;
>> * remove non-boolean fancy indexing as standard arr[...] (errors are
>> preferable to change in behavior)
>>
>> Then if we are brave we could:
>>
>> * wait a very long time;
>> * make orthogonal indexing the default.
>>
>> But the not-brave steps above seem less controversial, and fairly
>> reasonable.
>>
>> What about that as an approach?
>
>
> Your option 1 was what was being discussed before the posse was assembled to
> bring fancy indexing before justice... ;-)

Yes, sorry - I was trying to bring the argument back there.
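
For concreteness, here is a minimal sketch of the difference, using the
same array as in my earlier example. np.ix_ is existing NumPy; the
OrthogonalIndexer class is a made-up name for illustration only, not an
existing or proposed NumPy API, and it assumes each index is a slice or a
1-D sequence (scalar integers are not handled).

```python
import numpy as np

a = np.arange(24).reshape(6, 4)
rows = [1, 2, 4]
cols = [0, 2, 3]

# Current "fancy" behavior: the index arrays are broadcast together, so
# this picks the three elements a[1, 0], a[2, 2] and a[4, 3].
fancy = a[rows, cols]

# Orthogonal (outer product) behavior: every selected row is crossed with
# every selected column, giving a 3 x 3 block.
ortho = a[np.ix_(rows, cols)]


class OrthogonalIndexer:
    """Hypothetical wrapper that indexes each axis independently.

    Assumption: every index is a slice or a 1-D sequence, so no axis is
    dropped along the way.
    """

    def __init__(self, arr):
        self.arr = arr

    def __getitem__(self, key):
        if not isinstance(key, tuple):
            key = (key,)
        result = self.arr
        # Apply one index per axis in turn, so the index arrays never
        # broadcast against each other.
        for axis, index in enumerate(key):
            if not isinstance(index, slice):
                index = np.asarray(index)
            selector = (slice(None),) * axis + (index,)
            result = result[selector]
        return result
```

With this sketch, OrthogonalIndexer(a)[rows, cols] gives the same 3 x 3
block as a[np.ix_(rows, cols)], while a[rows, cols] keeps its current
meaning.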

> My background is in image processing, and I have used fancy indexing in all
> its fanciness far more often than orthogonal or outer product indexing. I
> actually have a vivid memory of the moment I fell in love with NumPy: after
> seeing a code snippet that ran a huge image through a look-up table by
> indexing the LUT with the image. Beautifully simple. And here is a younger
> me, learning to ride NumPy without the training wheels.
>
> Another obvious use case that you can find all over the place in
> scikit-image is drawing a curve on an image from the coordinates.

No question at all that it has its uses - but then again, no-one
thinks it should become unavailable, only that, maybe, in the very far
future, it should not be what you get by default...
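
As a rough sketch of the two uses mentioned above (the look-up table, the
image values and the curve coordinates are all made up for illustration):

```python
import numpy as np

# Look-up table mapping each 8-bit intensity to its inverse (255 - value).
lut = np.arange(256)[::-1].astype(np.uint8)

# A small made-up "image" of 8-bit intensities.
image = np.array([[0, 10, 255],
                  [128, 1, 254]], dtype=np.uint8)

# Indexing the LUT with the image returns an array shaped like the image.
inverted = lut[image]

# Drawing a curve: row and column coordinates are paired point by point,
# which is the current fancy-indexing behavior.
canvas = np.zeros((5, 5), dtype=np.uint8)
rr = np.array([0, 1, 2, 3, 4])
cc = np.array([0, 1, 1, 2, 4])
canvas[rr, cc] = 1  # sets only the five (row, column) pairs
```

Both rely on the index arrays being matched element by element, so they
would need the explicit fancy form if orthogonal indexing ever became the
default.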

Cheers,

Matthew


