[Numpy-discussion] On responding to dubious ideas (was: Re: Advanced indexing: "fancy" vs. orthogonal)

Thu Apr 9 02:22:17 EDT 2015

On Wed, Apr 8, 2015 at 4:02 PM, Alan G Isaac <alan.isaac at gmail.com> wrote:
> 1. I use numpy in teaching.
> I have never heard a complaint about its indexing behavior.
> Have you heard such complaints?

Some observations:

1) There's an unrelated thread on numpy-discussion right now in which
a user is baffled by the interaction between slicing and integer fancy
indexing:
    http://thread.gmane.org/gmane.comp.python.numeric.general/60321
AFAICT, one of the three replies doesn't actually make sense either:
its explanation relies on broadcasting two arrays with shape (5,)
against each other to produce an array with shape (5, 5), which is
not how broadcasting works. To be fair, though, this isn't the
poster's fault, because they are quoting the documentation!
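
For concreteness, here is a minimal sketch of the broadcasting point
(the arrays a, b and arr are made up for illustration; they are not
taken from the other thread). Two (5,) arrays broadcast to (5,), and
you only get (5, 5) if one of them carries an explicit extra axis:

```python
import numpy as np

a = np.arange(5)
b = np.arange(5)

# Two (5,) arrays broadcast against each other to (5,), not (5, 5).
flat_shape = np.broadcast(a, b).shape

# Getting (5, 5) requires one operand to carry an explicit extra axis.
outer_shape = np.broadcast(a[:, np.newaxis], b).shape

# The same rule drives integer fancy indexing on a hypothetical 5x5 array.
arr = np.arange(25).reshape(5, 5)
diagonal = arr[a, b]                # pairs up (a[k], b[k])
block = arr[a[:, np.newaxis], b]    # every (a[i], b[j]) combination

print(flat_shape, outer_shape, diagonal.shape, block.shape)
```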

2) Again, entirely by coincidence, literally this week a numpy user at
Berkeley felt spontaneously moved to message the campus py4science
list to warn everyone about the bizarre behaviour they had stumbled
on, where arr[0, :, idx] produced inexplicable results. They had
already found the docs and worked out what was going on; they just
felt it was necessary to warn everyone else to be careful out there.
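
I don't have their actual array, but a sketch with a hypothetical
(2, 3, 4) array and a two-element idx shows the kind of surprise
involved: because the integer 0 and idx are separated by a slice, the
fancy-indexed dimension is moved to the front of the result instead of
staying where idx appeared.

```python
import numpy as np

# Hypothetical data, chosen only to make the shapes easy to follow.
arr = np.arange(24).reshape(2, 3, 4)
idx = [0, 2]

# Indexing in two steps keeps the axes in the order you might expect.
two_step = arr[0][:, idx]      # shape (3, 2)

# Doing it in one step puts the idx dimension first.
one_step = arr[0, :, idx]      # shape (2, 3)

print(two_step.shape, one_step.shape)
print(np.array_equal(one_step, two_step.T))
```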

3) I personally regularly get confused by integer fancy indexing. I
actually understand it substantially better due to thinking it through
while reading these threads, but I'm a bit disturbed that I had that
much left to learn. (New key insight: you can think of *scalar*
indexing arr[i, j, k] as a function f(i, j, k) -> value. If you take
that function and make it a ufunc, then you have integer fancy
indexing. ...Though there's still an extra pound of explanation needed
to describe the mixed slice/fancy cases, it at least captures the
basic intuition. Maybe this was already obvious to everyone else, but
it helped me.)
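
Here is a rough sketch of that intuition, assuming a small made-up 2-d
array. I'm using np.frompyfunc to turn the scalar lookup into a ufunc;
this is only an illustration of the idea, not how numpy implements
indexing internally. The names scalar_index, rows and cols are
hypothetical.

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)

# Scalar indexing viewed as a plain function f(i, j) -> value.
def scalar_index(i, j):
    return arr[i, j]

# Wrap it as a ufunc with 2 inputs and 1 output; it now broadcasts
# its arguments and returns an object-dtype array.
index_ufunc = np.frompyfunc(scalar_index, 2, 1)

rows = np.array([[0], [2]])    # shape (2, 1)
cols = np.array([1, 3])        # shape (2,)

via_ufunc = index_ufunc(rows, cols).astype(arr.dtype)
via_fancy = arr[rows, cols]

print(via_ufunc)
print(np.array_equal(via_ufunc, via_fancy))
```

The index arrays broadcast to (2, 2) in both cases, so the two results
should agree elementwise.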

4) Even with my New and Improved Explanatory Powers, when this thread
came up while chatting with Thomas Kluyver today, I attempted to
provide a simple, accurate description of how numpy indexing works so
that the debate would make sense, and his conclusion was (paraphrased)
"okay, now I don't understand numpy indexing anymore and never did". I
say this not to pick on Thomas, but to make the point that Thomas is a
pretty smart guy, so maybe this is actually confusing. (Or maybe I'm
just terrible at explaining things.)

I actually think the evidence is very very strong that numpy's current
way of mixing integer fancy indexing and slice-based indexing is a
mistake. It's just not clear whether there's anything we can do to
mitigate that mistake (or indeed, what would actually be better even
if we could start over from scratch). (Which we can't.)

> 2. One reason I use numpy in teaching is its indexing behavior.
> What specific language provides a better indexing model,
> in your opinion?
>
> 3. I admit, my students are NOT using non-boolean fancy indexing on
> multidimensional arrays. (As far as I know.)  Are yours?

Well, okay, this would explain it, since integer fancy indexing is
exactly the confusing case :-) On the plus side, this also means that
even if pigs started doing barrel-rolls through hell's
winter-vortex-chilled air tomorrow and we simply removed integer fancy
indexing, your students would be unaffected :-)

-n

-- 
Nathaniel J. Smith -- http://vorpus.org