[Numpy-discussion] On responding to dubious ideas (was: Re: Advanced indexing: "fancy" vs. orthogonal)
Sebastian Berg
sebastian at sipsolutions.net
Thu Apr 9 03:01:26 EDT 2015
On Do, 2015-04-09 at 02:22 -0400, Nathaniel Smith wrote:
> On Wed, Apr 8, 2015 at 4:02 PM, Alan G Isaac <alan.isaac at gmail.com> wrote:
> > 1. I use numpy in teaching.
> > I have never heard a complaint about its indexing behavior.
> > Have you heard such complaints?
>
> Some observations:
>
> 1) There's an unrelated thread on numpy-discussion right now in which
> a user is baffled by the interaction between slicing and integer fancy
> indexing:
> http://thread.gmane.org/gmane.comp.python.numeric.general/60321
> And one of the three replies AFAICT also doesn't actually make sense,
> in that its explanation relies on broadcasting two arrays with shape
> (5,) against each other to produce an array with shape (5, 5). (Which
> is not how broadcasting works.) To be fair, though, this isn't the
> poster's fault, because they are quoting the documentation!
>
> 2) Again, entirely by coincidence, literally this week a numpy user at
> Berkeley felt spontaneously moved to send a warning message to the
> campus py4science list just to warn everyone about the bizarre
> behaviour they had stumbled on where arr[0, :, idx] produced
> inexplicable results. They had already found the docs and worked out
> what was going on, they just felt it was necessary to warn everyone
> else to be careful out there.
>
> 3) I personally regularly get confused by integer fancy indexing. I
> actually understand it substantially better due to thinking it through
> while reading these threads, but I'm a bit disturbed that I had that
> much left to learn. (New key insight: you can think of *scalar*
> indexing arr[i, j, k] as a function f(i, j, k) -> value. If you take
> that function and make it a ufunc, then you have integer fancy
> indexing. ...Though there's still an extra pound of explanation needed
> to describe the mixed slice/fancy cases, it at least captures the
> basic intuition. Maybe this was already obvious to everyone else, but
> it helped me.)
>
> 4) Even with my New and Improved Explanatory Powers, when this thread
> came up chatting with Thomas Kluyver today, I attempted to provide a
> simple, accurate description of how numpy indexing works so that the
> debate would make sense, and his conclusion was (paraphrased) "okay,
> now I don't understand numpy indexing anymore and never did". I say
> this not to pick on Thomas, but to make the point that Thomas is a
> pretty smart guy so maybe this is actually confusing. (Or maybe I'm
> just terrible at explaining things.)
>
> I actually think the evidence is very very strong that numpy's current
> way of mixing integer fancy indexing and slice-based indexing is a
> mistake. It's just not clear whether there's anything we can do to
> mitigate that mistake (or indeed, what would actually be better even
> if we could start over from scratch). (Which we can't.)
>
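A minimal sketch of the "scalar indexing as a ufunc" intuition from point 3, and of the broadcasting remark in point 1. The array and the names scalar_index, i, j, k are made up for illustration, and np.vectorize only stands in for a real ufunc here:

```python
import numpy as np

arr = np.arange(2 * 3 * 4).reshape(2, 3, 4)

def scalar_index(i, j, k):
    # Plain scalar indexing: f(i, j, k) -> value
    return arr[i, j, k]

i = np.array([0, 1])
j = np.array([2, 0])
k = np.array([3, 1])

# Applying the scalar function elementwise over broadcast index arrays ...
elementwise = np.vectorize(scalar_index)(i, j, k)
# ... gives the same result as integer fancy indexing.
fancy = arr[i, j, k]
assert np.array_equal(elementwise, fancy)

# Broadcasting two shape-(5,) arrays gives shape (5,), not (5, 5).
a = np.arange(5)
b = np.arange(5)
same_shape = np.broadcast(a, b).shape
# A (5, 5) result needs one operand reshaped to (5, 1) first.
outer_shape = np.broadcast(a[:, None], b).shape
```

The index arrays broadcast against each other exactly like ufunc operands, which is why getting an "outer product" style selection requires an explicit extra axis (or np.ix_).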
I think the best way to think about the mixing is in terms of
"subspaces": for each individual fancy indexing "element", the slices
define one subspace. I.e. each subspace is something
like:
new[:, 0, :] = arr[:, fancy1[0], fancy2[0], :]
then you iterate the fancy indices so the subspace moves ahead:
new[:, 1, :] = arr[:, fancy1[1], fancy2[1], :]
new[:, 2, :] = arr[:, fancy1[2], fancy2[2], :]
and so on.
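The assignments above can be spelled out as a loop and checked against the real indexing result. This is only a sketch: the shape of arr and the values of fancy1 and fancy2 are made up, and the loop mirrors the description rather than the actual C implementation:

```python
import numpy as np

arr = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)
fancy1 = np.array([0, 2, 1])  # indexes axis 1 of arr
fancy2 = np.array([3, 0, 1])  # indexes axis 2 of arr

# One subspace (the two sliced axes) per fancy-index element.
new = np.empty((arr.shape[0], len(fancy1), arr.shape[3]), dtype=arr.dtype)
for n in range(len(fancy1)):
    new[:, n, :] = arr[:, fancy1[n], fancy2[n], :]

direct = arr[:, fancy1, fancy2, :]
assert np.array_equal(new, direct)
```

Because the two fancy indices are adjacent, the single fancy dimension lands where they were, so the result has shape (2, 3, 5).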
This is also how it is implemented. On top of that, of course, the
fancy dimensions are transposed to the front when the fancy indices are
not consecutive, since then they cannot be put back where they were.
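A small sketch of that transposition, using the arr[0, :, idx] case from the quoted mail. The array shape and idx values are made up for illustration:

```python
import numpy as np

arr = np.arange(2 * 3 * 4).reshape(2, 3, 4)
idx = np.array([0, 1, 2, 3, 0])

# Fancy indices (0 and idx) are adjacent: the fancy dimension stays in place.
adjacent = arr[:, 0, idx]

# Fancy indices are separated by a slice: the fancy dimension moves to the front.
separated = arr[0, :, idx]

# Indexing in two steps avoids the mixing, so nothing is transposed.
two_step = arr[0][:, idx]
assert np.array_equal(separated, two_step.T)
```

Here adjacent has shape (2, 5), separated has shape (5, 3), and two_step has shape (3, 5). Note that the scalar 0 counts as a fancy index once it is combined with an index array.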
I think you mentioned an error in the docs. I thought I had cleared up
some of them, but probably that did not always make things more
understandable. The whole subspace way of thinking is used, but there is
a lot of room for improvement, and I would be happy if more people felt
like stepping up to fill that void, since you do not need to know the
implementation details for that.
- Sebastian
> > 2. One reason I use numpy in teaching is its indexing behavior.
> > What specific language provides a better indexing model,
> > in your opinion?
> >
> > 3. I admit, my students are NOT using non-boolean fancy indexing on
> > multidimensional arrays. (As far as I know.) Are yours?
>
> Well, okay, this would explain it, since integer fancy indexing is
> exactly the confusing case :-) On the plus side, this also means that
> even if pigs started doing barrel-rolls through hell's
> winter-vortex-chilled air tomorrow and we simply removed integer fancy
> indexing, your students would be unaffected :-)
>
> -n
>