[Numpy-discussion] Logical indexing and higher-dimensional arrays.

Wed Feb 8 09:49:50 EST 2012

On Feb 8, 2012, at 8:29 AM, Sturla Molden wrote:

> On 08.02.2012 06:01, Travis Oliphant wrote:
> 
>> Recall that the shape of the output with fancy indexing is determined by broadcasting together the indexing objects and using that as the shape of the output:
>> 
>> x[ind1, ind2] will produce an output with the shape of "broadcast(ind1, ind2)" whose elements are selected by the broadcasted tuple.
> 
> 
> 
> 
> Thank you for the explanation, Travis.
> 
> I think my main confusion is why we broadcast indices at all. 
> Broadcasting is something I would expect to happen for binary operations 
> between ndarrays, not for indexing.

We broadcast because of the "zip-based" semantics of current fancy-indexing which is basically an element-by-element operation:    x[ind1, ind2]  will choose it's elements out of x by taken elements from ind1 as the first index and elements out of ind2 as the second. 

As an essentially element-by-element operation, if the shapes of the input arrays are not exactly the same, it is natural to broadcast them to the same shape.   This is fairly intuitive if you are used to broadcasting in NumPy.   The problem is that it does not coincide other intuitions in certain cases. 

This behavior actually allows easy-to-access "cross-product" behavior by broadcasting a 1-d ind1 shaped as (4,) with a ind2 shaped as (4,1).  The numpy.ix_  function does the basic reshaping needed so that 0:2, 0:2 matches [0,1],[0,1] indexing:   x[ix_([0,1],[0,1])] is the same as x[0:2,0:2].  

There are also some very nice applications where you can select out of a 3-d volume a depth-surface defined by indexes like so: 

	arr[ i[:,newaxis], j, depth]

where arr is a 3-d array,  i and j are 1-d index arrays: i = arange(arr.shape[0]) and j = arange(arr.shape[1]), and depth is a 2-d array of "depths".   The selected result will be 2-d. 

When you understand what is happening, it is consistent and it does make sense and it has some very nice applications.    I recognize that people get confused because their expectations are different, but the current behavior can be understood with a fairly simple mental model. 

I could support, however, a move to push this style of indexing to a method, and make fancy-indexing use cross-product semantics if: 

	* we do it in a phased-manner with lot's of deprecation warnings and even a tool that helps you change code (the tool would have to "play" your code and make a note of the locations where this style of indexing was used --- because static code analysis wouldn't know which objects were arrays and which weren't).  

	* we also make the ndarray object general enough so that fancy-indexing could be returned as a view on the array (the same as regular indexing)

This sort of thing would take time, but is not out of the question in my mind because I suspect the number of users and use-cases of "broadcasted" fancy-indexing is small.  

-Travis

> 
> (But it might be that I don't fully understand the problem.)
> 
> Sturla
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion