On Sun, Mar 31, 2013 at 6:14 AM, Ivan Oseledets
I am using numpy 1.6.1, and encountered a wierd fancy indexing bug:
import numpy as np c = np.random.randn(10,200,10);
In [29]: print c[[0,1],:200,:2].shape (2, 200, 2)
In [30]: print c[[0,1],:200,[0,1]].shape (2, 200)
It means, that here fancy indexing is not working right for a 3d array.
On Sat, Mar 30, 2013 at 11:01 AM, Ivan Oseledets
wrote: I am using numpy 1.6.1, and encountered a wierd fancy indexing bug:
import numpy as np c = np.random.randn(10,200,10);
In [29]: print c[[0,1],:200,:2].shape (2, 200, 2)
In [30]: print c[[0,1],:200,[0,1]].shape (2, 200)
It means, that here fancy indexing is not working right for a 3d array.
--> It is working fine, review the docs:
http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#advanced-inde...
In your return, item [0, :] is c[0, :, 0] and item[1, :]is c[1, :, 1].
If you want a return of shape (2, 200, 2) where item [i, :, j] is c[i, :, j] you could use slicing:
c[:2, :200, :2]
or something more elaborate like:
c[np.arange(2)[:, None, None], np.arange(200)[:, None], np.arange(2)]
Jaime --->
Oh! So it is not a bug, it is a feature, which is completely incompatible with other array based languages (MATLAB and Fortran). To me, I can not find a single explanation why it is so in numpy. Taking submatrices from a matrix is a common operation and the syntax above is very natural to take submatrices, not a weird diagonal stuff. i.e.,
c = np.random.randn(100,100) d = c[[0,3],[2,3]]
should NOT produce two numbers! (and you can not do it using slices!)
In MATLAB and Fortran c(indi,indj) will produce a 2 x 2 matrix. How it can be done in numpy (and why the complications?)> So, please consider this message as a feature request.
Numpy's handling of such things is strictly more general than MATLAB and Fortran's (AFAIK), and fits in with the rest of the system quite nicely, I'd say. The logic is: if you do a[row_coords, col_coords] then it treats row_coords and col_coords as parallel arrays, where each corresponding pair of entries in the two arrays gives the coordinates of one entry in the result -- so you get [a[row_coords[0], col_coords[0]], a[row_coords[1], col_coords[1]], ...]. This follows numpy's usual rules for arrays that are supposed to align: just like if you did 'row_coords + col_coords', say, which would give you [row_coords[0] + col_coords[0], row_coords[1] + col_coords[1], ...]. AND, all of this works just the same if row_coords and col_coords are arbitrary-dimensional arrays: the output of indexing (just like the output of '+') will have the same dimensionality as its input. So if they're both 2x2 arrays, then a[row_coords, col_coords] will be [[a[row_coords[0, 0], col_coords[0, 0]], a[row_coords[0, 1], col_coords[0, 1]], [a[row_coords[1, 0], col_coords[1, 0]], a[row_coords[1, 1], col_coords[1, 1]], ] AND (here's the solution to your problem), this "aligning" uses the same rules as "+" does, i.e., broadcasting applies: http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html So for your example, write c[[[0], [1]], :200, [[0, 1]]] which by broadcasting is equivalent to c[[[0, 1], [0, 1]], :200, [[0, 0], [1, 1]]] which is what you want. The only problem then is that slicing indexing happens "inside" fancy indexing ("fancy indexing" is the name for this line-up-the-arrays-and-extract-coordinates thing). So what this means is that when you use both inside a single indexing operation, the what it does is for each set of corresponding fancy indexing coordinates, it doesn't just extract a single element for the final array -- it extracts an entire sub-array. The practical result is that the shape of your output array will be the shape of your aligned fancy indexing arrays (after broadcasting), and then with any sliced axes stuck on the end. So you might expect that the above expression would give you an array of shape (2, 200, 2) but in fact it will be of shape (2, 2, 200) because the (2, 2) part is the shape of the fancy index arrays, and the (200,) is the shape of the slices. (Notice that there is no relation at all between the shape of the input array and the shape of the fancy indexes!) You can fix this up with a call to rollaxes(), or by getting even fancier, using fancy indexes for all three axes, and making sure they each broadcast to the shape (2, 200, 2): coord1 = np.asarray([0, 1]).reshape((2, 1, 1)) coord2 = np.arange(200).reshape((1, 200, 1)) coord3 = np.asarray([0, 1]).reshape((1, 1, 2)) c[coord1, coord2, coord3] -n