[Numpy-discussion] Two questions on indexing

Thu Sep 16 04:31:44 EDT 2010

2010/9/15 Mark Fenner <mfenner at gmail.com>:
> One method of using indices seems to be as follows:
>
> In [19]: a = N.array(range(6)).reshape(3,2)
> In [20]: i = N.indices([3,2])
> In [21]: r,c = i
> In [22]: a[r,c]
> Out[22]:
> array([[0, 1],
>       [2, 3],
>       [4, 5]])
>
> In [23]: a[tuple(i)]
> Out[23]:
> array([[0, 1],
>       [2, 3],
>       [4, 5]])
>
> *****Question 1:*****
>
> For using the results of argmax, how would one "pad" the indices
> returned by argmax to be usable as follows (note, I resorted to method
> I found on numpy-discussion ... google "take_axis" ... that revolves
> around rolling the desired axis to the first position and then using
> the indices ... I'm curious how to make something like what I have
> below work).  Incidentally, the number of  dimensions can be arbitrary
> (for my needs):
>
> x = N.array([[2, 3],
>                 [1, 5]])
>
> for a in [0, 1]:
>     indices = N.argmax(x, axis=a)
>
>     # perhaps something similar to the following?
>     # grid = N.indices(x.shape)
>     # grid[a] = indices # <--- this is the idea, but it fails to get
>     # the necessary result
>     # usable = tuple(grid)
>
>     assert x[usableIndices] == x.max(axis=a) # or similar

First I will try to understand what you want.

I start with the simplest case, the 1-dimensional one.  There we have only
one axis.  x is something like [1, 42, 3, 4].  numpy.argmax(x, axis=0)
delivers 1 in this case (a scalar), because there are no dimensions left
besides the zeroth.  So grid would be [[0, 1, 2, 3]].  Assigning 1 to
the zeroth component of this grid works in this case, and sets all the
indexing elements to 1.  Thus, when indexing with this grid, you spread
the 42 over the whole axis (axis=0 in this case).  Is that what you want?
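
A minimal sketch of this 1-dimensional case, only to make the spreading
concrete (the name `spread` is mine, not from your post):

```python
import numpy as np

x = np.array([1, 42, 3, 4])

grid = np.indices(x.shape)      # shape (1, 4): [[0, 1, 2, 3]]
grid[0] = x.argmax(axis=0)      # the scalar 1 is broadcast over the whole row

# Indexing with the grid repeats the maximum along axis 0.
spread = x[tuple(grid)]
print(spread)
```

Every position of `spread` holds the maximum, 42.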

For now I assume that I got your point.  It would be a good idea to post
the desired output too.

When going to higher dimensions, say x.shape = (10, 11, 12), then
grid.shape = (3, 10, 11, 12).  That is, each of the three components of
grid again has shape (10, 11, 12).

Your argmax indices array, say for axis=0, has shape (11, 12).  Indexed
with some tuple (j, k) it gives the coordinate of the corresponding
maximum element as (indices[j, k], j, k).  Further, grid[0][i, j, k] gives
the index where to take from in the zeroth axis for element (i, j, k)
of the result array.  You want to spread on the zeroth axis in that
case, meaning you want grid[0][i, j, k] = indices[j, k] for
all i.  For the zeroth axis, this works via broadcasting directly:
grid[0] = indices.

For, say, the first axis, axis=1, indices.shape = (10, 12).  In this case
you want grid[1][i, j, k] = indices[i, k].  To get this working, you
have to reshape to a shape containing a 1 in the appropriate place, in
this case grid[1] = indices.reshape((10, 1, 12)).  This shouldn't be
super-hard: just take the original shape, substitute the axis'th value
by a one, and put this into .reshape().  With the reshaped indices,
broadcasting works as expected.
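
To illustrate the reshape step on its own, here is a small sketch.  It
assumes a NumPy recent enough to provide numpy.take_along_axis (1.15 or
later), which I use only as a cross-check; the names `reshaped` and
`picked` are made up for the example:

```python
import numpy as np

x = np.random.random((10, 11, 12))
axis = 1
indices = x.argmax(axis=axis)            # shape (10, 12)

# Original shape with a 1 substituted for the reduced axis.
shape = list(x.shape)
shape[axis] = 1
reshaped = indices.reshape(shape)        # shape (10, 1, 12)

# np.expand_dims inserts the same length-1 axis.
assert np.array_equal(reshaped, np.expand_dims(indices, axis))

# Assumption: NumPy >= 1.15.  Gathering along the axis with the
# reshaped indices yields the maxima, with the axis kept at length 1.
picked = np.take_along_axis(x, reshaped, axis=axis)
assert np.array_equal(picked, x.max(axis=axis, keepdims=True))
```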

I guess you want to do some stuff with the "corrected" indices,
otherwise x.max(axis=a) would do the job?

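import numpy

x = numpy.random.random((10, 11, 12))

for axis in range(x.ndim):
    maximum_indices = x.argmax(axis=axis)
    # Original shape with a 1 in place of the reduced axis.
    full_shape = list(x.shape)
    full_shape[axis] = 1

    grid = numpy.indices(x.shape)
    # Broadcasting spreads the argmax indices along `axis`.
    grid[axis] = maximum_indices.reshape(full_shape)
    # x[tuple(grid)] now holds the maximum repeated along `axis`.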

There is some room for optimisation because we create the same grid
all the time.  I did not do this now.
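
One possible way to do that optimisation, as a sketch only: build the
grid once, overwrite a single component, and restore it afterwards.  The
function name spread_maxima is hypothetical, and the final loop is just a
check against x.max():

```python
import numpy as np


def spread_maxima(x):
    """For each axis, return x indexed so that the maximum fills that axis."""
    grid = np.indices(x.shape)          # built once, shape (x.ndim,) + x.shape
    results = []
    for axis in range(x.ndim):
        full_shape = list(x.shape)
        full_shape[axis] = 1            # length-1 axis, so broadcasting spreads it
        saved = grid[axis].copy()       # remember the untouched component
        grid[axis] = x.argmax(axis=axis).reshape(full_shape)
        results.append(x[tuple(grid)])  # same shape as x
        grid[axis] = saved              # restore the grid for the next axis
    return results


x = np.random.random((10, 11, 12))
for axis, spread in enumerate(spread_maxima(x)):
    expected = np.broadcast_to(x.max(axis=axis, keepdims=True), x.shape)
    assert np.array_equal(spread, expected)
```

This copies one component per axis instead of rebuilding the whole grid;
whether that matters in practice depends on the array sizes.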

> *****Question 2:*****
>
> A separate question.  Suppose I have a slice for indexing that looks like:
>
> [:, :, 2, :, 5]
>
> How can I get an indexing slice for all OTHER dimension values besides
> those specified.  Conceptually, something like:
>
> [:, :, all but 2, :, all but 5]
>
> Incidentally, the goal is to construct a new array with all those
> "other" spots filled in with zero and the specified spots with their
> original values.  Would it be easier to construct a 0-1 indicator
> array with 1s in the [:,:,2,:,5] positions and multiply it out?  Humm,
> I may have just answered my own question.
>
> For argument sake, how would you do it with indexing/slicing?  I
> suppose one develops some intuition as one gains experience with numpy
> with regards to when to (1) use clever matrix ops and when to (2) use
> clever slicing and when to (3) use a combination of both.

I really think the multiplication approach is a good one, and you can
speed it up using broadcasting.

Assuming x.shape = (10, 11, 12, 13, 14), for your example, you would say then:

factor = numpy.zeros((1, 1, 12, 1, 14), dtype=bool)
factor[:, :, 2] = True
factor[:, :, :, :, 5] = True

x_masked = x * factor

Notice that, as I understand you, you want to have 1s in those cells
which have either the 2 or the 5 or both in their indices.  This means
that a cell is zero iff its indices contain neither the 2 (on axis 2)
nor the 5 (on axis 4).

If, on the contrary, you want to keep only those cells which have both
the 2 and the 5 in their indices, you can use two separate factors built
as above, one per axis, and multiply x by both of them.
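
A sketch of both variants, so the difference is visible.  The names
keep_2, keep_5, x_and and x_or are made up for the example, and I shift
x to be strictly positive only so that a zero always means "masked":

```python
import numpy as np

x = np.random.random((10, 11, 12, 13, 14)) + 1.0

# One factor per constrained axis; each has length 1 everywhere else.
keep_2 = np.zeros((1, 1, 12, 1, 1), dtype=bool)
keep_2[:, :, 2] = True
keep_5 = np.zeros((1, 1, 1, 1, 14), dtype=bool)
keep_5[..., 5] = True

x_and = x * keep_2 * keep_5      # survives only where axis 2 == 2 AND axis 4 == 5
x_or = x * (keep_2 | keep_5)     # survives where axis 2 == 2 OR axis 4 == 5

# The kept cells are unchanged, and the counts match the two readings.
assert np.array_equal(x_and[:, :, 2, :, 5], x[:, :, 2, :, 5])
assert np.count_nonzero(x_and) == 10 * 11 * 13
assert np.count_nonzero(x_or) == 10 * 11 * 13 * (12 + 14 - 1)
```

The factors stay tiny compared to x; broadcasting does the expansion
during the multiplication.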

Friedrich