[Numpy-discussion] Use-case for np.choose
Anne Archibald
peridot.faceted at gmail.com
Sun Nov 8 22:40:05 EST 2009
As Josef said, this is not correct. I think the key point of confusion is this:
Do not pass choose two arrays.
Pass it one array and a *list* of arrays. The fact that choices can be
an array is a quirk we can't change, but you should think of the
second argument as a list of arrays, possibly of different shapes.
np.choose(np.ones((2,1,1)), [ np.ones((1,3,1)), np.ones((1,1,5)) ] )
Here the first argument is an array of choices. The second argument is
a *list* - if you cast it to an array you'll get an error or nonsense
- of arrays to choose from. The broadcasting ensures the first
argument and *each element of the list* are the same shape. The only
constraint on the number of arrays in the list is that it be larger
than the largest value in a.
If you try to make the second argument into a single array, for one
thing you are throwing away useful generality by forcing each choice
to be the same shape (and real shape, not zero-strided fake shape),
and for another the broadcasting becomes very hard to understand.
Anne
2009/11/8 David Goldsmith <d.l.goldsmith at gmail.com>:
> OK, now I'm trying to wrap my brain around broadcasting in choose when both
> `a` *and* `choices` need to be (non-trivially) broadcast in order to arrive
> at a common shape, e.g.:
>
>>>> c=np.arange(4).reshape((2,1,2)) # shape is (2,1,2)
>>>> a=np.eye(2, dtype=int) # shape is (2,2)
>>>> np.choose(a,c)
> array([[2, 1],
> [0, 3]])
>
> (Unfortunately, the implementation is in C, so I can't easily insert print
> statements to see intermediate results.)
>
> First, let me confirm that the above is indeed an example of what I think it
> is, i.e., both `a` and `choices` are broadcast in order for this to work,
> correct? (And if incorrect, how is one broadcast to the shape of the
> other?) Second, both are broadcast to shape (2,2,2), correct? But how,
> precisely, i.e., does c become
>
> [[[0, 1], [2, 3]], [[[0, 1], [0, 1]],
> [[0, 1], [2, 3]]] or [[2, 3], [2, 3]]]
>
> and same question for a? Then, once a is broadcast to a (2,2,2) shape,
> precisely how does it "pick and choose" from c to create a (2,2) result?
> For example, suppose a is broadcast to:
>
> [[[1, 0], [0, 1]],
> [[1, 0], [0, 1]]]
>
> (as indicated above, I'm uncertain at this point if this is indeed what a is
> broadcast to); how does this create the (2,2) result obtained above?
> (Obviously this depends in part on precisely how c is broadcast, I do
> recognize that much.)
>
> Finally, a seemingly relevant comment in the C source is:
>
> /* Broadcast all arrays to each other, index array at the end.*/
>
> This would appear to confirm that "co-broadcasting" is performed if
> necessary, but what does the "index array at the end" phrase mean?
>
> Thanks for your continued patience and tutelage.
>
> DG
>
> On Sun, Nov 8, 2009 at 5:36 AM, <josef.pktd at gmail.com> wrote:
>>
>> On Sun, Nov 8, 2009 at 5:00 AM, David Goldsmith <d.l.goldsmith at gmail.com>
>> wrote:
>> > On Sun, Nov 8, 2009 at 12:57 AM, Anne Archibald
>> > <peridot.faceted at gmail.com>
>> > wrote:
>> >>
>> >> 2009/11/8 David Goldsmith <d.l.goldsmith at gmail.com>:
>> >> > On Sat, Nov 7, 2009 at 11:59 PM, Anne Archibald
>> >> > <peridot.faceted at gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> 2009/11/7 David Goldsmith <d.l.goldsmith at gmail.com>:
>> >> >> > So in essence, at least as it presently functions, the shape of
>> >> >> > 'a'
>> >> >> > *defines* what the individual choices are within 'choices`, and if
>> >> >> > 'choices'
>> >> >> > can't be parsed into an integer number of such individual choices,
>> >> >> > that's
>> >> >> > when an exception is raised?
>> >> >>
>> >> >> Um, I don't think so.
>> >> >>
>> >> >> Think of it this way: you provide np.choose with a selector array,
>> >> >> a,
>> >> >> and a list (not array!) [c0, c1, ..., cM] of choices. You construct
>> >> >> an
>> >> >> output array, say r, the same shape as a (no matter how many
>> >> >> dimensions it has).
>> >> >
>> >> > Except that I haven't yet seen a working example with 'a' greater
>> >> > than
>> >> > 1-D,
>> >> > Josef's last two examples notwithstanding; or is that what you're
>> >> > saying
>> >> > is
>> >> > the bug.
>> >>
>> >> There's nothing magic about A being one-dimensional.
>> >>
>> >> C = np.random.randn(2,3,5)
>> >> A = (C>-1).astype(int) + (C>0).astype(int) + (C>1).astype(int)
>> >>
>> >> R = np.choose(A, (-1, -C, C, 1))
>> >
>> > OK, now I get it: np.choose(A[0,:,:], (-1,-C,C,-1)) and
>> > np.choose(A[0,:,0].reshape((3,1)), (-1,-C,C,1)), e.g., also work, but
>> > np.choose(A[0,:,0], (-1,-C,C,-1)) doesn't - what's necessary for
>> > choose's
>> > arguments is that both can be broadcast to a common shape (as you state
>> > below), but choose won't reshape the arguments for you to make this
>> > possible, you have to do so yourself first, if necessary. That does
>> > appear
>> > to be what's happening now; but do we want choose to be smarter than
>> > that
>> > (e.g., for np.choose(A[0,:,0], (-1,-C,C,-1)) to work, so that the user
>> > doesn't need to include the .reshape((3,1)))?
>>
>> No, I don't think we want to be that smart.
>>
>> If standard broadcasting rules apply, as I think they do, then I wouldn't
>> want
>> any special newaxis or reshapes done automatically. It will be confusing,
>> the function wouldn't know what to do if there are, e.g., as many rows as
>> columns, and this looks like a big source of errors.
>> Standard broadcasting is pretty nice (once I got the hang of it), and
>> adding
>> a lot of np.newaxis (or some reshapes) to the code is only a small price
>> to pay.
>>
>> Josef
>>
>>
>>
>> >
>> > DG
>> >
>> >>
>> >> Requv = np.minimum(np.abs(C),1)
>> >>
>> >> or:
>> >>
>> >> def wedge(*functions):
>> >> """Return a function whose value is the minimum of those of
>> >> functions"""
>> >> def wedgef(X):
>> >> fXs = [f(X) for f in functions]
>> >> A = np.argmin(fXs, axis=0)
>> >> return np.choose(A,fXs)
>> >> return wedgef
>> >>
>> >> so e.g. np.abs is -wedge(lambda X: X, lambda X: -X)
>> >>
>> >> This works no matter what shape of X the user supplies - so a wedged
>> >> function can be somewhat ufunclike - by making A the same shape.
>> >>
>> >> >> The (i0, i1, ..., iN) element of the output array
>> >> >> is obtained by looking at the (i0, i1, ..., iN) element of a, which
>> >> >> should be an integer no larger than M; say j. Then r[i0, i1, ...,
>> >> >> iN]
>> >> >> = cj[i0, i1, ..., iN]. That is, each element of the selector array
>> >> >> determines which of the choice arrays to pull the corresponding
>> >> >> element from.
>> >> >
>> >> > That's pretty clear (thanks for doing my work for me). ;-), Yet, see
>> >> > above.
>> >> >
>> >> >> For example, suppose that you are processing an array C, and have
>> >> >> constructed a selector array A the same shape as C in which a value
>> >> >> is
>> >> >> 0, 1, or 2 depending on whether the C value is too small, okay, or
>> >> >> too
>> >> >> big respectively. Then you might do something like:
>> >> >>
>> >> >> C = np.choose(A, [-inf, C, inf])
>> >> >>
>> >> >> This is something you might want to do no matter what shape A and C
>> >> >> have. It's important not to require that the choices be an array of
>> >> >> choices, because they often have quite different shapes (here, two
>> >> >> are
>> >> >> scalars) and it would be wasteful to broadcast them up to the same
>> >> >> shape as C, just to stack them.
>> >> >
>> >> > OK, that's a pretty generic use-case, thanks; let me see if I
>> >> > understand
>> >> > it
>> >> > correctly: A is some how created independently with a 0 everywhere C
>> >> > is
>> >> > too
>> >> > small, a 1 everywhere C is OK, and a 2 everywhere C is too big; then
>> >> > np.choose(A, [-inf, C, inf]) creates an array that is -inf everywhere
>> >> > C
>> >> > is
>> >> > too small, inf everywhere C is too large, and C otherwise (and since
>> >> > -inf
>> >> > and inf are scalars, this implies broadcasting of these is taking
>> >> > place).
>> >> > This is what you're asserting *should* be the behavior. So, unless
>> >> > there is
>> >> > disagreement about this (you yourself said the opposite viewpoint
>> >> > might
>> >> > rationally be held) np.choose definitely presently has a bug, namely,
>> >> > the
>> >> > index array can't be of arbitrary shape.
>> >>
>> >> There seems to be some disagreement between versions, but both Josef
>> >> and I find that the index array *can* be arbitrary shape. In numpy
>> >> 1.2.1 I find that all the choose items must be the same shape as it,
>> >> which I think is a bug.
>> >>
>> >> What I suggested might be okay was if the index array was not
>> >> broadcasted, so that the outputs always had exactly the same shape as
>> >> the index array. But upon reflection it's useful to be able to use a
>> >> 1-d array to select rows from a set of matrices, so I now think that
>> >> all of A and the elements of choose should be broadcast to the same
>> >> shape. This seems to be what Josef observes in his version of numpy,
>> >> so maybe there's nothing to do.
>> >>
>> >> Anne
>> >>
>> >> > DG
>> >> >
>> >> >>
>> >> >> Anne
>> >> >> _______________________________________________
>> >> >> NumPy-Discussion mailing list
>> >> >> NumPy-Discussion at scipy.org
>> >> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>> >> >
>> >> >
>> >> > _______________________________________________
>> >> > NumPy-Discussion mailing list
>> >> > NumPy-Discussion at scipy.org
>> >> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
>> >> >
>> >> >
>> >> _______________________________________________
>> >> NumPy-Discussion mailing list
>> >> NumPy-Discussion at scipy.org
>> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>> >
>> >
>> > _______________________________________________
>> > NumPy-Discussion mailing list
>> > NumPy-Discussion at scipy.org
>> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
>> >
>> >
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
More information about the NumPy-Discussion
mailing list