[Numpy-discussion] difficulty with numpy.where

Gökhan Sever gokhansever at gmail.com
Thu Oct 1 13:48:33 EDT 2009


On Thu, Oct 1, 2009 at 12:10 PM, Zachary Pincus <zachary.pincus at yale.edu> wrote:

> Hello,
>
> a < b < c (or any equivalent expression) is Python syntactic sugar for
> (a < b) and (b < c).
>
> Now, for numpy arrays, a < b gives an array with boolean True or False
> where the elements of a are less than those of b. So this gives us two
> arrays that Python now wants to "and" together. To do this, Python
> tries to convert the array "a < b" to a single True or False value,
> and the array "b < c" to a single True or False value, which it then
> knows how to "and" together. Except that "a < b" could contain many
> True or False elements, so how to convert them to a single one?
> There's no obvious way to guess -- typically, one uses "any" or "all"
> to convert a boolean array to a single True or False value, depending,
> obviously, on what one needs.
>
> So this explains the error you see, but has nothing to do with the
> results you desire... you need to and-together two boolean arrays
> *element-wise* -- which is something Python doesn't know how to do
> with the builtin "and" operator (which cannot be overridden). To do
> this, you need to use the bitwise logic operators:
> (a < b) & (b < c).
>
> So:
>
> def sin_half_period(x): return where((0.0 <= x) & (x <= pi), sin(x), 0.0)
>
> Zach
>
>
>
Very well expressed Zach.
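
For the archives, here is a minimal sketch of the difference, using the
function and the linspace call from the original post. The np. prefix and
the chained_ok flag are my own additions for illustration; the thread
itself uses the bare names where, sin and pi.

```python
import numpy as np

def sin_half_period(x):
    # Element-wise AND of two boolean masks. The parentheses are required
    # because & binds more tightly than the comparison operators.
    return np.where((0.0 <= x) & (x <= np.pi), np.sin(x), 0.0)

z = np.linspace(0, 2 * np.pi, 9)

# The chained form asks Python for a single truth value of a whole array,
# which numpy refuses to guess, so it raises ValueError.
try:
    np.where(0.0 <= z <= np.pi, np.sin(z), 0.0)
    chained_ok = True
except ValueError:
    chained_ok = False

result = sin_half_period(z)
print(chained_ok)
print(result)
```

np.logical_and(0.0 <= x, x <= np.pi) is an equivalent spelling if the
bare & looks too terse.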

The reason I wanted to use this kind of conditional indexing is as
follows: I have a dataset with a main time variable and various other
measurements, including some atmospheric data (cloud microphysics in
particular). One instance of this dataset has roughly 8000 rows for
each variable in the file. We wanted to extract the cloud droplet
concentration data for a certain time window only (the period when
measurements were made at cloud-base conditions). We know this time
window a priori, so the only thing left to do is to index our cloud
droplet concentration conditionally on that window. In more technical terms:

time = a numpy array running from 40000 to 48000
conc = a numpy array of 8000 elements with values between 300 and 500

Say that the cloud base occurs between 45000 and 45400, and I am only
interested in analysing that portion of the data, to make a boxplot or,
getting fancier, violin plots out of this section :) So I do:

conc[(time>45000) & (time<45400)]

Voila!
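
A self-contained sketch of the same indexing, in case anyone wants to try
it. The arrays are synthetic stand-ins, not my real data: I assume one
time stamp per row, and the random concentrations, the seed and the names
mask and cloud_base_conc are made up for illustration.

```python
import numpy as np

# Synthetic stand-ins for the real dataset (all values are made up).
time = np.arange(40000, 48000)                  # 8000 time stamps
rng = np.random.default_rng(0)
conc = rng.uniform(300, 500, size=time.size)    # droplet concentration

# Boolean mask that is True only inside the cloud-base time window.
mask = (time > 45000) & (time < 45400)

# Indexing with the mask keeps just the samples from that window.
cloud_base_conc = conc[mask]
print(cloud_base_conc.size)
```

The same mask can be reused on any other variable that shares the time
axis, e.g. time[mask] gives the matching time stamps.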





>
> On Oct 1, 2009, at 12:55 PM, Dr. Phillip M. Feldman wrote:
>
> >
> > I've defined the following one-line function that uses numpy.where:
> >
> > def sin_half_period(x): return where(0.0 <= x <= pi, sin(x), 0.0)
> >
> > When I try to use this function, I get an error message:
> >
> > In [4]: z=linspace(0,2*pi,9)
> > In [5]: sin_half_period(z)
> >
> > ---------------------------------------------------------------------------
> > ValueError                                Traceback (most recent call last)
> >
> > The truth value of an array with more than one element is ambiguous.
> > Use a.any() or a.all()
> >
> > Any suggestions will be appreciated.
> > --
> > View this message in context:
> > http://www.nabble.com/difficulty-with-numpy.where-tp25702676p25702676.html
> > Sent from the Numpy-discussion mailing list archive at Nabble.com.
> >
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
>



-- 
Gökhan