[SciPy-User] Select rows according to cell value

Tue Nov 13 16:19:33 EST 2012

Oh Wes, you showed me the light :)

On Tue, Nov 13, 2012 at 9:56 PM, Wes McKinney <wesmckinn at gmail.com> wrote:

> On Tue, Nov 13, 2012 at 12:53 PM, Scott Lasley <slasley at space.umd.edu> wrote:
> > I don't know if this is any less ugly, but you could use
> >
> > magic_indices = lambda a: np.any([data[:,0] == x for x in a],axis=0)
> > d = data[magic_indices(altitudes)]
> >
> > Or, as Oleksandr pointed out, if you're concerned about comparing floats,
> > you might use
> >
> > tol = 0.01
> > magic_indices = lambda a: np.any([np.abs(data[:,0] - x) < tol for x in a],axis=0)
> > d = data[magic_indices(altitudes)]
> >
> > or just
> >
> > tol = 0.01
> > d = data[np.any([np.abs(data[:,0] - x) < tol for x in altitudes],axis=0)]
> >
> > Scott
> >
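For exact matches, the mask-based selections above can also be written with np.isin, which tests each element of the first column for membership in the requested values. This is only a minimal sketch: the array is typed in by hand from the table in the original question instead of being read with np.loadtxt, and it assumes a NumPy version that provides np.isin.

```python
import numpy as np

# Rows copied by hand from the table in the original question
# (columns: alt, temp, press, dens); the thread loads them with np.loadtxt.
data = np.array([
    [10.0, 223.3, 26500, 0.414],
    [10.5, 220.0, 24540, 0.389],
    [11.0, 216.8, 22700, 0.365],
    [11.5, 216.7, 20985, 0.337],
    [12.0, 216.7, 19399, 0.312],
    [12.5, 216.7, 17934, 0.288],
    [13.0, 216.7, 16579, 0.267],
    [13.5, 216.7, 15328, 0.246],
    [14.0, 216.7, 14170, 0.228],
])
altitudes = [10.5, 11.5, 14.0]

# Boolean mask: True where the altitude column equals one of the requested values.
mask = np.isin(data[:, 0], altitudes)

# Boolean indexing keeps the matching rows, in the order they appear in `data`.
d = data[mask]
print(d)
```

Note that the selected rows follow the order of the data, not the order of the altitudes list, and that this is an exact floating-point comparison, so the tolerance-based variants remain the safer choice for computed values.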
> > On Nov 13, 2012, at 12:41 PM, Juan Luis Cano Rodríguez <juanlu001 at gmail.com> wrote:
> >
> >> Actually, I arrived at a couple of one-liners:
> >>
> >>     d = np.take(data, [np.argwhere(data[:, 0] == a).flatten()[0] for a in altitudes], axis=0)
> >>
> >> or
> >>
> >>     d = np.array([data[data[:, 0] == a][0] for a in altitudes])
> >>
> >> I find them sort of ugly, but maybe it's the only way. In the same way that I'd do
> >>
> >>     data[[1, 3, 8]]
> >>
> >> to retrieve the rows at indices 1, 3 and 8, I'd like to do
> >>
> >>     data[np.magic_indices(altitudes)]
> >>
> >>
> >> On Tue, Nov 13, 2012 at 6:02 PM, Oleksandr Huziy <guziy.sasha at gmail.com> wrote:
> >> Yep, I admit it looks much easier with pandas:
> >>
> >> import pandas
> >> df = pandas.read_csv("tmp/data.txt", sep="\\s")
> >> df = df.dropna(axis = 1)
> >>
> >> #df.index = df["alt"]
> >> selection = df.select(lambda i: df.ix[i, "alt"] in altitudes)
> >> print selection
> >>
> >>
> >> cheers
> >> --
> >> Oleksandr (Sasha) Huziy
> >>
> >>
> >>
> >> 2012/11/13 Oleksandr Huziy <guziy.sasha at gmail.com>
> >> I am not sure if this way is easier than yours, but here is what I would do:
> >>
> >> tol = 0.01
> >> all_alts = data[:,0]
> >> print all_alts
> >> all_alts_temp = np.vstack([all_alts]*len(altitudes))
> >> print all_alts_temp
> >>
> >> sel_alts_temp = np.vstack([altitudes]*len(all_alts)).transpose()
> >> print sel_alts_temp
> >> sel_pattern = np.any( np.abs(all_alts_temp - sel_alts_temp) < tol, axis = 0)
> >> print sel_pattern
> >> print data
> >> print data[sel_pattern,:]
> >>
> >>
> >> Cheers
> >> --
> >> Oleksandr (Sasha) Huziy
> >>
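The stacked-array comparison above can be sketched without np.vstack by letting NumPy broadcasting build the same two-dimensional table of differences. The helper name rows_near and its parameters are hypothetical, introduced only for this illustration, and the sample array is typed in from the table in the original question.

```python
import numpy as np

def rows_near(data, targets, tol=0.01, column=0):
    """Return the rows of `data` whose `column` value is within `tol` of any target.

    Hypothetical helper for illustration; not part of NumPy.
    """
    values = data[:, column]
    targets = np.asarray(targets, dtype=float)
    # Broadcasting (1, n_rows) against (n_targets, 1) gives an
    # (n_targets, n_rows) array: one row of differences per target.
    diff = np.abs(values[np.newaxis, :] - targets[:, np.newaxis])
    # A data row is selected if it is close to at least one target.
    selected = (diff < tol).any(axis=0)
    return data[selected]

# Rows copied by hand from the table in the original question
# (columns: alt, temp, press, dens).
data = np.array([
    [10.0, 223.3, 26500, 0.414],
    [10.5, 220.0, 24540, 0.389],
    [11.0, 216.8, 22700, 0.365],
    [11.5, 216.7, 20985, 0.337],
    [12.0, 216.7, 19399, 0.312],
    [12.5, 216.7, 17934, 0.288],
    [13.0, 216.7, 16579, 0.267],
    [13.5, 216.7, 15328, 0.246],
    [14.0, 216.7, 14170, 0.228],
])

# The first target is deliberately slightly off to exercise the tolerance.
near = rows_near(data, [10.5001, 11.5, 14.0])
print(near)
```

The intermediate array has len(targets) * len(data) elements, the same as the stacked version, so this is a readability change and not a memory saving.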
> >>
> >>
> >>
> >> 2012/11/13 Andreas Hilboll <lists at hilboll.de>
> >> On Tue 13 Nov 2012 17:07:19 CET, Juan Luis Cano Rodríguez wrote:
> >> > I am loading some tabular data of the form
> >> >
> >> >   alt    temp    press    dens
> >> >   10.0    223.3    26500    0.414
> >> >   10.5    220.0    24540    0.389
> >> >   11.0    216.8    22700    0.365
> >> >   11.5    216.7    20985    0.337
> >> >   12.0    216.7    19399    0.312
> >> >   12.5    216.7    17934    0.288
> >> >   13.0    216.7    16579    0.267
> >> >   13.5    216.7    15328    0.246
> >> >   14.0    216.7    14170    0.228
> >> >
> >> > into an ordinary NumPy array using np.loadtxt. I would like, though, to
> >> > select the rows according to the altitude level, that is:
> >> >
> >> >     >>> data = np.loadtxt('data.txt', skiprows=1)
> >> >     >>> altitudes = [10.5, 11.5, 14.0]
> >> >     >>> d = ...  # some simple syntax involving data and altitudes
> >> >     >>> d
> >> >     10.5    220.0    24540    0.389
> >> >     11.5    216.7    20985    0.337
> >> >     14.0    216.7    14170    0.228
> >> >
> >> > I have tried a cumbersome expression which traverses all the array,
> >> > then uses a list comprehension, converts to an array... but I'm sure
> >> > there must be a simpler way. I've also looked at argwhere. Or maybe I
> >> > should use pandas?
> >> >
> >> > Thank you in advance.
> >> >
> >> >
> >> > _______________________________________________
> >> > SciPy-User mailing list
> >> > SciPy-User at scipy.org
> >> > http://mail.scipy.org/mailman/listinfo/scipy-user
> >>
> >> +1 for using pandas
>
> Use isin:
>
> In [5]: df[df.alt.isin(altitudes)]
> Out[5]:
>     alt   temp  press   dens
> 1  10.5  220.0  24540  0.389
> 3  11.5  216.7  20985  0.337
> 8  14.0  216.7  14170  0.228
>
> - Wes
>
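For anyone reproducing the isin answer quoted above, here is a minimal, self-contained sketch. It assumes a reasonably recent pandas and inlines the table from the original question as a string instead of reading tmp/data.txt; the regular-expression separator r"\s+" is a choice made for this sketch to split on runs of whitespace. DataFrame.select and the .ix indexer used earlier in the thread have since been removed from pandas, so boolean indexing with isin is also the form that still runs today.

```python
import io

import pandas as pd

# The table from the original question, inlined so the sketch runs on its own.
raw = """alt    temp    press    dens
10.0    223.3    26500    0.414
10.5    220.0    24540    0.389
11.0    216.8    22700    0.365
11.5    216.7    20985    0.337
12.0    216.7    19399    0.312
12.5    216.7    17934    0.288
13.0    216.7    16579    0.267
13.5    216.7    15328    0.246
14.0    216.7    14170    0.228
"""

# Split columns on any run of whitespace; the first line supplies the header.
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")

altitudes = [10.5, 11.5, 14.0]

# isin returns a boolean Series that is True where `alt` is one of the
# requested values; indexing with it keeps only those rows.
selection = df[df["alt"].isin(altitudes)]
print(selection)
```

As with the exact NumPy comparisons in the thread, isin tests for equality, so values that come out of a computation may need rounding or a tolerance-based mask first.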