[SciPy-User] Select rows according to cell value
Wes McKinney
wesmckinn at gmail.com
Tue Nov 13 15:56:55 EST 2012
On Tue, Nov 13, 2012 at 12:53 PM, Scott Lasley <slasley at space.umd.edu> wrote:
> I don't know if this is any less ugly, but you could use
>
> magic_indices = lambda a: np.any([data[:,0] == x for x in a],axis=0)
> d = data[magic_indices(altitudes)]
>
> Or, as Oleksandr pointed out, if you're concerned about comparing floats you might use
>
> tol = 0.01
> magic_indices = lambda a: np.any([np.abs(data[:,0] - x) < tol for x in a],axis=0)
> d = data[magic_indices(altitudes)]
>
> or just
>
> tol = 0.01
> d = data[np.any([np.abs(data[:,0] - x) < tol for x in altitudes],axis=0)]
>
> Scott
>
> On Nov 13, 2012, at 12:41 PM, Juan Luis Cano Rodríguez <juanlu001 at gmail.com> wrote:
>
>> Actually I arrived at a couple of one-liners:
>>
>> d = np.take(data, [np.argwhere(data[:, 0] == a).flatten()[0] for a in altitudes], axis=0)
>>
>> or
>>
>> d = np.array([data[data[:, 0] == a][0] for a in altitudes])
>>
>> I find them sort of ugly but maybe it's the only way. The same way I'd do
>>
>> data[[1, 3, 8]]
>>
>> to retrieve the rows at indices 1, 3 and 8, I'd like to do
>>
>> data[np.magic_indices(altitudes)]
>>
>>
>> On Tue, Nov 13, 2012 at 6:02 PM, Oleksandr Huziy <guziy.sasha at gmail.com> wrote:
>> Yep, I admit that with pandas it appears much easier:
>>
>> import pandas
>> df = pandas.read_csv("tmp/data.txt", sep="\\s")
>> df = df.dropna(axis = 1)
>>
>> #df.index = df["alt"]
>> selection = df.select(lambda i: df.ix[i, "alt"] in altitudes)
>> print selection
>>
>>
>> cheers
>> --
>> Oleksandr (Sasha) Huziy
>>
>>
>>
>> 2012/11/13 Oleksandr Huziy <guziy.sasha at gmail.com>
>> I am not sure if this way is easier than yours, but here is what I would do
>>
>> tol = 0.01
>> all_alts = data[:,0]
>> print all_alts
>> all_alts_temp = np.vstack([all_alts]*len(altitudes))
>> print all_alts_temp
>>
>> sel_alts_temp = np.vstack([altitudes]*len(all_alts)).transpose()
>> print sel_alts_temp
>> sel_pattern = np.any( np.abs(all_alts_temp - sel_alts_temp) < tol, axis = 0)
>> print sel_pattern
>> print data
>> print data[sel_pattern,:]
>>
>>
>> Cheers
>> --
>> Oleksandr (Sasha) Huziy
>>
>>
>>
>>
>> 2012/11/13 Andreas Hilboll <lists at hilboll.de>
>> On Tue 13 Nov 2012 17:07:19 CET, Juan Luis Cano Rodríguez wrote:
>> > I am loading some tabular data of the form
>> >
>> > alt temp press dens
>> > 10.0 223.3 26500 0.414
>> > 10.5 220.0 24540 0.389
>> > 11.0 216.8 22700 0.365
>> > 11.5 216.7 20985 0.337
>> > 12.0 216.7 19399 0.312
>> > 12.5 216.7 17934 0.288
>> > 13.0 216.7 16579 0.267
>> > 13.5 216.7 15328 0.246
>> > 14.0 216.7 14170 0.228
>> >
>> > into an ordinary NumPy array using np.loadtxt. I would, however, like to
>> > select the rows according to the altitude level, that is:
>> >
>> > >>> data = np.loadtxt('data.txt', skiprows=1)
>> > >>> altitudes = [10.5, 11.5, 14.0]
>> > >>> d = ... # some simple syntax involving data and altitudes
>> > >>> d
>> > 10.5 220.0 24540 0.389
>> > 11.5 216.7 20985 0.337
>> > 14.0 216.7 14170 0.228
>> >
>> > I have tried a cumbersome expression which traverses all the array,
>> > then uses a list comprehension, converts to an array... but I'm sure
>> > there must be a simpler way. I've also looked at argwhere. Or maybe I
>> > should use pandas?
>> >
>> > Thank you in advance.
>> >
>> >
>> > _______________________________________________
>> > SciPy-User mailing list
>> > SciPy-User at scipy.org
>> > http://mail.scipy.org/mailman/listinfo/scipy-user
>>
>> +1 for using pandas
Use isin:
In [5]: df[df.alt.isin(altitudes)]
Out[5]:
    alt   temp  press   dens
1  10.5  220.0  24540  0.389
3  11.5  216.7  20985  0.337
8  14.0  216.7  14170  0.228
- Wes
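
For anyone who wants to reproduce the thread's result end to end, the following is a minimal sketch and not part of the original message. It assumes a reasonably recent pandas and NumPy, embeds the sample table from the original question as a string instead of reading data.txt, and shows np.isin and a broadcast tolerance test as NumPy-only counterparts to isin. The names csv_text, mask_exact and mask_tol are illustrative only.

```python
import io

import numpy as np
import pandas as pd

# Sample table from the original question, embedded so the sketch is
# self-contained (the thread reads it from a file instead).
csv_text = """alt temp press dens
10.0 223.3 26500 0.414
10.5 220.0 24540 0.389
11.0 216.8 22700 0.365
11.5 216.7 20985 0.337
12.0 216.7 19399 0.312
12.5 216.7 17934 0.288
13.0 216.7 16579 0.267
13.5 216.7 15328 0.246
14.0 216.7 14170 0.228
"""

altitudes = [10.5, 11.5, 14.0]

# pandas: parse whitespace-separated columns, then select with a boolean mask.
df = pd.read_csv(io.StringIO(csv_text), sep=r"\s+")
selected_df = df[df["alt"].isin(altitudes)]

# NumPy only: the same exact-match selection on the first column.
data = df.to_numpy()
mask_exact = np.isin(data[:, 0], altitudes)
selected_exact = data[mask_exact]

# Tolerance-based variant for floats that may not compare exactly:
# broadcast an (n_rows, 1) column against the (n_altitudes,) targets.
tol = 0.01  # same tolerance value as suggested earlier in the thread
mask_tol = (np.abs(data[:, [0]] - np.asarray(altitudes)) < tol).any(axis=1)
selected_tol = data[mask_tol]

print(selected_df)
```

Note that all three selections return rows in the order they appear in the data, not in the order of altitudes, and they keep every matching row rather than only the first match per altitude.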