[Numpy-discussion] Advice on converting iterator into array efficiently
Alan Jackson
alan at ajackson.org
Thu Aug 28 20:57:09 EDT 2008
Looking for advice on a good way to handle this problem.
I'm dealing with large tables (Gigabyte large). I would like to
efficiently subset values from one column based on the values in
another column, and get arrays out of the operation. For example,
say I have 2 columns, "energy" and "collection". Collection is
basically an index that flags values that go together, so all the
energy values with a collection value of 18 belong together. I'd
like to be able to set up an iterator on collection that would
hand me an array of energy on each iteration :
if table is all my data, then something like
    for c in table['collection'] :
        e = c['energy']
        ... do array operations on e
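With plain NumPy arrays, that grouping can be sketched with boolean masks. This is a minimal sketch using made-up sample data; `collection` and `energy` below are hypothetical stand-ins for the two table columns:

```python
import numpy as np

# Hypothetical sample data standing in for the two table columns.
collection = np.array([18, 18, 7, 18, 7])
energy = np.array([1.0, 2.5, 0.3, 4.0, 0.9])

# For each distinct collection value, boolean indexing pulls out the
# matching energy values as a proper ndarray.
groups = {c: energy[collection == c] for c in np.unique(collection)}

for c, e in groups.items():
    # ... do array operations on e, e.g. e.mean()
    pass
```

This reads the whole column into memory, so for gigabyte tables it only works if the two columns themselves fit in RAM.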
I've been playing with pytables, and they help, but I can't quite
seem to get there. I can get an iterator for energy within a collection,
but I can't figure out an efficient way to get an array out.
What I have so far is
for c in np.unique(table.col('collection')) :
    rows = table.where('collection == c')
    for row in rows :
        print c, ' : ', row['energy']
but I really want to convert rows['energy'] to an array.
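One general way to turn a row iterator into an array without building an intermediate list is `np.fromiter`. A sketch, where `rows` is simulated with a plain generator of dicts standing in for the pytables Row objects:

```python
import numpy as np

# Stand-in for the row iterator returned by table.where(...);
# a generator of dicts plays the role of the Row objects.
rows = ({'energy': v} for v in [1.0, 2.5, 4.0])

# np.fromiter consumes the iterator and builds a 1-D array directly,
# without an intermediate Python list.
e = np.fromiter((row['energy'] for row in rows), dtype=np.float64)
```

The same `np.fromiter(..., dtype=...)` call should work on the iterator that pytables returns, since it only needs item access by field name.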
I've thought about building a nasty set of pointers and whatnot -
I did it once in perl - but I'm hoping to avoid that.
--
-----------------------------------------------------------------------
| Alan K. Jackson | To see a World in a Grain of Sand |
| alan at ajackson.org | And a Heaven in a Wild Flower, |
| www.ajackson.org | Hold Infinity in the palm of your hand |
| Houston, Texas | And Eternity in an hour. - Blake |
-----------------------------------------------------------------------