[Numpy-discussion] Grouping & Collapsing multi-dimensional data
Mike Biglan
mike at biglan.org
Tue Mar 20 13:44:58 EDT 2007
I might be using the wrong terminology, but I'm trying to group and
collapse a 2d array where each row has a department object followed by
36 floats, e.g.: [dept1, 3,6,7...]
With SQL or R I know how to collapse a simple 2d data structure like
this. For example, in SQL:
select dept, stddev(field1)... from tbl_x group by dept
What I want to end up with is either a collapsed table grouped on dept,
where each element is some summary statistic, or better yet a table
where each value is a dictionary mapping the name of each summary
statistic to the value of that statistic.
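A minimal sketch of the dictionary-style output described above, using plain Python and numpy. It assumes the department keys are hashable and kept in a separate sequence from the float columns; the function name summarize_by_group and the sample data are hypothetical, not part of any library.

```python
import numpy as np
from collections import defaultdict

def summarize_by_group(keys, values, stats):
    """Group rows of `values` by the matching entry in `keys` and apply
    each function in `stats` column-wise within every group.

    Hypothetical helper: assumes `keys` are hashable and that every
    function in `stats` accepts an `axis` argument.
    """
    values = np.asarray(values, dtype=float)
    rows_by_key = defaultdict(list)
    for i, key in enumerate(keys):
        rows_by_key[key].append(i)  # remember which rows belong to each key
    return {
        key: {name: func(values[rows], axis=0) for name, func in stats.items()}
        for key, rows in rows_by_key.items()
    }

# Made-up sample data: 4 rows, 3 float columns instead of 36.
keys = ["dept1", "dept2", "dept1", "dept2"]
values = np.array([[3.0, 6.0, 7.0],
                   [1.0, 2.0, 3.0],
                   [5.0, 8.0, 9.0],
                   [3.0, 2.0, 5.0]])

# np.std defaults to the population standard deviation (ddof=0), which
# may differ from what a given SQL stddev() computes.
summary = summarize_by_group(keys, values, {"mean": np.mean, "std": np.std})
```

Each entry of summary is then a dictionary such as summary["dept1"]["mean"], holding one value per float column. Because the grouping goes through a dict, no sorting or comparison of the department objects is needed.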
Even just sorting so that I can do this by group has been problematic,
because I get an error when sorting by an object if that object has a
__cmp__ method. I can do an index-based sort using a field in the
object instead, but then I must figure out the groups of rows manually.
There must be an easier way to collapse on a dimension, grouping by one
or more elements and applying arbitrary functions. Can this be done
with numpy? Should I work with scipy?
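The index-based sort mentioned above can be sketched as follows, with the group boundaries found by np.unique rather than by hand. This assumes a sortable field (here an integer id) has already been extracted from each department object; group_slices and the sample data are hypothetical.

```python
import numpy as np

def group_slices(sort_keys):
    """Return the sort order plus the start/end offsets of each group.

    Hypothetical helper: `sort_keys` is a 1-d array of sortable values,
    one per row (e.g. an id taken from each department object).
    """
    sort_keys = np.asarray(sort_keys)
    order = np.argsort(sort_keys, kind="stable")  # index-based sort
    sorted_keys = sort_keys[order]
    # On sorted input, the first occurrence of each key is its group start.
    uniq, starts = np.unique(sorted_keys, return_index=True)
    ends = np.append(starts[1:], len(sorted_keys))
    return order, uniq, starts, ends

# Made-up sample data: 5 rows, 2 float columns.
dept_ids = np.array([20, 10, 20, 10, 30])
values = np.array([[1.0, 2.0],
                   [3.0, 4.0],
                   [5.0, 6.0],
                   [7.0, 8.0],
                   [9.0, 10.0]])

order, uniq, starts, ends = group_slices(dept_ids)
sorted_values = values[order]

# Apply an arbitrary function (here the column-wise mean) to each group.
group_means = np.array([sorted_values[s:e].mean(axis=0)
                        for s, e in zip(starts, ends)])
```

Here row i of group_means corresponds to uniq[i]. Any other function can be substituted for the mean inside the loop.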
thanks -
mike biglan
PS: My example here is 2d, but I'm hoping that this functionality would
work for any number of dimensions.