Grouping & Collapsing multi-dimensional data
I might be using the wrong terminology, but I'm trying to take a 2-D array where each row has a department object followed by 36 floats, e.g. [dept1, 3, 6, 7, ...].

With SQL or R I know how to collapse a simple 2-D data structure like this. For example, in SQL:

    select dept, stddev(field1)... from tbl_x group by dept

I want to end up with either a collapsed table grouped on dept, where each element is some summary statistic, or better yet, one where each element is a dictionary mapping the name of a summary statistic to its value.

Just sorting and grouping has been problematic: sorting by an object gives me an error if that object has a __cmp__ method. I can do an index-based sort on a field of the object, but then I must work out the groups of rows manually.

There must be easier ways to collapse on a dimension, grouping by one or more elements and applying arbitrary functions. Can this be done with numpy? Should I work with scipy?

Thanks,
Mike Biglan

PS: My example here is 2-D, but I'm hoping this functionality would work for any number of dimensions.
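For illustration, here is a minimal sketch of the kind of collapse described above, using only NumPy. It assumes each department object can be reduced to a plain sortable key (here a name string), which sidesteps comparing the objects themselves. The helper name `summarize_by_group` and the sample data are hypothetical, not part of any library.

```python
import numpy as np

def summarize_by_group(keys, values, stats):
    """Group rows of `values` by `keys` and apply each function in `stats` along axis 0.

    Hypothetical helper: returns {key: {stat_name: array_of_results}}.
    """
    keys = np.asarray(keys)
    values = np.asarray(values, dtype=float)
    # np.unique sorts the distinct keys and, with return_inverse=True,
    # also gives each row's index into that sorted list of keys.
    labels, inverse = np.unique(keys, return_inverse=True)
    result = {}
    for i, label in enumerate(labels):
        rows = values[inverse == i]  # all rows belonging to this group
        result[label.item()] = {name: fn(rows, axis=0) for name, fn in stats.items()}
    return result

# Sample data (made up): one key per row, three floats per row instead of 36.
keys = ["dept1", "dept2", "dept1", "dept2"]
values = [[3.0, 6.0, 7.0], [1.0, 2.0, 3.0], [5.0, 8.0, 9.0], [3.0, 4.0, 5.0]]
summary = summarize_by_group(keys, values, {"mean": np.mean, "std": np.std})
```

Because the boolean mask selects along the first axis and each statistic is applied with `axis=0`, the same sketch should carry over to `values` with more than two dimensions, as long as the grouping key varies along the first axis.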