[AstroPy] astropy.table.Table groups: aggregate over & combine multiple columns?
evert.rol at gmail.com
Tue Jan 19 23:52:24 EST 2016
Is there a way in an astropy table to run the TableGroups aggregate function on multiple columns at once?
In this specific case, I'd like to group by names in one column, and then average the second column weighted by values in the third column.
An example would be:
from astropy.table import Table
# Manipulate multiple columns at once?
t = Table([['a', 'a', 'a', 'b', 'b', 'c'],
[1, 2, 3, 4, 5, 6],
[2, 2, 1, 2, 1, 1]],
names=('name', 'value', 'weight'))
group = t.group_by('name')
result = group.groups.aggregate(average)
name value weight
---- ----- -------------
a 2.0 1.66666666667
b 4.5 1.5
c 6.0 1.0
which is not what I want.
In Pandas, this can be done with apply() on a groupby object, since that passes the relevant subsection the dataframe as input to the function.
So I can write:
weight = df['weight']
total = weight.sum()
df['value'] *= weight / total
df['value'] = df['value'].sum()
df['weight'] = total # for info; not necessary
return df.iloc # ignore other rows: they are the same anyway
df = t.to_pandas()
result = df.groupby('name')[['value', 'weight']].apply(average_pd)
a 1.800000 5
b 4.333333 3
c 6.000000 1
and 'value' consists of weighted averages.
(code also on https://gist.github.com/evertrol/12955a5d98edf055a2f4 )
Perhaps I overlooked some documentation, but I can't find if this can be done in astropy.table. Or do I just need to approach this differently?
Alternatively, should I convert & stick to Pandas for this type of functionality?
More information about the AstroPy