[AstroPy] astropy.table.Table groups: aggregate over & combine multiple columns?
Evert Rol
evert.rol at gmail.com
Tue Jan 19 23:52:24 EST 2016
Is there a way in an astropy table to run the TableGroups aggregate function on multiple columns at once?
In this specific case, I'd like to group by names in one column, and then average the second column weighted by values in the third column.
An example would be:
from astropy.table import Table
def average(col):
# Manipulate multiple columns at once?
return col.mean()
t = Table([['a', 'a', 'a', 'b', 'b', 'c'],
[1, 2, 3, 4, 5, 6],
[2, 2, 1, 2, 1, 1]],
names=('name', 'value', 'weight'))
group = t.group_by('name')
result = group.groups.aggregate(average)
print(result)
which gives
name value weight
---- ----- -------------
a 2.0 1.66666666667
b 4.5 1.5
c 6.0 1.0
which is not what I want.
In Pandas, this can be done with apply() on a groupby object, since that passes the relevant subsection the dataframe as input to the function.
So I can write:
def average_pd(df):
weight = df['weight']
total = weight.sum()
df['value'] *= weight / total
df['value'] = df['value'].sum()
df['weight'] = total # for info; not necessary
return df.iloc[0] # ignore other rows: they are the same anyway
df = t.to_pandas()
result = df.groupby('name')[['value', 'weight']].apply(average_pd)
print(result)
which gives:
value weight
name
a 1.800000 5
b 4.333333 3
c 6.000000 1
and 'value' consists of weighted averages.
(code also on https://gist.github.com/evertrol/12955a5d98edf055a2f4 )
Perhaps I overlooked some documentation, but I can't find if this can be done in astropy.table. Or do I just need to approach this differently?
Alternatively, should I convert & stick to Pandas for this type of functionality?
Evert
More information about the AstroPy
mailing list