[AstroPy] astropy.table.Table groups: aggregate over & combine multiple columns?

Evert Rol evert.rol at gmail.com
Tue Jan 19 23:52:24 EST 2016

Is there a way in an astropy table to run the TableGroups aggregate function on multiple columns at once?

In this specific case, I'd like to group by names in one column, and then average the second column weighted by values in the third column. 
An example would be:

  from astropy.table import Table

  def average(col):
      # Manipulate multiple columns at once?
      return col.mean()

  t = Table([['a', 'a', 'a', 'b', 'b', 'c'],
             [1, 2, 3, 4, 5, 6],
             [2, 2, 1, 2, 1, 1]],
            names=('name', 'value', 'weight'))
  group = t.group_by('name')
  result = group.groups.aggregate(average)

which gives

  name value     weight
  ---- ----- -------------
     a   2.0 1.66666666667
     b   4.5           1.5
     c   6.0           1.0

which is not what I want.

In Pandas, this can be done with apply() on a groupby object, since that passes the relevant subsection the dataframe as input to the function.
So I can write:

  def average_pd(df):
      weight = df['weight']
      total = weight.sum()
      df['value'] *= weight / total
      df['value'] = df['value'].sum()
      df['weight'] = total  # for info; not necessary
      return df.iloc[0]  # ignore other rows: they are the same anyway

  df = t.to_pandas()
  result = df.groupby('name')[['value', 'weight']].apply(average_pd)

which gives:

         value  weight
a     1.800000       5
b     4.333333       3
c     6.000000       1

and 'value' consists of weighted averages.

(code also on https://gist.github.com/evertrol/12955a5d98edf055a2f4 )

Perhaps I overlooked some documentation, but I can't find if this can be done in astropy.table. Or do I just need to approach this differently?
Alternatively, should I convert & stick to Pandas for this type of functionality?


More information about the AstroPy mailing list