[SciPy-User] calculating the mean for each factor (like tapply in R)

Oleksandr Huziy guziy.sasha at gmail.com
Wed Aug 1 09:35:53 EDT 2012


Hi,

It is pretty much the same as looping, but you could do the following:

In [1]: import numpy as np

In [2]: x = np.array([10,13,12,3,4,6,33,44,55])

In [3]: exps = np.array([1,1,1,2,2,2,3,3,3])

In [4]: z = [np.mean(x[exps == i]) for i in np.unique(exps)]
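
The same idea covers the variance part of the question. Below is a minimal sketch, assuming x and exps as defined above. Note that R's var() divides by n - 1, so ddof=1 is passed to get comparable numbers. The second half is one possible loop-free way to get the means with np.unique and np.bincount; the names levels, inverse, counts, sums and fast_means are only illustrative.

```python
import numpy as np

x = np.array([10, 13, 12, 3, 4, 6, 33, 44, 55])
exps = np.array([1, 1, 1, 2, 2, 2, 3, 3, 3])

# One boolean mask per factor level, as in the list comprehension above.
levels = np.unique(exps)
means = np.array([x[exps == i].mean() for i in levels])

# ddof=1 gives the sample variance (n - 1 denominator), like R's var().
variances = np.array([x[exps == i].var(ddof=1) for i in levels])

# Loop-free alternative for the means: map each label to an integer
# index 0..k-1, then sum and count per index with bincount.
levels, inverse = np.unique(exps, return_inverse=True)
counts = np.bincount(inverse)
sums = np.bincount(inverse, weights=x)
fast_means = sums / counts

print(levels, means, variances, fast_means)
```

For a handful of levels the comprehension is probably fine. The bincount version mainly helps when there are many levels, since it avoids building one mask per level.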

--
Oleksandr (Sasha) Huziy

2012/8/1 Andreas Hilboll <lists at hilboll.de>

> > Hi there,
> >
> > I've just moved from R to IPython and wondered if there was a good way of
> > finding the means and/or variance of values in a dataframe given a factor
> >
> > e.g.:
> > if df =
> > x             experiment
> > 10            1
> > 13            1
> > 12            1
> > 3             2
> > 4             2
> > 6             2
> > 33            3
> > 44            3
> > 55            3
> >
> > In R, with tapply, you would do:
> >
> > tapply(df$x, list(df$experiment), mean)
> > tapply(df$x, list(df$experiment), var)
> >
> > I guess I can always loop through the array for each experiment type, but
> > thought that this is the kind of functionality that would be included in a
> > core library.
>
> Pandas (http://pandas.pydata.org/) seems to be what you're looking for. It
> has a DataFrame class which allows grouping of data.
>
> Cheers, Andreas.
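
For the pandas route suggested above, a rough sketch of the tapply equivalent might look like the following. It assumes pandas is installed and rebuilds the example table by hand; the column names x and experiment are taken from the original question.

```python
import pandas as pd

# Rebuild the example table from the original question.
df = pd.DataFrame({
    "x": [10, 13, 12, 3, 4, 6, 33, 44, 55],
    "experiment": [1, 1, 1, 2, 2, 2, 3, 3, 3],
})

# Group the x column by the experiment factor, then aggregate.
grouped = df.groupby("experiment")["x"]
means = grouped.mean()
variances = grouped.var()  # pandas uses ddof=1 by default, like R's var()

print(means)
print(variances)
```

Both results are Series indexed by the experiment label, so means.loc[2] gives the mean for experiment 2.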
>