[Pandas-dev] Mean, stdev, and var for array elements
Tyler Hardin
th020394 at gmail.com
Sat Oct 7 01:55:37 EDT 2017
Hi,
I'd really like to be able to calculate the mean, stdev, and var across
the Series objects stored in the cells of a dataframe column. It already
works as I expect it to with sum.
Example:
import pandas as pd
a = pd.Series([1, 2, 3, 4]) * 1.
b = pd.Series([1, 2, 3, 4]) * 2.
c = pd.Series([1, 2, 3, 4]) * 3.
d = pd.Series([1, 2, 3, 4]) * 4.
df = pd.DataFrame({'a' : [a,b,c,d]}, index=[0, 1, 2, 3])
print(df.a.sum())
Output:
0    10.0
1    20.0
2    30.0
3    40.0
dtype: float64
This is very useful for embedding a third dimension within a single column
(because it's only needed there) instead of going full multi-index.
For example, say you have a dataframe indexed on (date, stock) and in the
dataframe you have columns for close pnl, close gmv, etc. Further, say you
have a pnl_curve column, a minute-indexed (intraday) timeseries (again,
unique per date, stock). As in, each (date, stock) has an associated
intraday pnl curve (pd.Series object) in the column.
From that setup, I want to reduce away the stock dimension. I might want to
sum the pnl curves (to get overall intraday pnl curves for each date). This
actually works already. (As simple as df.pnl_curve.sum()). But I'd also
like to plot the mean pnl and std bands around that. Neither mean nor std
work for this.
Can someone implement these functions for series, or help me do it right?
Or is there a better way?
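For reference, a minimal sketch of one possible workaround that stays outside
pandas internals. It assumes every cell in the column holds a pd.Series and
that those Series share the same index. The names wide, mean_curve, std_curve
and var_curve are only illustrative. The idea is to line the cell Series up
as columns of a temporary frame and reduce along axis=1:

```python
import pandas as pd

a = pd.Series([1, 2, 3, 4]) * 1.
b = pd.Series([1, 2, 3, 4]) * 2.
c = pd.Series([1, 2, 3, 4]) * 3.
d = pd.Series([1, 2, 3, 4]) * 4.
df = pd.DataFrame({'a': [a, b, c, d]}, index=[0, 1, 2, 3])

# Put each cell's Series side by side as one column of a temporary frame.
# Assumption: all cells are pd.Series objects with a common index.
wide = pd.concat(df['a'].tolist(), axis=1)

# Reduce across the cells (axis=1), keeping the inner index.
sum_curve = wide.sum(axis=1)    # same values as the df.a.sum() output above
mean_curve = wide.mean(axis=1)
std_curve = wide.std(axis=1)    # sample standard deviation (ddof=1 by default)
var_curve = wide.var(axis=1)    # sample variance (ddof=1 by default)

print(mean_curve)
print(std_curve)
```

The mean curve and mean_curve +/- std_curve could then be plotted as the
bands. This copies the data into a new frame, so it is a stopgap rather than
a replacement for proper support in Series.mean and Series.std.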
It seems the implementation for mean is as simple as removing
_ensure_numeric in core/nanops.py. As for nanvar, I'm really not sure how
to 1) use numpy functions to calculate what I need and 2) extend the
function to accept dtype object without making it more likely to give
cryptic errors when someone accidentally uses it with objects. (E.g. Pandas
seems to be careful to throw meaningful ValueErrors and TypeErrors when it
can. Amateurishly loosening restrictions defeats that.)
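To illustrate the kind of check I mean in point 2, here is a rough sketch of
a user-level helper. The name cell_mean is hypothetical and is not part of
pandas or nanops. It assumes the cells are meant to hold pd.Series objects
and raises a TypeError naming the offending types before any arithmetic is
attempted:

```python
import pandas as pd

def cell_mean(column):
    """Hypothetical helper: element-wise mean of the Series stored in `column`."""
    cells = column.tolist()
    # Collect the type names of any cells that are not Series, so the
    # error message says what was actually found.
    bad = sorted({type(c).__name__ for c in cells if not isinstance(c, pd.Series)})
    if bad:
        raise TypeError(
            "cell_mean expects every cell to hold a pd.Series, found: "
            + ", ".join(bad)
        )
    # All cells are Series: align them as columns and average across them.
    return pd.concat(cells, axis=1).mean(axis=1)

curves = pd.Series([pd.Series([1., 2.]), pd.Series([3., 4.])])
print(cell_mean(curves))

try:
    cell_mean(pd.Series(['oops', 1.0]))
except TypeError as err:
    print(err)
```

Whether a check like this belongs in nanops itself is exactly the part I'm
unsure about.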
Regards,
Tyler