pandas read dataframe and sum all value same month and year
Diego Avesani
diego.avesani at gmail.com
Mon Feb 4 14:38:12 EST 2019
Deal all,
following Peter's suggestion,
I put the example code:
import pandas as pd
import numpy as np
from datetime import datetime
#input:
start_date = np.array(["2012-01-01 06:00",'2013-01-01 06:00','2014-01-01 06:00'])
end_date = np.array(["2013-01-01 05:00",'2014-01-01 05:00','2015-01-01 05:00'])
yearfolder = np.array(['2012','2013','2014'])
for ii in range(0, 1):
df = pd.read_csv('dati.csv',delimiter=',',header=0,parse_dates=True,na_values=-999)
df['datatime'] = df['datatime'].map(lambda x: datetime.strptime(str(x), "%Y-%m-%d %H:%M"))
mask = (df['datatime'] > str(start_date[ii])) & (df['datatime'] <= str(end_date[ii]))
df = df.loc[mask]
df = df.reset_index(drop=True)
#
df.groupby(pd.TimeGrouper('m')).sum()
and the example of file:
datatime,T,RH,PSFC,DIR,VEL10,PREC,RAD,CC,FOG
2012-01-01 06:00, 0.4,100, 911,321, 2.5, 0.0, 0, 0,0
2012-01-01 07:00, 0.8,100, 911,198, 0.8, 0.0, 0, 22,0
2012-01-01 08:00, 0.6,100, 912, 44, 1.2, 0.0, 30, 22,0
2012-01-01 09:00, 3.1, 76, 912, 22, 0.8, 0.0, 134, 44,0
2012-01-01 10:00, 3.4, 77, 912, 37, 0.5, 0.0, 191, 67,0
2012-01-01 11:00, 3.5,100, 912,349, 0.4, 0.0, 277, 44,0
2012-01-01 12:00, 3.6,100, 912, 17, 0.9, 0.0, 292, 22,0
2012-01-01 13:00, 3.5,100, 912, 28, 0.3, 0.0, 219, 44,0
2012-01-01 14:00, 3.3, 68, 912, 42, 0.5, 0.0, 151, 22,0
Hope this could help in finding a way to sum value belonging to the same month.
Thanks again, a lot
Diego
On Monday, 4 February 2019 15:50:52 UTC+1, Diego Avesani wrote:
> Dear all,
>
> I am reading the following data-frame:
>
> datatime,T,RH,PSFC,DIR,VEL10,PREC,RAD,CC,FOG
> 2012-01-01 06:00, -0.1,100, 815,313, 2.6, 0.0, 0, 0,0
> 2012-01-01 07:00, -1.2, 93, 814,314, 4.8, 0.0, 0, 0,0
> 2012-01-01 08:00, 1.7, 68, 815,308, 7.5, 0.0, 41, 11,0
> 2012-01-01 09:00, 2.4, 65, 815,308, 7.4, 0.0, 150, 33,0
> 2012-01-01 10:00, 3.0, 64, 816,305, 8.4, 0.0, 170, 44,0
> 2012-01-01 11:00, 2.6, 65, 816,303, 6.3, 0.0, 321, 22,0
> 2012-01-01 12:00, 2.0, 72, 816,278, 1.3, 0.0, 227, 22,0
> 2012-01-01 13:00, -0.0, 72, 816,124, 0.1, 0.0, 169, 22,0
> 2012-01-01 14:00, -0.1, 68, 816,331, 1.4, 0.0, 139, 33,0
> 2012-01-01 15:00, -4.0, 85, 816,170, 0.6, 0.0, 49, 0,0
> ....
> ....
>
> I read the data frame as:
>
> df = pd.read_csv('dati.csv',delimiter=',',header=0,parse_dates=True,na_values=-999)
> df['datatime'] = df['datatime'].map(lambda x: datetime.strptime(str(x), "%Y-%m-%d %H:%M"))
> #
> mask = (df['datatime'] > str(start_date[ii])) & (df['datatime'] <= str(end_date[ii]))
> df = df.loc[mask]
> df = df.reset_index(drop=True)
>
> I would to create an array with the sum of all the PREC value in the same month.
>
> I have tried with:
>
> df.groupby(pd.TimeGrouper('M')).sum()
>
> But as always, it seems that I have same problems with the indexes. Indeed, I get:
> 'an instance of %r' % type(ax).__name__)
> TypeError: axis must be a DatetimeIndex, but got an instance of 'Int64Index'
>
> thanks for any kind of help,
> Really Really thanks
>
> Diego
More information about the Python-list
mailing list