[Tutor] using datetime and calculating hourly average

Skipper Seabold jsseabold at gmail.com
Tue Jul 7 18:29:56 CEST 2009


On Tue, Jul 7, 2009 at 6:16 AM, John [H2O]<washakie at gmail.com> wrote:
>
> Here's a function I wrote to calculate hourly averages:
>
> It seems a bit slow, however... any thoughts on how to improve it?
>
> def calc_hravg(X):
>    """Calculates hourly average from input data"""
>
>    X_hr = []
>    minX = X[:,0].min()
>    hr = dt.datetime(*minX.timetuple()[0:4])
>
>    while hr <= dt.datetime(*X[-1,0].timetuple()[0:4]):
>        nhr = hr + dt.timedelta(hours=1)
>        ind = np.where( (X[:,0] > hr) & (X[:,0] < nhr) )
>        vals = X[ind,1][0].T
>        try:
>            #hr_avg = np.sum(vals) / len(vals)
>            hr_avg = np.average(vals)
>
>        except:
>            hr_avg = np.nan
>        X_hr.append([hr,hr_avg])
>        hr = hr + dt.timedelta(hours=1)
>
>    return np.array(X_hr)
>
>

One quick thought, as I haven't read your code very carefully: calling
reduce on the ufunc directly is faster than sum or average (though you
sacrifice readability) if the ndarray is big enough to matter, i.e.,
instead of np.average(vals) you could write
np.add.reduce(vals)/len(vals).  You might have better luck on the numpy
mailing list <http://www.scipy.org/Mailing_Lists>.  It's very active,
and there are people there who are much more knowledgeable than I am.
You might want to include a sample of your X in your post to help them
help you optimize.
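
A rough sketch of the reduce idea, in case it helps. The helper name
mean_via_reduce and the choice to return NaN for an empty hour are
assumptions for illustration, not something taken from your code, and
whether it is actually faster on your data is worth checking with timeit:

```python
import numpy as np

def mean_via_reduce(vals):
    """Mean of a 1-D sequence using np.add.reduce instead of np.average."""
    vals = np.asarray(vals, dtype=float)
    if len(vals) == 0:
        # Assumption: an hour with no samples should yield NaN.
        return np.nan
    # Sum with the ufunc's reduce method, then divide by the sample count.
    return np.add.reduce(vals) / len(vals)

vals = np.array([1.0, 2.0, 3.0, 6.0])
result = mean_via_reduce(vals)   # same value as np.average(vals)
empty = mean_via_reduce([])      # NaN rather than an exception
```

Checking the length explicitly would also let you drop the try/except
around the average in your loop.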

Cheers,
Skipper

