[Tutor] group txt files by month

Peter Otten __peter__ at web.de
Thu Apr 5 08:57:27 CEST 2012


questions anon wrote:

> I have been able to write up what I want to do (using glob) but I am not
> sure how to loop it or simplify it to make the script more efficient.
> I am currently:
> -grouping the same months in a year using glob
> -opening the files in a group and combining the data using a list
> -finding max, min etc for the list and printing it
> 
> I need to do this for many years and therefore many months so really need
> a way to make this more efficient.
> Any feedback will be greatly appreciated
> 
> MainFolder=r"E:/rainfall-2011/"
> OutputFolder=r"E:/test_out/"
> r201101=glob.glob(MainFolder+"r201101??.txt")
> r201102=glob.glob(MainFolder+"r201102??.txt")
> r201103=glob.glob(MainFolder+"r201103??.txt")
> 
> rain201101=[]
> rain201102=[]
> rain201103=[]
> monthlyrainfall=[]
> 
> for ifile in r201101:
>     f=np.genfromtxt(ifile, skip_header=6)
>     rain201101.append(f)
> 
> for ifile in r201102:
>     f=np.genfromtxt(ifile, skip_header=6)
>     rain201102.append(f)
> 
> for ifile in r201103:
>     f=np.genfromtxt(ifile, skip_header=6)
>     rain201103.append(f)
> 
> print "jan", np.max(rain201101), np.min(rain201101), np.mean(rain201101),
> np.median(rain201101), np.std(rain201101)
> print "feb", np.max(rain201102), np.min(rain201102), np.mean(rain201102),
> np.median(rain201102), np.std(rain201102)
> print "mar", np.max(rain201103), np.min(rain201103), np.mean(rain201103),
> np.median(rain201103), np.std(rain201103)

Strip the code down to one month

> r201101=glob.glob(MainFolder+"r201101??.txt")
> rain201101=[]
> for ifile in r201101:
>     f=np.genfromtxt(ifile, skip_header=6)
>     rain201101.append(f)


then turn it into a function, roughly

import glob

GLOBTEMPLATE = "e:/rainfall-{year}/r{year}{month:02}??.txt"
def accumulate_month(year, month):
    # {month:02} zero-pads the month, so month=3 matches r201103??.txt
    files = glob.glob(GLOBTEMPLATE.format(year=year, month=month))
    # read files, calculate and write stats
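
One way the placeholder comment might be filled in, as a sketch only: it
assumes every file has the 6 header lines from your script and that all
values for the month can be pooled into one flat array. The template
parameter and the returned tuple are my additions for illustration, not
part of your original code.

```python
import glob

import numpy as np

GLOBTEMPLATE = "e:/rainfall-{year}/r{year}{month:02}??.txt"


def accumulate_month(year, month, template=GLOBTEMPLATE):
    """Return (max, min, mean, median, std) for one month, or None if no files match."""
    files = sorted(glob.glob(template.format(year=year, month=month)))
    if not files:
        return None
    # skip_header=6 mirrors the original script; ravel() flattens each
    # file's grid so the days can be joined into one 1-D array.
    data = np.concatenate(
        [np.genfromtxt(name, skip_header=6).ravel() for name in files]
    )
    return (np.max(data), np.min(data), np.mean(data),
            np.median(data), np.std(data))
```

Returning the numbers instead of printing them leaves it to the caller
to decide whether to print them or write them to a file.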

and finally put it into a loop:

from datetime import date, timedelta
stop_month = date(2012, 4, 1)
month = date(2011, 1, 1)
while month < stop_month:
    accumulate_month(month.year, month.month)
    month += timedelta(days=32)
    month = month.replace(day=1)
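
The stepping works because no month has more than 31 days: starting
from day 1, adding 32 days always lands somewhere in the following
month, and replace(day=1) snaps back to its first day. If you prefer,
the same idea can be wrapped in a generator. This is only a sketch, and
the name iter_months is made up for illustration.

```python
from datetime import date, timedelta


def iter_months(start, stop):
    """Yield the first day of each month from start up to, but excluding, stop."""
    month = start.replace(day=1)
    while month < stop:
        yield month
        # From day 1, 32 days later is always inside the next month.
        month = (month + timedelta(days=32)).replace(day=1)


for month in iter_months(date(2011, 1, 1), date(2012, 4, 1)):
    print(month.year, month.month)
```

Inside that for loop you would call accumulate_month(month.year,
month.month) instead of print.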





