Mailman 3 how to get only complete years from series? - SciPy-User

how to get only complete years from series?

Timmie

17 Nov 2008

10:34 p.m.

Hello,

I am using scikits.timeseries to evaluate a long-term measurement data set.

How can I extract those years which have complete measurements?

In the code below, years 2004 & 2008 are not complete. Is there a generic way to get all incomplete years masked?

Thanks & regards,
Timmie

###code

import numpy as np
import numpy.ma as ma
import scikits.timeseries as ts

data = np.arange(0, 40800)
start_dt = ts.Date(freq='H', year=2004, month=3, day=1, hour=0)
s_all = ts.time_series(data, freq='H', start_date=start_dt)


Pierre GM

18 Nov

6:15 p.m.

New subject: [SciPy-user] how to get only complete years from series?

Timmie,

There's no generic function to perform what you want, as it'll depend on the frequency. What you can do is:

1. Get a list of years:

singleyears = set(s_all.years)

2. For each year, find the first and last days of the year that are present in the series:

firstandlast = [tuple([year] +s_all[s_all.years==year].yeardays[[0,-1]].tolist()) for year in singleyears]

That gives you a list of tuples (year, first day, last day).

3. Find the years for which the first day is strictly larger than 1 or the last day is strictly lower than 365:

maskyears = [y for (y,f,l) in firstandlast if f>1 or l<365]

4. Mask the corresponding years:

for y in maskyears: s_all[s_all.years==y] = ma.masked

That's far from efficient and rather ugly, but that should give you a generic idea. Let me know how it goes.

P.
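For readers who do not have scikits.timeseries at hand, the four steps of this reply can be sketched with plain NumPy. This is only an illustration under assumptions: the hourly timestamps are rebuilt with np.datetime64 instead of ts.Date, and the names stamps, years and yeardays are introduced here and do not come from the thread. The thresholds (first day > 1, last day < 365) are the ones used in the reply.

```python
import numpy as np
import numpy.ma as ma

# Same span as in the question: 40800 hourly samples from 2004-03-01 00:00.
data = ma.masked_array(np.arange(40800, dtype=float))
start = np.datetime64("2004-03-01T00", "h")
stamps = start + np.arange(data.size).astype("timedelta64[h]")

# Calendar year of every sample and its 1-based day of the year
# (stand-ins for s_all.years and s_all.yeardays; an assumption of this sketch).
year_start = stamps.astype("datetime64[Y]")
years = year_start.astype(int) + 1970
yeardays = (stamps.astype("datetime64[D]")
            - year_start.astype("datetime64[D]")).astype(int) + 1

# Steps 1-3: a year is flagged when its first sample falls after day 1
# or its last sample falls before day 365.
maskyears = []
for year in np.unique(years):
    days_in_year = yeardays[years == year]
    if days_in_year[0] > 1 or days_in_year[-1] < 365:
        maskyears.append(int(year))

# Step 4: mask every sample that belongs to a flagged year.
for year in maskyears:
    data[years == year] = ma.masked

print(maskyears, data.count())
```

Two limits of this check follow directly from the thresholds: a leap year that stops on day 365 (30 December) still counts as complete, and only days are compared, so a year whose first or last day is only partly covered by hourly samples is not detected.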

Pierre GM

9:23 p.m.

New subject: [SciPy-user] how to get only complete years from series?

Timmie,

There's a smarter way than the previous answer, if you're not afraid of temporary arrays. Here's a copy-pasted version, commented. Let me know how it goes.

Cheers
P.

#### BELOW A SAMPLE SCRIPT THAT MAY ILLUSTRATE ####

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import numpy as np
import numpy.ma as ma
import scikits.timeseries as ts

data = np.arange(0, 40800)
start_dt = ts.Date(freq='H', year=2004, month=3, day=1, hour=0)
s_all = ts.time_series(data, freq='H', start_date=start_dt)

# Convert to a (5,24*366) annual series: each row is a year, each column an hour.
# Because of leap years, we have 24*366 cols, not 24*365.
a_s_all = s_all.convert('A')

# If the first column (the first date) is masked, mask the row.
a_s_all[a_s_all[:,0].mask] = ma.masked

# If the column -25 (last hour of 12/31 or 12/30) is masked, mask the row.
a_s_all[a_s_all[:,-25].mask] = ma.masked

# Make a new series from the annual series.
# We can't use convert because the annual series is 2D.
# Instead, we create a new series starting at the first date of the annual series,
# converted to the correct frequency (s_all.freq).
# As the method asfreq defaults to END, we need to force 'START' for relation
# (check the docstring of asfreq).
starting_date = a_s_all.dates[0].asfreq(s_all.freq, relation='START')

# For the data, we can't use a_s_all.ravel() directly because a_s_all is 2D,
# but we only need the data actually, not the dates.
s_new = ts.time_series(a_s_all._series.ravel(), start_date=starting_date)

# And if you want, you can force the starting and ending dates of this new series
# to the initial ones.
s_mod = ts.align_with(s_all, s_new)
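The idea in this second reply (one row per year, one column per hour, then mask whole rows) can also be tried without scikits.timeseries. The sketch below is an assumption-laden stand-in rather than the script above: it builds the year-by-hour table by hand with NumPy instead of calling convert('A'), and the names table, row, col and incomplete are made up for the illustration.

```python
import numpy as np
import numpy.ma as ma

HOURS = 24 * 366  # row width, wide enough for a leap year

# Same span as in the question: 40800 hourly samples from 2004-03-01 00:00.
values = np.arange(40800, dtype=float)
start = np.datetime64("2004-03-01T00", "h")
stamps = start + np.arange(values.size).astype("timedelta64[h]")

# Row = year offset from the first year, column = hour offset within the year.
year_start = stamps.astype("datetime64[Y]")
row = (year_start - year_start[0]).astype(int)
col = (stamps - year_start.astype("datetime64[h]")).astype(int)

# Start from a fully masked table and fill in the hours that were measured.
table = ma.masked_all((int(row[-1]) + 1, HOURS))
table[row, col] = values

# A row is incomplete when its first hour is missing, or when the last hour
# of day 365 (column -25, as in the reply) is missing.
incomplete = ma.getmaskarray(table[:, 0]) | ma.getmaskarray(table[:, -25])
table[incomplete] = ma.masked

# Back to a 1D series aligned with the original timestamps.
result = table[row, col]
print(incomplete.tolist(), result.count())
```

As in the reply, checking column -25 rather than the last column means the final day of a leap year is not inspected; the last 24 columns exist only for leap years and stay masked in the other rows.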
