scikits.timeseries DateArray Question
Hi,
Thanks to Pierre's suggestion of using the virtualenv package, I now have a working install of the scikits.timeseries package. I have some questions about constructing a timeseries from data in a file. From the file I am reading Year,Month,Day,Hour,Min,Data and I need to convert them to a timeseries. I've been looking through the documentation and the mailing list archive, and I'm not sure how to create a DateArray that contains a non-uniform list of datetimes. The data is mostly at a 15 minute frequency, but sometimes it may not fall exactly at 00,15,30,45 mins etc., and at other times it may not be present. All the examples I've seen involve data that is present at a fixed frequency.
i.e.
#Year Month Day Hour Min Data
2008 01 01 10 00 2.9
2008 01 01 10 15 3.2
2008 01 01 10 33 3.1
2008 01 01 12 45 3.0
2008 01 02 11 15 3.4
...
Is there a way to read these dates into a DateArray so I can create a timeseries?
thanks,
- dharhas
Dharhas,
The documentation is a bit scarce indeed, and some functions are being rewritten (e.g., loadtxt). For now, here's what you can do:
* First, load your data into an array with np.loadtxt, matplotlib.mlab.csv2rec, whatever.
loaded = np.loadtxt(...)
As in your example, we'll assume that the array is 6 cols wide, the first five being year, month, day, hour and min and the last one some data. There should be no missing values in any of the first 5 cols; otherwise, find a way to fill them. Because your data are every 15 min or so, we need to use a 'minute' frequency (code 'T' or 'MIN'). There might be gaps in dates; that's OK as long as the whole line is missing. For now, let's use
loaded = [(2008, 1, 1, 12, 0, 1.0), (2008, 1, 1, 12, 15, 2.0), (2008, 1, 1, 18, 0, 3.0)]
* Then, construct a DateArray from those first 5 cols. The simplest is to rely on datetime for that:
import datetime
import scikits.timeseries as ts
dates = ts.date_array([datetime.datetime(yy, mm, dd, hh, nn) for (yy, mm, dd, hh, nn, _) in loaded], freq='MIN')
* Now, construct your time series
series = ts.time_series([_[-1] for _ in loaded], dates=dates)
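For the file-reading step, a minimal standard-library sketch of turning the Year/Month/Day/Hour/Min columns into datetime objects could look like the following. The helper name `parse_rows` and the inline `sample` text are illustrative assumptions, not part of scikits.timeseries; the resulting lists are the kind of input that the `ts.date_array(..., freq='MIN')` and `ts.time_series(...)` calls above expect.

```python
import datetime

# Sample text in the layout from the original question
# (assumed to be whitespace-separated, with a '#' header line).
sample = """#Year Month Day Hour Min Data
2008 01 01 10 00 2.9
2008 01 01 10 15 3.2
2008 01 01 10 33 3.1
2008 01 01 12 45 3.0
2008 01 02 11 15 3.4
"""


def parse_rows(text):
    """Return (dates, values) parsed from whitespace-separated lines.

    Hypothetical helper: blank lines and '#' comment lines are skipped.
    """
    dates, values = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        yy, mm, dd, hh, nn, value = line.split()
        # datetime needs integers, so convert each date field explicitly.
        dates.append(datetime.datetime(int(yy), int(mm), int(dd), int(hh), int(nn)))
        values.append(float(value))
    return dates, values


dates, values = parse_rows(sample)
print(dates[2], values[2])
```

Note that the irregular 10:33 row is kept as-is; nothing here snaps it to a 15-minute grid.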
Let me know how it goes.
P.
On Dec 3, 2008, at 12:16 PM, Dharhas Pothina wrote:
Hi,
Thanks to Pierre's suggestion of using the virtualenv package, I now have a working install of the scikits.timeseries package. I have some questions about constructing a timeseries from data in a file. From the file I am reading Year,Month,Day,Hour,Min,Data and I need to convert them to a timeseries. I've been looking through the documentation and the mailing list archive, and I'm not sure how to create a DateArray that contains a non-uniform list of datetimes. The data is mostly at a 15 minute frequency, but sometimes it may not fall exactly at 00,15,30,45 mins etc., and at other times it may not be present. All the examples I've seen involve data that is present at a fixed frequency.
i.e.
#Year Month Day Hour Min Data
2008 01 01 10 00 2.9
2008 01 01 10 15 3.2
2008 01 01 10 33 3.1
2008 01 01 12 45 3.0
2008 01 02 11 15 3.4
...
Is there a way to read these dates into a DateArray so I can create a timeseries?
thanks,
- dharhas
_______________________________________________ SciPy-user mailing list SciPy-user@scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user
Hello Dharhas, welcome as a new timeseries user! Learning this scikit will soon pay off. I have seen a huge boost in the simplicity and usability of my analysis through the code I wrote using the timeseries. Special praise shall be given to the developers Pierre & Matt. By patiently answering my "advanced python newbie" questions they really helped me to get the maximum out of the numpy.ma and scikits.timeseries tool box. This really brought my numerical python coding forward.
The documentation is a bit scarce indeed, and some functions are being rewritten (eg, loadtxt). For now, here's what you can do:
I tried to add some of my questions to the cookbook. Here is one that may help you in the current situation:
More extensive answer - http://www.scipy.org/Cookbook/TimeSeries/FAQ#head-9f5c8c4d4aa0de90c9851b972d...
All other Q&A, in the form of mails sent in by other users and myself, are at gmane:
* creating timeseries for non-conventional custom frequencies - http://article.gmane.org/gmane.comp.python.scientific.user/15688
* search for "time series" or timeseries - http://search.gmane.org/?query=%22time+series%22+timeseries&author=&group=gmane.comp.python.scientific.user&sort=relevance&DEFAULTOP=or&xP=time%09series&xFILTERS=Gcomp.python.scientific.user---A
Some answers to feature requests may also help: http://scipy.org/scipy/scikits/query?status=new&status=assigned&status=reopened&status=closed&component=timeseries&order=priority
@Pierre, are there plans to include timeseries in the scipy online doc editor? What form do you suggest if I would like to contribute examples here and there?
* Now, construct your time series
series = ts.time_series([_[-1] for _ in loaded], dates=dates)
After this step you'd probably want to fill the missing dates:
series_filled = series.fill_missing_dates()
=> Now you can save the data to csv using reportlib from the scikit and do other neat things.
Hope that helps.
Regards,
Timmie
Thank you Tim. I've been following this package for a while. It looks really impressive. The only thing that was holding me back was installation issues, and the virtualenv stuff has fixed that.
A question: why do I need to fill missing dates? Is it required for other things like calculating daily averages etc., or is there another reason?
@Pierre & Matt: Please don't take my earlier emails as criticism of the documentation. I am extremely thankful that you have taken the time to develop this package. Seconding Tim, I would like to contribute examples/howtos based on the work I'm doing. If you have any guidance on the best way to do this, that would be great.
- dharhas
Tim Michelsen <timmichelsen@gmx-topmail.de> 12/3/2008 3:18 PM >>>
Hello Dharhas, welcome as a new timeseries user! Learning this scikit will soon pay off. I have seen a huge boost in the simplicity and usability of my analysis through the code I wrote using the timeseries. Special praise shall be given to the developers Pierre & Matt. By patiently answering my "advanced python newbie" questions they really helped me to get the maximum out of the numpy.ma and scikits.timeseries tool box. This really brought my numerical python coding forward.
On Dec 3, 2008, at 4:40 PM, Dharhas Pothina wrote:
A question. Why do I need to fill missing dates? Is it required for other things like calculating daily averages etc or is there another reason?
Well, it is required in some operations, in particular conversion from one frequency to another. If you don't get any error message about the dates being incomplete, you're OK. If you do, just use fill_missing_dates. I recognize that it's a lot of wasted space when you have a 15min-interval series, for example, as you end up with a LOT of missing data. Keep in mind that the package was initially designed for Matt's issues and mine, and we both usually work with daily frequencies or lower (monthly...).
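To get a feel for how much padding a minute-frequency series implies, here is a rough standard-library illustration of what filling missing dates amounts to conceptually. It is only a sketch under stated assumptions: `fill_minute_grid` is a hypothetical helper, `None` stands in for a masked value, and this is not how fill_missing_dates is implemented in the package.

```python
import datetime


def fill_minute_grid(dates, values):
    """Return one (date, value) pair per minute between the first and last date.

    Hypothetical helper: minutes without an observation get None.
    """
    lookup = dict(zip(dates, values))
    step = datetime.timedelta(minutes=1)
    current, last = min(dates), max(dates)
    filled = []
    while current <= last:
        filled.append((current, lookup.get(current)))
        current += step
    return filled


# The three observations from the small example earlier in the thread.
dates = [
    datetime.datetime(2008, 1, 1, 12, 0),
    datetime.datetime(2008, 1, 1, 12, 15),
    datetime.datetime(2008, 1, 1, 18, 0),
]
values = [1.0, 2.0, 3.0]

filled = fill_minute_grid(dates, values)
missing = sum(1 for _, value in filled if value is None)
print(len(filled), missing)  # 361 slots, 358 of them empty
```

Three observations spread over six hours become 361 one-minute slots, which is the wasted space mentioned above.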
@Pierre & Matt: Please don't take my earlier emails as criticism of the documentation. I am extremely thankful that you have taken the time to develop this package. Seconding Tim, I would like to contribute examples/howtos based on the work I'm doing. If you have any guidance on the best way to do this, that would be great.
Oh, don't worry, we don't take it personally. We'd be delighted to have some help with the documentation: it's always difficult to put oneself back in the shoes of a newbie when one has been working with a package for a while. Tutorials and how-tos would be great indeed. I'll give you the same answer as to Tim: just drop us a line with your material, and we'll find a way to put it on the SVN and the online doc. Thanks again for your support!
On Dec 3, 2008, at 4:18 PM, Tim Michelsen wrote:
A special praise shall be given to the developers Pierre & Matt. By patiently answering my "advanced python newbie" questions they really help me to get the maximum of the numpy.ma and scikits.timeseries tool box. This did really bring my numerical python coding forward.
Wow, thanks a lot! I should fwd this message to the-one-of-my-bosses-who-has-the-money. He'll probably complain that I don't write enough papers... But still, I really appreciate it. Thanks again.
@Pierre, are there plans to include timeseries in the scipy online doc editor? What form do you suggest if I would like to contribute examples here and there?
Well, Matt and I have been considering making a first official release for a while, but we keep postponing it (I'm the one to blame). Hopefully we should be ready for early 2009 (a mere 5 weeks away). Then, we'll see how we can incorporate the scikits in scipy, or at least get the docs in the scipy online doc editor. Until then, the easiest is to contact either Matt or me offlist, so that we can take your comments into account.
participants (3)
- Dharhas Pothina
- Pierre GM
- Tim Michelsen