[SciPy-User] Status of TimeSeries SciKit

Matt Knox mattknox.ca at gmail.com
Tue Jul 26 13:58:27 EDT 2011


> >>> I work very actively on time
> >>> series-related functionality in pandas so it might not even be
> >>> unthinkable to merge together the projects (scikits.timeseries and
> >>> pandas) and integrate all the numpy.datetime64 stuff once the dust
> >>> settles there. Just thinking out loud.
> >> 
> >> That's an idea.
> >> 
> > 
> > Any thoughts on the idea? Do you think it's reasonable and/or
> > beneficial? There is also some talk with the scikits.learn and
> > scikits.statsmodels to drop the scikits namespace, which would be
> > better as a collective decision, so the merging could be a part of
> > this? I use both packages now, and I, for one, would love to see them
> > come together and share to the extent this is feasible. Others? I
> > especially like the plotting stuff since it's great but I've had to
> > make a few local patches here and there for mpl changes.
> 
> No surprise for matplotlib. I kinda dropped the ball here (when I need to
> plot stuff these days, I don't use mpl). I haven't used pandas yet, for the
> same reasons I wasn't able to keep up with updating scikits.timeseries.
> But if y'all use the two in parallel and have a need for porting
> scikits.timeseries to pandas, then go for it, you have my blessing. And you
> know where to contact me if you have some issues or questions. 

I would basically echo Pierre's comments here. I don't have the time (or to
be perfectly honest, the energy and motivation) to maintain the timeseries
module anymore and would definitely be in favor of any efforts to merge its
functionality into a better supported module.

It's clear at this point that the timeseries module in its current form is a
dead end, given both the lack of maintainers and the fundamental building
blocks now coming into place that would allow a better timeseries module.
Those building blocks are:
    
    1. datetime data type support in numpy
    2. improved missing value support in numpy
    3. data array / labelled array / pandas type of stuff which should (in
       theory) simplify indexing a timeseries with dates relative to the large
       hacks used in the current timeseries module
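
As a rough sketch of how these three pieces can fit together, assuming a
reasonably recent numpy and pandas: the variable names below are illustrative
only, and NaN is used as a stand-in for a missing observation rather than any
dedicated missing-value mechanism in numpy.

```python
import numpy as np
import pandas as pd

# 1. Native datetime dtype: five consecutive days as datetime64 values.
dates = np.arange("2011-07-25", "2011-07-30", dtype="datetime64[D]")

# 2. Missing values: NaN stands in for a missing observation (an assumption
#    made for this sketch).
values = np.array([1.0, np.nan, 3.0, 4.0, 5.0])

# 3. Labelled array: a pandas Series indexed by the dates.
ts = pd.Series(values, index=pd.DatetimeIndex(dates))

# Select by date label instead of by integer position.
window = ts.loc["2011-07-26":"2011-07-28"]

# The default reduction skips the missing observation.
window_mean = window.mean()
```

Here the date-based slice includes both endpoints, so no custom date-to-offset
bookkeeping is needed to pick out a range of observations.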

In many ways, the timeseries module is a giant hack which tries to work around
the fact that it is missing these key foundational pieces in numpy.

If pandas is the module that unifies all these concepts into a cohesive
package, then I think that is fantastic!

And from lurking on the numpy and scipy mailing lists and monitoring all the
threads on the related topics recently, I feel confident that I have little to
contribute and that the problem rests in much more capable hands than my own :)

- Matt Knox




