[scikit-learn] Contribution to sklearn: Cross validation of time series

andres lago a_lago at hotmail.com
Fri Apr 28 12:26:56 EDT 2017


Hi Andy,

  I'll try to be more precise about the CV I'm proposing. Compared to the current TimeSeriesSplit, these would be the new parameters:

   - Rolling window or variable-length window:

Rolling window mode keeps the CV-training set the same size for all folds, shifting it forward at each CV iteration. Variable-length window mode grows the CV-training set at each fold iteration (this is what TimeSeriesSplit currently does).
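
To make the difference between the two modes concrete, here is a minimal pure-Python sketch. The function name `window_splits` and the `rolling` flag are hypothetical, chosen only for illustration; this is not the TimeSeriesSplit API and not the code proposed in this thread. It assumes equally sized test folds taken from the end of the series.

```python
def window_splits(n_samples, n_splits, rolling=False):
    """Yield (train, test) index lists for time-ordered data.

    Hypothetical helper for illustration only.
    rolling=False: the training set grows at every fold.
    rolling=True:  the training set keeps the length of the first fold
                   and slides forward.
    """
    test_size = n_samples // (n_splits + 1)
    first_test_start = n_samples - n_splits * test_size
    for k in range(n_splits):
        test_start = first_test_start + k * test_size
        # In rolling mode, drop the oldest samples so the training window
        # always has first_test_start elements.
        train_start = test_start - first_test_start if rolling else 0
        train = list(range(train_start, test_start))
        test = list(range(test_start, test_start + test_size))
        yield train, test


if __name__ == "__main__":
    for mode in (False, True):
        print("rolling" if mode else "variable length")
        for train, test in window_splits(10, 3, rolling=mode):
            print("  train", train, "test", test)
```

Both modes produce the same test folds; only the start of the training window differs, so the training set never contains samples that come after the test fold.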




________________________________
From: scikit-learn <scikit-learn-bounces+a_lago=hotmail.com at python.org> on behalf of Andreas Mueller <t3kcit at gmail.com>
Sent: Friday, April 28, 2017 05:48 PM
To: Scikit-learn user and developer mailing list
Subject: Re: [scikit-learn] Contribution to sklearn: Cross validation of time series

Hey Andres.
I think there might be a PR for that.
Can you explain the minimum size of the training set? How is that used?
I thought the other main option would be "rolling window" cross validation,
which uses a fixed-length CV training set.

So the two options to me were rolling window and what we're doing right now.
Can you elaborate on the other use cases, like minimum size of the training set
and why you would want the other options with a variable length training set?

Thanks,
Andy

On 04/27/2017 09:44 AM, andres lago wrote:

Hello,

  I'd like to contribute a new feature to sklearn: cross validation of time series. It's an evolution of the current functionality implemented by TimeSeriesSplit.


  TimeSeriesSplit only allows the user to set the number of folds. In practice, cross-validating time series requires other parameters, for instance:

    - minimum size of CV-training set

    - size of CV-test set

    - fixed or variable length of CV-training set.
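
As a rough illustration of how these three parameters could interact, here is a small pure-Python sketch. The function `time_series_cv` and its parameter names (`min_train_size`, `test_size`, `fixed_length`) are hypothetical and exist only for this example; they are not part of sklearn and are not the production code mentioned below. The sketch assumes test folds are consecutive and non-overlapping.

```python
def time_series_cv(n_samples, min_train_size, test_size, fixed_length=False):
    """Yield (train, test) index lists for time-ordered data.

    Hypothetical sketch, all parameter names are assumptions:
    min_train_size: number of samples in the first training set
    test_size:      number of samples in every test set
    fixed_length:   if True, every training set has min_train_size samples;
                    if False, the training set grows from the first sample
    """
    if min_train_size < 1 or test_size < 1:
        raise ValueError("min_train_size and test_size must be >= 1")
    test_start = min_train_size
    # Stop as soon as a full test fold no longer fits in the series.
    while test_start + test_size <= n_samples:
        train_start = test_start - min_train_size if fixed_length else 0
        train = list(range(train_start, test_start))
        test = list(range(test_start, test_start + test_size))
        yield train, test
        test_start += test_size


if __name__ == "__main__":
    for train, test in time_series_cv(10, min_train_size=4, test_size=3):
        print("train", train, "test", test)
```

With this design the number of folds is not set directly; it follows from the series length and the two sizes, which is one possible answer to how the minimum training size would be used.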


  The functionality is inspired by the R library 'caret'.


  If you agree, I can share my code. I developed it for a project with the French rail company SNCF, and it is now in production.


  Regards,

    Andres




