[scikit-learn] pipeline for modifying target and number of samples

David Burns david.mo.burns at gmail.com
Thu Aug 2 01:25:54 EDT 2018


Hi,

I posted a while back about this, and am reposting now since I have made 
progress on this topic. As you are probably aware, the sklearn Pipeline 
only supports transformers for X, and the number of samples must stay 
the same.

I work with time series where the learning pipeline relies on
transformations like resampling, segmentation, etc., that change the
target and number of samples in the data set. To address this,
I created an sklearn-compatible pipeline that handles transformers that
alter X, y, and sample_weight together. It can undergo model selection
using the sklearn tools, and integrates with all the sklearn
transformers and estimators. It also has some new options for setting
hyper-parameters with callables and in reference to other parameters.
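
To make the idea concrete, here is a minimal pure-Python sketch of a
pipeline whose transformers return X, y, and sample_weight together. It
is not the seglearn implementation: the class names
(SlidingWindowSegmenter, NearestMeanClassifier, XyPipeline) and their
signatures are hypothetical and chosen only for illustration.

```python
class SlidingWindowSegmenter:
    """Hypothetical transformer: cuts each series into fixed-width windows.

    Each window inherits the label and weight of its parent series, so the
    number of samples changes and y / sample_weight must change with it.
    """

    def __init__(self, width, step):
        self.width = width
        self.step = step

    def fit(self, X, y=None, sample_weight=None):
        return self  # stateless in this sketch

    def transform(self, X, y=None, sample_weight=None):
        Xt, yt, swt = [], [], []
        for i, series in enumerate(X):
            for start in range(0, len(series) - self.width + 1, self.step):
                Xt.append(series[start:start + self.width])
                if y is not None:
                    yt.append(y[i])
                if sample_weight is not None:
                    swt.append(sample_weight[i])
        return (Xt,
                yt if y is not None else None,
                swt if sample_weight is not None else None)


class NearestMeanClassifier:
    """Toy final estimator: one weighted mean per class, nearest one wins."""

    def fit(self, X, y, sample_weight=None):
        if sample_weight is None:
            sample_weight = [1.0] * len(X)
        totals, weights = {}, {}
        for window, label, w in zip(X, y, sample_weight):
            mean = sum(window) / len(window)
            totals[label] = totals.get(label, 0.0) + w * mean
            weights[label] = weights.get(label, 0.0) + w
        self.centers_ = {k: totals[k] / weights[k] for k in totals}
        self.n_samples_seen_ = len(X)
        return self

    def predict(self, X):
        return [
            min(self.centers_,
                key=lambda k: abs(self.centers_[k] - sum(w) / len(w)))
            for w in X
        ]


class XyPipeline:
    """Chains transformers that return (X, y, sample_weight) as a triple."""

    def __init__(self, steps, estimator):
        self.steps = steps
        self.estimator = estimator

    def fit(self, X, y, sample_weight=None):
        for step in self.steps:
            step.fit(X, y, sample_weight)
            X, y, sample_weight = step.transform(X, y, sample_weight)
        self.estimator.fit(X, y, sample_weight)
        return self

    def predict(self, X):
        # No target at prediction time; only X flows through the steps.
        for step in self.steps:
            X, _, _ = step.transform(X)
        return self.estimator.predict(X)


# Two series of different lengths, one label and one weight per series.
X = [[0.0, 0.1, 0.2, 0.1, 0.0, 0.1],
     [5.0, 5.1, 4.9, 5.0, 5.2, 5.1, 5.0, 4.8]]
y = ["low", "high"]

pipe = XyPipeline([SlidingWindowSegmenter(width=4, step=2)],
                  NearestMeanClassifier())
pipe.fit(X, y, sample_weight=[1.0, 2.0])
```

In this sketch the two input series become five windows before they
reach the final estimator, and predict returns one label per window
rather than one per input series.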

The implementation is in my time series package seglearn:

https://github.com/dmbee/seglearn

- Best

David Burns



