[scikit-learn] Three new scikit-learn-contrib projects

Gael Varoquaux gael.varoquaux at normalesup.org
Wed Jul 20 03:48:43 EDT 2016


Hey,

These packages look great! I am particularly interested in
imbalanced-learn, as it addresses a problem that we have stumbled upon:

> * imbalanced-learn: https://github.com/scikit-learn-contrib/imbalanced-learn

> Python module to perform under sampling and over sampling with various
> techniques.

Interestingly, the fit_sample method is related to the scikit-learn
enhancement proposal that we have tried to put together for objects that
can modify y in addition to X:
https://github.com/scikit-learn/enhancement_proposals/pull/2

I think that this enhancement of our API is important for two reasons.
The first is that, without it, the corresponding objects cannot be put in
a scikit-learn pipeline (imbalanced-learn ends up having its own
pipeline), and hence cannot benefit from hyper-parameter tuning on the
full set of steps, or cool things like DaskLearn. The second is that
different projects are likely to come up with similar but incompatible
solutions to this problem, making it harder to combine things.
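
To make the pipeline issue concrete, here is a minimal sketch in plain
Python. It is not the code of imbalanced-learn or scikit-learn, and
ToyUnderSampler, ToyClassifier and ToyPipeline are hypothetical names made
up for illustration. The point is only that a resampling step returns a
new y with a different number of samples, so the pipeline has to thread
both X and y through each step when fitting, which a transform(X) that
returns only X cannot express.

```python
import random


class ToyUnderSampler:
    """Hypothetical resampler: drops rows until all classes have equal size."""

    def __init__(self, seed=0):
        self.seed = seed

    def fit_sample(self, X, y):
        rng = random.Random(self.seed)
        # Group row indices by class label.
        by_class = {}
        for i, label in enumerate(y):
            by_class.setdefault(label, []).append(i)
        # Keep as many rows per class as the smallest class has.
        n_min = min(len(idx) for idx in by_class.values())
        keep = sorted(
            i for idx in by_class.values() for i in rng.sample(idx, n_min)
        )
        # Both X and y change length, which is the crux of the API question.
        return [X[i] for i in keep], [y[i] for i in keep]


class ToyClassifier:
    """Hypothetical final estimator: predicts the most frequent training label."""

    def fit(self, X, y):
        self.n_seen_ = len(y)
        self.prediction_ = max(set(y), key=y.count)
        return self

    def predict(self, X):
        return [self.prediction_ for _ in X]


class ToyPipeline:
    """Hypothetical pipeline that passes (X, y) through every resampling step."""

    def __init__(self, steps):
        self.steps = steps

    def fit(self, X, y):
        for step in self.steps[:-1]:
            X, y = step.fit_sample(X, y)  # y is rewritten, not just X
        self.steps[-1].fit(X, y)
        return self

    def predict(self, X):
        # Resampling only applies when fitting; prediction sees all rows.
        return self.steps[-1].predict(X)


X = [[0], [1], [2], [3], [4], [5]]
y = [0, 0, 0, 0, 1, 1]
pipe = ToyPipeline([ToyUnderSampler(seed=0), ToyClassifier()]).fit(X, y)
print(pipe.steps[-1].n_seen_)  # rows that reached the final estimator
```

With a shared convention for such steps, a single pipeline object could
be tuned over all of its steps at once, instead of each project shipping
its own variant.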

Unfortunately, I haven't had time to push this proposal forward. But
comments on it (or a pull request to it) would be awesome.

Cheers,

Gaël

