[scikit-learn] [Feature] drop_one in one hot encoder

Gael Varoquaux gael.varoquaux at normalesup.org
Sun Jun 25 13:01:10 EDT 2017


On Sun, Jun 25, 2017 at 05:18:09PM +0530, Parminder Singh wrote:
> Hi Sci-kittens! :-)

Nice :).

FYI: there is work in progress to replace the OneHotEncoder, as it has
many strong limitations:
https://github.com/scikit-learn/scikit-learn/pull/9151

It might be useful to have a look at this PR to make sure that it solves
the various use cases.

Gaël

> I was taking the Machine Learning A-Z course on Udemy, where they said that
> every time one-hot encoding is done, one of the columns should be dropped,
> since it duplicates information already carried by the other columns and is
> redundant to the model. Instead of having the user find the index and drop
> the column after preprocessing, what if OneHotEncoder had a drop_one
> parameter that automatically removed the last column? What are your
> thoughts on this? I am new to this community and would like to contribute
> this myself if it is an acceptable addition.
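
For illustration, the behaviour proposed in the quoted message can be sketched
in plain NumPy, without touching OneHotEncoder itself. The helper name
one_hot_drop_last is hypothetical, and drop_one is only the parameter name
suggested in the message, not an existing scikit-learn option. A minimal
sketch, assuming categories are ordered by sorting and the last one is dropped:

```python
import numpy as np


def one_hot_drop_last(labels):
    """One-hot encode labels, then drop the column of the last category.

    Hypothetical helper for illustration only; not part of scikit-learn.
    """
    labels = np.asarray(labels)
    # np.unique returns the sorted categories and, for each label,
    # the index of its category in that sorted array.
    categories, codes = np.unique(labels, return_inverse=True)
    # Row i of the identity matrix is the one-hot vector for category i.
    full = np.eye(len(categories), dtype=int)[codes]
    # Drop the last column: that category becomes the all-zero row.
    return categories, full[:, :-1]


categories, encoded = one_hot_drop_last(["red", "green", "blue", "green"])
print(categories)
print(encoded)
```

The dropped category ("red" in this example, since it sorts last) is then
represented by an all-zero row, which is what removes the redundancy among
the encoded columns.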

> Thanks,
> Trion129

-- 
    Gael Varoquaux
    Researcher, INRIA Parietal
    NeuroSpin/CEA Saclay , Bat 145, 91191 Gif-sur-Yvette France
    Phone:  ++ 33-1-69-08-79-68
    http://gael-varoquaux.info            http://twitter.com/GaelVaroquaux

