[scikit-learn] One-hot encoding
Sarah Wait Zaranek
sarah.zaranek at gmail.com
Sun Feb 4 23:10:55 EST 2018
I was wondering whether there is a way to improve the performance of the
one-hot encoder, or whether there are any plans to do so in the future. I am
working with a matrix that will ultimately have 20 million categorical
variables, and my bottleneck is the one-hot encoder.
Let me know if this isn't the right place to ask. My code using the encoder
is very simple, but I have pasted it here for completeness.
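from sklearn.preprocessing import OneHotEncoder
# tiledata is the categorical matrix described above
enc = OneHotEncoder(sparse=True)
Xtrain = enc.fit_transform(tiledata)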
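In case it helps anyone reproduce this, here is a minimal self-contained sketch of the same call. The array `fake_tiledata` and its shape and cardinality are made up for illustration and are far smaller than my real data. It relies on the encoder's default sparse output rather than passing the keyword explicitly, since I assume the default is sparse.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in for `tiledata` (hypothetical shape and cardinality).
rng = np.random.RandomState(0)
n_rows, n_cols, n_levels = 1000, 5, 10
fake_tiledata = rng.randint(0, n_levels, size=(n_rows, n_cols))

# Sparse output is assumed to be the encoder's default, so no keyword is passed.
enc = OneHotEncoder()
Xtrain = enc.fit_transform(fake_tiledata)

# Each row should have exactly one active column per input feature.
print(Xtrain.shape, Xtrain.nnz)
```

Scaling `n_rows`, `n_cols`, and `n_levels` up should show where the time goes.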