[scikit-learn] Any recommended way to encode IP addresses?
lampahome
pahome.chen at mirlab.org
Fri Aug 16 03:45:42 EDT 2019
I have collected data containing many access logs from different IPs.
But I don't know the best way to encode the IP addresses so that the
training data stays small and the different IPs remain independent.
1. One-hot encoding: with too many distinct IPs, the training data will
take up a huge amount of disk space.
2. Category encoding: each IP is encoded as an integer from 0 to N, but
the codes cannot show the relationship between different IPs.
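
To make the two options concrete, here is a minimal sketch in plain
Python. The sample IPs and all variable names are made up for
illustration; in practice scikit-learn's OneHotEncoder and OrdinalEncoder
would probably be used instead of hand-written loops.

```python
# Hypothetical sample of logged IPs (illustrative values only).
ips = ["10.0.0.1", "10.0.0.2", "192.168.1.7", "10.0.0.1"]

# Category (ordinal) encoding: each distinct IP gets one integer code.
categories = sorted(set(ips))
code_of = {ip: i for i, ip in enumerate(categories)}
ordinal = [code_of[ip] for ip in ips]

# One-hot encoding: one column per distinct IP, so the row width grows
# with the number of distinct IPs seen in the logs.
one_hot = [
    [1 if code_of[ip] == j else 0 for j in range(len(categories))]
    for ip in ips
]

print(ordinal)          # one integer per log line
print(len(one_hot[0]))  # columns needed = number of distinct IPs
```

The ordinal version stores a single number per log line, while the
one-hot version needs as many columns as there are distinct IPs, which is
where the size concern comes from.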
Does anyone have any advice?