[scikit-learn] Any recommended way to encode IP addresses?
pahome.chen at mirlab.org
Fri Aug 16 03:45:42 EDT 2019
I have collected data containing many access-log entries from different IP
addresses, but I don't know the best way to encode the addresses so that the
training data stays small and the different IPs remain independent of each other.
1. One-hot encoding: if there are too many distinct IPs, the training data
will take up a huge amount of disk space.
2. Categorical (integer) encoding: each IP is encoded as an integer from 0 to N,
but this can't show the relation between different IPs.
Does anyone have any advice?
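For concreteness, here is a minimal sketch of the two options listed above, assuming scikit-learn's OneHotEncoder and OrdinalEncoder with their default settings. The IP addresses are made up for illustration (taken from the reserved documentation ranges) and are not from the actual logs.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# Made-up example addresses, shaped as a single feature column (n_samples, 1).
ips = np.array([["192.0.2.1"], ["198.51.100.7"], ["192.0.2.1"], ["203.0.113.9"]])

# Option 1: one-hot encoding gives one column per distinct IP.
# By default the result is a SciPy sparse matrix holding one nonzero per row.
onehot = OneHotEncoder().fit_transform(ips)
print(onehot.shape)  # (4, 3): 4 rows, 3 distinct IPs
print(onehot.nnz)    # 4 stored nonzero entries

# Option 2: integer (ordinal) encoding gives a single column of codes.
# The codes follow the sorted order of the strings, so the numeric order
# carries no meaning about how the IPs relate to each other.
codes = OrdinalEncoder().fit_transform(ips)
print(codes.ravel())  # [0. 1. 0. 2.]
```

With many distinct IPs the one-hot matrix gains one column per address, while the integer encoding stays at a single column but imposes an arbitrary ordering on the addresses.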