[scikit-learn] Any recommend way to encode IP address?

Chris Aridas chris at aridas.eu
Fri Aug 16 03:54:27 EDT 2019


Apart from encoding, you could use feature engineering: derive attributes
such as country and city from each IP. Two IPs might have the same country
but different cities, so you could mix and match whichever attributes you
want.
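
To make the idea concrete, here is a minimal sketch using only the standard
library. Country and city lookups need an external geolocation database,
which is not assumed here, so the sketch uses network prefixes as a stand-in
hierarchy (a /16 playing the role of "country", a /24 the role of "city").
The function name ip_features and the feature names are hypothetical, and the
sketch only handles IPv4.

```python
import ipaddress


def ip_features(ip_string):
    """Turn one IPv4 address string into a dict of simple numeric features.

    Hypothetical helper: the feature names are illustrative only.
    """
    ip = ipaddress.ip_address(ip_string)
    if ip.version != 4:
        raise ValueError("this sketch only handles IPv4 addresses")

    # ip.packed is the 4-byte big-endian form; iterating bytes yields ints.
    octets = list(ip.packed)
    as_int = int(ip)

    return {
        "octet_1": octets[0],
        "octet_2": octets[1],
        "octet_3": octets[2],
        "octet_4": octets[3],
        # Coarse and fine network prefixes, standing in for country / city.
        "prefix_16": as_int >> 16,
        "prefix_24": as_int >> 8,
        # 1 if the address is in a private range (e.g. 192.168.0.0/16).
        "is_private": int(ip.is_private),
    }


a = ip_features("192.168.1.10")
b = ip_features("192.168.2.10")
# Same coarse prefix, different fine prefix.
print(a["prefix_16"] == b["prefix_16"], a["prefix_24"] == b["prefix_24"])
```

Each IP becomes a handful of numeric columns instead of one column per
distinct address, and related IPs share prefix values. If a geolocation
lookup is available, its country and city fields could be added to the same
dict.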


On Fri, Aug 16, 2019 at 10:46 AM lampahome <pahome.chen at mirlab.org> wrote:

> I have collected data containing many access logs from different IPs.
> But I don't know the best way to encode the IPs so that the training data
> stays small while different IPs remain independent.
> 1. One-hot encoding: with too many IPs, the training data will take up a
> huge amount of disk space.
> 2. Categorical encoding: IPs are encoded as 0~N, but that can't show the
> relation between different IPs.
> Does anyone have advice?