[scikit-learn] Can I evaluate clustering efficiency incrementally?

lampahome pahome.chen at mirlab.org
Mon May 13 22:10:22 EDT 2019


Uri Goren <ugoren at gmail.com> wrote on Fri, May 3, 2019 at 7:29 PM:

> I usually use clustering to save costs on labelling.
> I like to apply hierarchical clustering, and then label a small sample and
> fine-tune the clustering algorithm.
>
> That way, you can evaluate the effectiveness in terms of cluster purity
> (how many clusters contain mixed labels)
>
> See an example with sklearn here:
> https://youtu.be/GM8L324MuHc?list=PLqkckaeDLF4IDdKltyBwx8jLaz5nwDPQU
>

But if my dataset is too large to load into memory, will it work?
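
For reference, a minimal sketch of one way this might be done out of core. It swaps in MiniBatchKMeans, which supports partial_fit, for the hierarchical clustering mentioned above, since AgglomerativeClustering has no partial_fit. The data is synthetic, iter_chunks is a hypothetical stand-in for reading chunks from disk, and purity here is the usual majority-label fraction measured on a small labelled sample, which may differ from the exact measure used in the video.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.datasets import make_blobs


def iter_chunks(X, chunk_size):
    """Hypothetical stand-in for reading a large dataset from disk in chunks."""
    for start in range(0, len(X), chunk_size):
        yield X[start:start + chunk_size]


def purity(cluster_ids, true_labels):
    """Fraction of labelled points that carry their cluster's majority label."""
    cluster_ids = np.asarray(cluster_ids)
    true_labels = np.asarray(true_labels)
    majority_total = 0
    for c in np.unique(cluster_ids):
        _, counts = np.unique(true_labels[cluster_ids == c], return_counts=True)
        majority_total += counts.max()
    return majority_total / len(true_labels)


# Synthetic, well-separated data (an assumption made only for this sketch).
X, y = make_blobs(
    n_samples=3000,
    centers=[[-10, -10], [0, 0], [10, 10]],
    cluster_std=0.5,
    random_state=0,
)

# Fit incrementally: only one chunk needs to be in memory at a time.
model = MiniBatchKMeans(n_clusters=3, n_init=3, random_state=0)
for chunk in iter_chunks(X, chunk_size=500):
    model.partial_fit(chunk)

# Label only a small random sample and measure purity on it.
rng = np.random.default_rng(0)
sample = rng.choice(len(X), size=200, replace=False)
score = purity(model.predict(X[sample]), y[sample])
print(f"purity on labelled sample: {score:.3f}")
```

With real data, the labelled sample would come from manual labelling rather than from y, and the purity could be recomputed after each partial_fit call to track it as more chunks arrive.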