[scikit-learn] A basic question about kmeans algorithms elkan and lloyd

Alexandre Gramfort alexandre.gramfort at inria.fr
Thu Mar 26 03:40:15 EDT 2020


hi,

I suspect Elkan only really wins when you have many centroids,
so it is not systematically faster than Lloyd.
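
To make the intuition concrete, here is a rough sketch (plain Python, not
scikit-learn's implementation) of the triangle-inequality test that Elkan
builds on: if d(b, c) >= 2 * d(x, b) for the current best center b, then
center c cannot be closer to x than b, so d(x, c) need not be computed.
The function names, the random Gaussian data, and the choice of centers
below are assumptions made only for illustration. The full algorithm also
keeps upper and lower bounds across iterations, which this sketch omits.

```python
import math
import random


def assign_brute(points, centers):
    """Lloyd-style assignment: compute every point-to-center distance."""
    labels, n_dist = [], 0
    for x in points:
        best, best_d = 0, math.dist(x, centers[0])
        n_dist += 1
        for j in range(1, len(centers)):
            d = math.dist(x, centers[j])
            n_dist += 1
            if d < best_d:
                best, best_d = j, d
        labels.append(best)
    return labels, n_dist


def assign_pruned(points, centers):
    """Same assignment, skipping centers ruled out by the triangle inequality."""
    k = len(centers)
    # Center-to-center distances: O(k^2) extra work, paid once per step.
    cc = [[math.dist(a, b) for b in centers] for a in centers]
    labels, n_dist = [], 0
    for x in points:
        best, best_d = 0, math.dist(x, centers[0])
        n_dist += 1
        for j in range(1, k):
            # d(x, c_j) >= d(c_best, c_j) - d(x, c_best) >= best_d: skip c_j.
            if cc[best][j] >= 2.0 * best_d:
                continue
            d = math.dist(x, centers[j])
            n_dist += 1
            if d < best_d:
                best, best_d = j, d
        labels.append(best)
    return labels, n_dist


def compare(k, n_points=2000, dim=2, seed=0):
    """Run both assignments on hypothetical random data with k centers."""
    rng = random.Random(seed)
    points = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_points)]
    centers = rng.sample(points, k)  # illustrative initialization only
    labels_b, n_b = assign_brute(points, centers)
    labels_p, n_p = assign_pruned(points, centers)
    return labels_b == labels_p, n_b, n_p


if __name__ == "__main__":
    for k in (2, 8, 64):
        same, n_b, n_p = compare(k)
        print(f"k={k}: same labels={same}, "
              f"point-center distances brute={n_b}, pruned={n_p}")
```

The pruning only removes distance computations that cannot change the
result, so both functions should return the same labels. How many
computations it saves depends on the data and on the number of centers,
and the k^2 center-to-center table is extra cost, so timing both options
on your own data is the safest way to choose.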

my 2c
Alex


On Thu, Mar 26, 2020 at 3:18 AM MC_George123 at hotmail.com <MC_George123 at hotmail.com> wrote:

> Hi admins,
>
> My team is currently working on optimizing parts of scikit-learn. For
> kmeans, I see there are two algorithms, lloyd and elkan, where elkan is
> an optimized variant of lloyd that uses the triangle inequality. In older
> versions of scikit-learn, elkan supported only dense datasets, not sparse
> ones; in the latest version, it supports both types. So my question is why
> both algorithms are kept in kmeans, since they do almost the same thing
> and elkan is the optimized version of lloyd. Is there any difference in
> precision between the two algorithms, and how should I decide which one
> to use?
>
> Best regards,
>
> George Fan