[scikit-learn] A basic question about kmeans algorithms elkan and lloyd
Andreas Mueller
t3kcit at gmail.com
Fri Mar 27 12:36:52 EDT 2020
There's an interesting analysis in this paper:
Fast K-Means with Accurate Bounds
http://proceedings.mlr.press/v48/newling16.pdf
On 3/26/20 3:40 AM, Alexandre Gramfort wrote:
> hi,
>
> I suspect Elkan really wins when you have many centroids,
> so it is not systematically the faster choice
>
> my 2c
> Alex
>
>
> On Thu, Mar 26, 2020 at 3:18 AM MC_George123 at hotmail.com wrote:
>
> Hi admins,
>
> My team is currently working on optimizing scikit-learn. For kmeans,
> I see there are two algorithms, lloyd and elkan, where elkan is an
> optimized variant of lloyd that uses the triangle inequality. In
> older versions of scikit-learn, elkan supported only dense datasets,
> not sparse ones. In the latest version, elkan supports both types of
> datasets. So my question is why both algorithms are kept in kmeans,
> since they do almost the same thing and elkan is the optimized
> version of lloyd. Is there any difference in precision between the
> two algorithms, and how can I decide which one to use?
>
> Best regards,
>
> George Fan
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn