Hello all, In KMeans clustering, there is a parameter n_init. The documentation says that the algorithm will run n_init times and output the best result. I wonder how to compare the output of each run. Can we get the score for each run? Thanks.
Yes, but what is used to decide the optimal output? I saw in the documentation that it is the best output in terms of inertia. What does that mean? Thanks.

On Wed, Feb 14, 2018 at 7:46 PM, Joel Nothman <joel.nothman@gmail.com> wrote:
> you can repeatedly use n_init=1?
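For readers of the archive, the suggestion above could be sketched roughly as follows. This is only an illustration, not code from the thread: the synthetic data from make_blobs, the choice of 4 clusters, and the seeds 0 to 4 are all assumptions. Each fit uses n_init=1, so the fitted model's inertia_ attribute is the score of that single run.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data, purely illustrative (an assumption, not from the thread).
X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

runs = []
for seed in range(5):
    # n_init=1: one initialization per fit; the seed varies the starting centroids.
    km = KMeans(n_clusters=4, n_init=1, random_state=seed).fit(X)
    runs.append((seed, km.inertia_))
    print(f"seed={seed}  inertia={km.inertia_:.2f}")

# Lower inertia is better, so the "best" run is the one with the minimum.
best_seed, best_inertia = min(runs, key=lambda r: r[1])
print(f"best run: seed={best_seed}, inertia={best_inertia:.2f}")
```

Picking the minimum by hand mirrors the criterion discussed in this thread for n_init > 1, where the run with the lowest inertia is kept.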
_______________________________________________ scikit-learn mailing list scikit-learn@python.org https://mail.python.org/mailman/listinfo/scikit-learn
Inertia simply means the sum of the squared distances from each sample point to its assigned cluster centroid. The smaller the inertia, the closer the cluster members are to their cluster centroid (that's also what KMeans minimizes when choosing centroids), so the run with the lowest inertia is the one that is kept.

For choosing the number of clusters K, the elbow method may be helpful (https://bl.ocks.org/rpgove/raw/0060ff3b656618e9136b/9aee23cc799d154520572b30...)

Maybe also take a look at the silhouette metric for choosing K: http://scikit-learn.org/stable/auto_examples/cluster/plot_kmeans_silhouette_...

Best,
Sebastian
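To make the definition concrete, here is a small worked sketch in plain Python. The helper name inertia, the four points, and the two sets of centers are hypothetical values chosen for easy arithmetic; they are not from the thread. For a fitted scikit-learn KMeans model, the same quantity is reported in the inertia_ attribute.

```python
def inertia(points, labels, centroids):
    """Sum of squared Euclidean distances from each point to its assigned centroid."""
    total = 0.0
    for point, label in zip(points, labels):
        centroid = centroids[label]
        total += sum((p - c) ** 2 for p, c in zip(point, centroid))
    return total


# Four 2-D points, two per cluster (hypothetical toy data).
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
labels = [0, 0, 1, 1]

# Centers placed at the cluster means: every point is at squared distance 1.
mean_centers = [(0.0, 1.0), (10.0, 1.0)]
# Centers shifted away from the means: squared distances are 9, 1, 9, 1.
shifted_centers = [(0.0, 3.0), (10.0, 3.0)]

mean_inertia = inertia(points, labels, mean_centers)        # 1 + 1 + 1 + 1 = 4.0
shifted_inertia = inertia(points, labels, shifted_centers)  # 9 + 1 + 9 + 1 = 20.0
print(mean_inertia, shifted_inertia)
```

The centers at the cluster means give the smaller value, which is why a lower inertia indicates tighter clusters for the same data and the same K.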
On Feb 20, 2018, at 5:14 PM, Shiheng Duan <shiduan@ucdavis.edu> wrote:
> Yes, but what is used to decide the optimal output? I saw on the document, it is the best output in terms of inertia. What does that mean? Thanks.
>
> On Wed, Feb 14, 2018 at 7:46 PM, Joel Nothman <joel.nothman@gmail.com> wrote:
>> you can repeatedly use n_init=1?
participants (3)
- Joel Nothman
- Sebastian Raschka
- Shiheng Duan