[scikit-learn] Silhouette example - performance issue

Anaël Bonneton anael.bonneton at gmail.com
Fri Oct 14 09:27:01 EDT 2016


Hi,

In the silhouette example (
http://scikit-learn.org/stable/auto_examples/cluster/plot_kmeans_silhouette_analysis.html#sphx-glr-auto-examples-cluster-plot-kmeans-silhouette-analysis-py),
the silhouette value of each sample is computed twice: once with
*silhouette_score* and once with *silhouette_samples*. The call to
*silhouette_score* can easily be avoided by taking the mean of the values
returned by *silhouette_samples*.
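
Here is a minimal sketch of what I mean. The data and the KMeans parameters
are only illustrative assumptions, not copied from the example, and the
equivalence assumes *silhouette_score* is called without its sample_size
argument (with sample_size set, the score is computed on a random subset):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_samples, silhouette_score

# Illustrative data and clustering (assumed values, not those of the example).
X, _ = make_blobs(n_samples=500, n_features=2, centers=4, random_state=1)
labels = KMeans(n_clusters=4, n_init=10, random_state=10).fit_predict(X)

# Per-sample silhouette values, needed anyway for the silhouette plot.
sample_values = silhouette_samples(X, labels)

# Average obtained from the per-sample values, with no second pass over X.
average_from_samples = float(np.mean(sample_values))

# Average obtained from the separate call that could be dropped.
average_from_score = float(silhouette_score(X, labels))

print(np.isclose(average_from_samples, average_from_score))
```

With this change the pairwise distances are computed once per value of
n_clusters instead of twice.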

Do you think we should remove the call to *silhouette_score* to improve
performance? Or is it better to keep both functions to show how to use
them?

Anaël Bonneton