[scikit-learn] Silhouette example - performance issue

Michael Eickenberg michael.eickenberg at gmail.com
Fri Oct 14 09:55:25 EDT 2016

Dear Anaël,

if you wish, you could add a line to the example verifying this
correspondence, e.g. by moving the print call from between the two
silhouette evaluations to after them, also computing the average of the
silhouette_samples result, and printing it in parentheses.

Probably not necessary though. A comment would also do. Or nothing :)
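
For illustration, a minimal sketch of such a check might look like the
following. It is not the code from the example itself: the make_blobs data,
the KMeans settings, and the variable names are assumptions chosen for the
sketch. It relies only on the fact that, when sample_size is left at its
default, silhouette_score is the mean of the per-sample values returned by
silhouette_samples.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_samples, silhouette_score

# Synthetic data and clustering settings are illustrative assumptions.
X, _ = make_blobs(n_samples=500, n_features=2, centers=4, random_state=1)
cluster_labels = KMeans(n_clusters=4, n_init=10, random_state=10).fit_predict(X)

# Per-sample silhouette values (one value per row of X).
sample_silhouette_values = silhouette_samples(X, cluster_labels)

# Average silhouette as reported by silhouette_score (no subsampling).
silhouette_avg = silhouette_score(X, cluster_labels)

# The same average, recomputed from the per-sample values.
mean_of_samples = sample_silhouette_values.mean()

print("The average silhouette_score is: %.6f (mean of silhouette_samples: %.6f)"
      % (silhouette_avg, mean_of_samples))
```

If performance mattered, the silhouette_score call could be dropped and
mean_of_samples used in its place, since both come from the same per-sample
computation.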


On Fri, Oct 14, 2016 at 3:38 PM, Raghav R V <ragvrv at gmail.com> wrote:

> On Fri, Oct 14, 2016 at 3:27 PM, Anaël Bonneton <anael.bonneton at gmail.com>
> wrote:
>> Hi,
>> In the silhouette example
>> (http://scikit-learn.org/stable/auto_examples/cluster/plot_kmeans_silhouette_analysis.html#sphx-glr-auto-examples-cluster-plot-kmeans-silhouette-analysis-py),
>> the silhouette values of each sample are computed twice: once with
>> *silhouette_score* and once with *silhouette_samples*. The call to
>> *silhouette_score* can easily be avoided by computing the average of the
>> result of *silhouette_samples*.
>> Do you think we should remove the call to *silhouette_score* to improve
>> performance? Or is it better to keep both functions to show how to
>> use them?
> Hi,
> When I wrote it, I intended it to be demonstrative of the two methods.
> Not sure if we should worry about performance issues there.
> --
> Raghav RV
> https://github.com/raghavrv