[scikit-learn] Question about Kmeans implementation in sklearn

Chris Aridas chris at aridas.eu
Mon Aug 5 14:40:15 EDT 2019


Hey Serafim,

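At this line
https://github.com/scikit-learn/scikit-learn/blob/1495f69242646d239d89a5713982946b8ffcf9d9/sklearn/cluster/k_means_.py#L303
a RandomState object is constructed from the integer you passed in, and it is
that object, not the integer itself, that is handed to the for loop you are
referring to. Because the same object is reused, its internal state advances
on every draw, so each of the n_init runs draws different random numbers,
while the overall result stays reproducible for a given seed.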
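
To make the difference concrete, here is a minimal sketch using only numpy,
not the actual scikit-learn code. The two helper functions are hypothetical
names of my own, and the single randint draw just stands in for whatever
randomness one k-means run consumes. The first helper builds the generator
once and reuses it, which is the pattern described above; the second rebuilds
it from the integer on every iteration, which is what would give identical
runs.

```python
import numpy as np


def draws_with_shared_generator(seed, n_init):
    # Build the generator once from the integer seed, then reuse the same
    # object on every iteration. Its internal state advances with each draw.
    rng = np.random.RandomState(seed)
    return [int(rng.randint(1_000_000)) for _ in range(n_init)]


def draws_with_reseeding(seed, n_init):
    # Hypothetical alternative: rebuild the generator from the integer on
    # every iteration, which restarts the random stream each time.
    return [
        int(np.random.RandomState(seed).randint(1_000_000))
        for _ in range(n_init)
    ]


shared = draws_with_shared_generator(0, 10)
reseeded = draws_with_reseeding(0, 10)

print("shared generator:", shared)    # values differ between iterations
print("re-seeded each time:", reseeded)  # the same value repeated
```

Calling draws_with_shared_generator(0, 10) a second time returns the same
list again, so a fixed seed still makes the whole sequence of runs
reproducible even though the individual runs differ.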

Cheers,
Chris

On Mon, 5 Aug 2019 20:58 serafim loukas, <seralouk at hotmail.com> wrote:

> Dear Sklearn community,
>
>
> I have a simple question concerning the implementation of the KMeans
> clustering algorithm.
> Two of its input arguments are `n_init` and `random_state`.
>
> Consider the case where `n_init=10` and `random_state=0`.
>
> By looking at the source code (
> https://github.com/scikit-learn/scikit-learn/blob/1495f69242646d239d89a5713982946b8ffcf9d9/sklearn/cluster/k_means_.py#L187),
> we have the following:
>
> for it in range(n_init):
>     # run a k-means once
>     labels, inertia, centers, n_iter_ = kmeans_single(
>         X, sample_weight, n_clusters, max_iter=max_iter, init=init,
>         verbose=verbose, precompute_distances=precompute_distances,
>         tol=tol, x_squared_norms=x_squared_norms,
>         random_state=random_state)
>
>
> My question is: why are the results not the same for all
> `n_init` iterations, given that `random_state` is fixed?
>
>
> Best,
> Makis