Getting the indexes of the data points after clustering using Kmeans
Hi,
I have applied KMeans clustering using scikit-learn:
kmeans = KMeans(max_iter=4, n_clusters=10, n_init=10).fit(euclidean_dist)
After fitting, I would like to get the data points in each cluster so that I can apply a further model to them.
Example: kmeans.cluster_centers_[1]
gives me a distance array over all the data points.
Is there a way in scikit-learn to get the ids/indexes of the data points in each cluster?
Regards
In reply to prince gosavi <princegosavi12@gmail.com> (Wed, 21 Feb 2018, 11:16):
Hi,
if you have your original points stored in a NumPy array, you can get all points from cluster i as follows:
cluster_points = points[kmeans.labels_ == i]
"kmeans.labels_" contains the cluster label of each point. "kmeans.labels_ == i" creates a boolean mask that selects only the points belonging to cluster i, and indexing points with that mask returns those points.
BTW: the fit method takes the raw points as its input, not a distance matrix.
Regards, Christian
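Since the original question asked for the ids/indexes rather than the points themselves, here is a minimal sketch of how the same mask can yield both. It assumes the data is a NumPy array; the random `points` array, the `random_state` value, and the names `cluster_indices` and `indices_by_cluster` are made up for illustration and are not from the thread. `max_iter` is left at its default here.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical data: 200 random 2-D points standing in for the original feature matrix.
rng = np.random.default_rng(0)
points = rng.normal(size=(200, 2))

# Fit on the raw points (one row per sample), not on a precomputed distance matrix.
kmeans = KMeans(n_clusters=10, n_init=10, random_state=0).fit(points)

i = 1                                    # cluster of interest
mask = kmeans.labels_ == i               # boolean mask, one entry per point
cluster_indices = np.flatnonzero(mask)   # row indexes of the points in cluster i
cluster_points = points[mask]            # the points themselves

# Row indexes for every cluster, keyed by cluster label.
indices_by_cluster = {
    c: np.flatnonzero(kmeans.labels_ == c) for c in range(kmeans.n_clusters)
}
```

The indexes refer to rows of the array passed to fit, so they can be used to look up the matching rows in any other array or table kept in the same order.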
_______________________________________________ scikit-learn mailing list scikit-learn@python.org https://mail.python.org/mailman/listinfo/scikit-learn
In reply to Christian Braune <christian.braune79@gmail.com> (Wed, Feb 21, 2018 at 4:28 PM):
Hi,
Thanks for your hint. It just saved my day.
Regards, Rajkumar
-- Regards
Participants (2): Christian Braune, prince gosavi