Any way to roughly pre-calculate the number of clusters?
I see many methods, such as the elbow method and the silhouette score, but they all determine the number of clusters after clustering. With the elbow method in particular, I need to track how the result changes with the number of clusters and find the elbow. But what if the dataset is too large for me to find the elbow, and I don't even know roughly how many clusters there actually are? Is there any way to roughly pre-calculate the number of clusters? thx
A common rule of thumb is number of clusters = sqrt(number of items / 2):
http://www.ijarcsms.com/docs/paper/volume1/issue6/V1I6-0015.pdf
On Wed, 26 Jun 2019 at 12:32, lampahome <pahome.chen@mirlab.org> wrote:
> I see many methods, such as the elbow method and the silhouette score, but they all determine the number of clusters after clustering.
> With the elbow method in particular, I need to track how the result changes with the number of clusters and find the elbow.
> But what if the dataset is too large for me to find the elbow, and I don't even know roughly how many clusters there actually are?
> Is there any way to roughly pre-calculate the number of clusters?
> thx
> _______________________________________________
> scikit-learn mailing list
> scikit-learn@python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
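The sqrt(number of items / 2) rule of thumb quoted in this thread can be sketched in a few lines. The function name `rule_of_thumb_k` is made up for illustration, and the result is only a rough starting guess for k, not a substitute for checking the clustering afterwards.

```python
import math


def rule_of_thumb_k(n_items: int) -> int:
    """Rough starting guess for the number of clusters: sqrt(n / 2).

    Hypothetical helper, not part of scikit-learn.
    """
    if n_items < 2:
        # With fewer than two items there is nothing to split.
        return 1
    return max(1, round(math.sqrt(n_items / 2)))


# The guess grows slowly with the dataset size.
for n in (200, 10_000, 1_000_000):
    print(n, rule_of_thumb_k(n))
```

The value can be passed as `n_clusters` to an estimator such as `KMeans` or `MiniBatchKMeans`. Because it depends only on the number of items and not on how the data is distributed, it may well overshoot or undershoot the real structure.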
On Wed, 26 Jun 2019 at 23:02, Jamie Bull <jamie.bull@oco-carbon.com> wrote:
> A common rule of thumb is number of clusters = sqrt(number of items / 2)
> http://www.ijarcsms.com/docs/paper/volume1/issue6/V1I6-0015.pdf
If I find that this number is too large, how should I merge the groups? Should I calculate the silhouette score of each group, or do something else?
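One possible way to merge groups, offered as a sketch rather than as something proposed in this thread: over-cluster with the rule-of-thumb k, cluster the resulting centroids hierarchically, and use the silhouette score to compare candidate merged counts. The synthetic `make_blobs` data, the candidate range of 2 to 8, and all variable names are assumptions made for illustration.

```python
from sklearn.cluster import AgglomerativeClustering, MiniBatchKMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in data (assumption): 2,000 points around 4 centers.
X, _ = make_blobs(n_samples=2000, centers=4, cluster_std=0.6, random_state=0)

# Step 1: deliberately over-cluster, using the sqrt(n / 2) guess as k.
k_initial = round((len(X) / 2) ** 0.5)
km = MiniBatchKMeans(n_clusters=k_initial, n_init=3, random_state=0).fit(X)

# Step 2: merge the small clusters by clustering their centroids
# hierarchically, then score each candidate merged count.
scores = {}
for k_merged in range(2, 9):
    centroid_groups = AgglomerativeClustering(n_clusters=k_merged).fit_predict(
        km.cluster_centers_
    )
    # Map every point from its original cluster to its merged group.
    merged_labels = centroid_groups[km.labels_]
    # sample_size keeps the silhouette computation cheap on large data.
    scores[k_merged] = silhouette_score(
        X, merged_labels, sample_size=1000, random_state=0
    )

best_k = max(scores, key=scores.get)
print(best_k, scores[best_k])
```

Only the centroids are clustered hierarchically, so the merge step stays cheap even when the dataset itself is large. The silhouette score is computed on a random sample for the same reason.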