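A common rule of thumb is k = sqrt(n / 2), where k is the number of clusters and n is the number of items. It only gives a rough starting point, but it needs no clustering run beforehand:
http://www.ijarcsms.com/docs/paper/volume1/issue6/V1I6-0015.pdf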
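
For what it's worth, here is a minimal sketch of that heuristic in Python. The function name rule_of_thumb_k is made up for illustration, and rounding to the nearest integer with a floor of 1 is my own assumption; the rule itself only says sqrt(n / 2).

```python
import math


def rule_of_thumb_k(n_items: int) -> int:
    """Rough cluster count from the sqrt(n / 2) heuristic (hypothetical helper)."""
    if n_items < 1:
        raise ValueError("n_items must be a positive integer")
    # Assumption: round to the nearest integer and never go below one cluster.
    return max(1, round(math.sqrt(n_items / 2)))


if __name__ == "__main__":
    for n in (200, 10_000, 1_000_000):
        print(n, rule_of_thumb_k(n))
```

The result could be passed as n_clusters to sklearn.cluster.KMeans or MiniBatchKMeans as a first guess, and then refined with the elbow or silhouette method on a random subsample if the full dataset is too large.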

On Wed, 26 Jun 2019 at 12:32, lampahome <pahome.chen@mirlab.org> wrote:
I see many methods, such as the elbow method and the silhouette score, but they all determine the number of clusters only after clustering.

With the elbow method in particular, I need to track how the result changes with the number of clusters and find the elbow.

But my dataset is too large for me to search for the elbow, and I don't even know how many clusters there actually are.

Is there any way to roughly estimate the number of clusters in advance?

thx
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn