Yes, it is an efficient method. Still, we need to specify either the number of clusters or the threshold. Is there another way to run hierarchical clustering on a big dataset? The main problem is the distance matrix.
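
To make the two parameters concrete, here is a minimal sketch of the BIRCH usage in question. It assumes scikit-learn's `Birch` estimator; the synthetic `make_blobs` data, the chunk count, and the values `threshold=0.5` and `branching_factor=50` are illustrative assumptions, not taken from the real dataset.

```python
import numpy as np
from sklearn.cluster import Birch
from sklearn.datasets import make_blobs

# Synthetic stand-in for the large dataset (assumption for illustration only).
X, _ = make_blobs(n_samples=3000, centers=3, cluster_std=0.5, random_state=0)

# threshold bounds the radius of each subcluster in the CF tree.
# n_clusters=None skips the final global clustering step, so only the
# threshold has to be chosen and the raw subclusters are returned.
birch = Birch(threshold=0.5, branching_factor=50, n_clusters=None)

# Feed the data in chunks: no pairwise distance matrix is ever built.
for chunk in np.array_split(X, 10):
    birch.partial_fit(chunk)

labels = birch.predict(X)
n_subclusters = len(birch.subcluster_centers_)
print("subclusters found:", n_subclusters)
```

With `n_clusters=None` the number of subclusters falls out of the threshold, so one of the two values still has to be picked by hand.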
Thanks. 

On Tue, Jan 2, 2018 at 6:02 AM, Olivier Grisel <olivier.grisel@ensta.org> wrote:
Have you had a look at BIRCH?

http://scikit-learn.org/stable/modules/clustering.html#birch

--
Olivier
