[scikit-learn] Agglomerative clustering

Gael Varoquaux gael.varoquaux at normalesup.org
Mon Dec 10 00:58:52 EST 2018


> I want to impose an additional constraint. When 2 clusters are combined and the
> cost of combination is equal for multiple cluster pairs, I want to choose the
> pair for which the combined cluster has the least size.

> What is the cleanest and easiest way of achieving this?

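I don't think that the public API enables you to do that. So I think that
you are going to have to modify the code, and change the entries of the
cost heap (managed with heapq) so that they are ordered by a tuple of
"(distance, size)" rather than by distance alone. Tuples compare element
by element, so among pairs with equal distance the one with the smallest
combined size would be popped first.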
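
To illustrate the tie-breaking idea only, here is a minimal sketch in plain
Python. It is not scikit-learn's code: the function name merge_order and the
layout of the candidate tuples are hypothetical, chosen for the example. It
assumes each candidate merge is described by its distance, the sizes of the
two clusters, and their integer ids.

```python
import heapq


def merge_order(candidates):
    """Return cluster pairs in the order a (distance, size) heap pops them.

    candidates: iterable of (distance, size_a, size_b, a, b), where a and b
    are integer cluster ids. This layout is an assumption for the sketch,
    not the structure used inside scikit-learn.
    """
    heap = []
    for distance, size_a, size_b, a, b in candidates:
        # Tuples compare lexicographically: distance first, then the size
        # of the merged cluster, so equal distances are broken by size.
        heapq.heappush(heap, (distance, size_a + size_b, a, b))

    order = []
    while heap:
        _distance, _size, a, b = heapq.heappop(heap)
        order.append((a, b))
    return order


# Two pairs tie at distance 1.0; the pair giving the smaller merged
# cluster (sizes 1 + 2) comes out before the larger one (sizes 3 + 4).
candidates = [
    (1.0, 3, 4, 0, 1),
    (1.0, 1, 2, 2, 3),
    (0.5, 5, 5, 4, 5),
]
print(merge_order(candidates))
```

Note that the tie-break only applies when the distances are exactly equal as
floating-point values; nearly equal distances are still ordered by distance.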

Unfortunately, when doing this, you'll be on your own, as we cannot
provide support for modified code.

Cheers,

Gaël

