[scikit-learn] Adding Quickshift clustering algorithm

Heinrich Jiang heinrich.jiang at gmail.com
Mon Oct 8 00:02:37 EDT 2018

I'm a researcher at Google Research, and I am writing to start a discussion
about adding Quickshift, along with a variant of it, to scikit-learn's set
of clustering algorithms.

This fairly recent algorithm was designed as a faster alternative to Mean
Shift and has been used extensively in computer vision (it is already part
of scikit-image). The method was published independently in two papers
[1,2]. [1] has 600 citations and [2] has 1300 citations.

[1] Vedaldi, Andrea, and Stefano Soatto. "Quick shift and kernel methods
for mode seeking." *European Conference on Computer Vision*. Springer,
Berlin, Heidelberg, 2008.
[2] Rodriguez, Alex, and Alessandro Laio. "Clustering by fast search and
find of density peaks." *Science* 344.6191 (2014): 1492-1496.

In addition to Quickshift, I also propose a variant called Quickshift++,
which is Quickshift with an additional hyperparameter. We showed in [3],
published at ICML 2018, that it substantially improves performance over
Quickshift, as well as over other clustering algorithms implemented in
sklearn, on benchmark datasets (see, e.g., Figure 9 in
https://arxiv.org/abs/1805.07909).

[3] Jiang, Heinrich, Jennifer Jang, and Samory Kpotufe. "Quickshift++:
Provably Good Initializations for Sample-Based Mean Shift." *ICML*, 2018.

We have an implementation here (https://github.com/google/quickshift).
