[scikit-learn] Contribution - Markov Clustering

Olivier Grisel olivier.grisel at ensta.org
Tue Jul 11 17:04:14 EDT 2017


If this is your first contribution, please make sure to read the
contributors guide carefully, all the way to the end:

http://scikit-learn.org/stable/developers/contributing.html

In particular, make sure to follow the estimator API conventions so
that your PR has a chance of being reviewed. As it stands, the gist you
linked to is not compatible with the scikit-learn estimator API.
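
To give an idea of what the conventions look like in practice, here is a
minimal sketch. It is not the code from the gist and not a reviewed
implementation: the class name MarkovClusteringSketch, its parameters
and the label extraction via connected components are all assumptions
made for illustration, based on a simplified reading of the algorithm
(expansion, inflation, column normalization). X is assumed to be a
square, non-negative affinity matrix.

```python
import numpy as np
from scipy.sparse.csgraph import connected_components
from sklearn.base import BaseEstimator, ClusterMixin


class MarkovClusteringSketch(ClusterMixin, BaseEstimator):
    """Hypothetical estimator, for illustration only.

    X is assumed to be a square, non-negative affinity matrix.
    """

    def __init__(self, expansion=2, inflation=2.0, max_iter=100, tol=1e-6):
        # __init__ only stores hyperparameters: no validation, no computation.
        self.expansion = expansion
        self.inflation = inflation
        self.max_iter = max_iter
        self.tol = tol

    def fit(self, X, y=None):
        A = np.asarray(X, dtype=float)
        if A.ndim != 2 or A.shape[0] != A.shape[1]:
            raise ValueError("X must be a square affinity matrix")
        # Add self-loops, then normalize so that each column sums to 1.
        M = A + np.eye(A.shape[0])
        M /= M.sum(axis=0, keepdims=True)
        n_iter = 0
        for n_iter in range(1, self.max_iter + 1):
            previous = M
            M = np.linalg.matrix_power(M, self.expansion)  # expansion step
            M = M ** self.inflation  # inflation step (element-wise power)
            M /= M.sum(axis=0, keepdims=True)
            if np.abs(M - previous).max() < self.tol:
                break
        # Assumption: nodes linked by a surviving entry share a cluster.
        graph = (M > self.tol).astype(float)
        _, labels = connected_components(graph, directed=False)
        # Attributes learned in fit end with a trailing underscore.
        self.labels_ = labels
        self.n_iter_ = n_iter
        return self  # fit always returns self


# Two disconnected triangles should end up in two clusters.
triangle = np.ones((3, 3)) - np.eye(3)
A = np.block([[triangle, np.zeros((3, 3))], [np.zeros((3, 3)), triangle]])
labels = MarkovClusteringSketch().fit_predict(A)  # provided by ClusterMixin
```

The points that matter for review are the structural ones: the
constructor only stores its arguments, fit returns self, learned state
lives in attributes with a trailing underscore, and fit_predict and
get_params come for free from the base classes.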

Personally, I have never heard of Markov clustering, so it's hard for
me to assess whether it should be included in the project or not. It
would really help if you could demonstrate its performance on a
publicly available dataset where it does significantly better than all
the other clustering algorithms already implemented in scikit-learn
(both in terms of training speed and in terms of cluster quality /
stability, although this latter point is very domain dependent).
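
A comparison of that kind could start from something like the sketch
below. The dataset (iris), the three baseline estimators, their
parameter values and the quality metric (adjusted Rand index against
the known classes) are my own choices for illustration. Iris is far too
small for the timings to mean anything, so treat this only as the shape
of the script; a convincing benchmark needs a larger dataset from the
domain where the new algorithm is expected to shine, and the new
estimator added to the dictionary.

```python
import time

from sklearn.cluster import DBSCAN, AgglomerativeClustering, KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import adjusted_rand_score

# Small dataset bundled with scikit-learn, so nothing is downloaded.
X, y_true = load_iris(return_X_y=True)

# Baselines already in scikit-learn; parameter values are illustrative.
candidates = {
    "KMeans": KMeans(n_clusters=3, n_init=10, random_state=0),
    "AgglomerativeClustering": AgglomerativeClustering(n_clusters=3),
    "DBSCAN": DBSCAN(eps=0.5, min_samples=5),
}

results = {}
for name, estimator in candidates.items():
    start = time.perf_counter()
    labels = estimator.fit_predict(X)
    elapsed = time.perf_counter() - start
    # Store (fit time in seconds, agreement with the true classes).
    results[name] = (elapsed, adjusted_rand_score(y_true, labels))

for name, (elapsed, ari) in results.items():
    print(f"{name:<25s} fit time: {elapsed:.4f}s  ARI: {ari:.3f}")
```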

As a side note, if this is the first time you contribute to the
project, it's probably best to have a look at how other pull requests
are being reviewed (by reading the comment threads of other PRs) and
maybe start with a small pull request that fixes a small bug (with a
non-regression test) or tackles some documentation issues. Adding new
estimators takes a lot of effort to review (we need tests, docs,
updated examples) and assumes some familiarity with the existing code
base.

-- 
Olivier

