[scikit-learn] Github project management tools

Joel Nothman joel.nothman at gmail.com
Fri Sep 16 01:14:44 EDT 2016


I think we're quite close to the intended users of Github. They just
started simple and, now that all these more feature-complete competitors
have appeared, are adding those features but haven't quite got it right
yet. I'm not convinced that it's the perfect tool (although I haven't seen
this threading problem; gmail seems to still be keeping one thread per
PR?), but its simplicity and familiarity/popularity are a great advantage
for handling new contributors. In terms of contributor familiarity, most of
the projects that we integrate with use the same: numpy, scipy, cython
(recently), pandas, matplotlib, ipython. While I appreciate that we are
somewhat arbitrarily supporting a near-monopoly, the case for moving away
from, or even wrapping, github seems poor to me.

Apart from distinguishing between possible bug, actual bug and other (which
are fairly static categories), classifying issues by status is too hard to
manage. What I'd like to suggest is that we choose a way to highlight
high-priority issues for the next release, either through the milestone
feature or the project feature. Other issues will still get attention by way
of random traffic, but we care less about the timing of their resolution.

(I'm sure there must be a way using the API to find issues linked to by PRs
or not, but I don't think that's available in the UI.)
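
As a rough illustration of the API route, here is a minimal sketch that
assumes the PR descriptions have already been fetched (for instance from the
GitHub REST API's pull request listing, which returns a "body" field per PR)
and that a link is simply a "#1234"-style mention in that text. The function
names are hypothetical, and links made only in comments or commit messages
would be missed.

```python
import re

# Matches GitHub-style issue references such as "#1234" or "Fixes #1234".
# The lookbehind skips things like "abc#5" and HTML entities like "&#39;".
ISSUE_REF = re.compile(r"(?<![\w&])#(\d+)\b")


def referenced_issue_numbers(pr_bodies):
    """Collect every issue number mentioned in any PR description."""
    referenced = set()
    for body in pr_bodies:
        # An empty PR description may come back as None rather than "".
        referenced.update(int(n) for n in ISSUE_REF.findall(body or ""))
    return referenced


def split_by_pr_link(issue_numbers, pr_bodies):
    """Partition issue numbers into (linked to by a PR, not linked)."""
    referenced = referenced_issue_numbers(pr_bodies)
    linked = sorted(n for n in issue_numbers if n in referenced)
    unlinked = sorted(n for n in issue_numbers if n not in referenced)
    return linked, unlinked


if __name__ == "__main__":
    # Made-up issue numbers and PR descriptions, for illustration only.
    bodies = ["Fixes #10, see also #12", None, "no references here"]
    print(split_by_pr_link([10, 11, 12], bodies))
```

The unlinked list is the one that would need a contributor; the linked list
would still have to be checked for stale PRs by hand.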

On 16 September 2016 at 09:35, Andreas Mueller <t3kcit at gmail.com> wrote:

> Hey Joel.
> Thanks for bringing this up. I have a really hard time keeping up with
> what's happening
> on the issue tracker and I have no idea how you manage.
>
> The current tags are certainly not always helpful. Also, they are rarely
> updated.
>
> I have been very frustrated by github. I used email to track all issues,
> but their new "upgrade"
> made that impossible as issues are no longer email threads - each review
> is its own thread.
>
> It might make sense to switch to something like reviewable or gerrit.
> These sit on top of github, and people can interact with them without
> using them.
> I haven't really worked with either, but heard only good things about them.
>
> Any way of prioritizing issues and putting them into the buckets that you
> listed would be a great step forward.
> That would require someone manually going through 470 PRs and 762 issues,
> though.
> I would be happy to do that if we actually stick to the system. A single
> person is not enough to keep the tags (or whatever we end up using)
> up to date, though.
>
> Your statuses only apply to PRs, too, and we need to have something
> similar for issues, which might have these statuses:
>
> * random idea / feature request
> * feature request with consensus to implement
> * possible bug
> * confirmed bug
> * feature request or bug with active PR
> * feature request or bug with stale PR
>
> One problem with these is that many feature requests never get any
> comments, and the same goes for PRs.
> Is a PR without comment waiting for review? Or in dispute?
> A PR could be reviewed but dispute could happen later, as we don't always
> agree on what to do.
>
> I agree that we should try to organize ourselves better. I'm doubtful the
> new github features will help.
> They certainly already have tremendously hindered me in keeping up in the
> couple of hours they've been online.
>
> There is still no way to mark a comment as addressed, and comments are
> still more or less randomly hidden
> (and links to them become dead). Both of these issues are fixed in the
> other review platforms.
>
> I don't think we are the intended users of github, though I'm not sure who
> is.
>
>
>
> On 09/15/2016 07:14 PM, Joel Nothman wrote:
>
> One of the biggest issues with scikit-learn as a project is managing its
> backlog of issues; another is release scheduling. Some of this cannot be
> fixed as long as our model of voluntary contribution (with a couple of
> important exceptions) does not change. However, it may be worth considering
> the new project management features in Github.
>
> At the moment we have the following management:
> * labels corresponding to type (bug, enhancement, new feat, question),
> scope (API, Build/CI, ?Large Scale, Documentation), difficulty (easy,
> moderate), status/scheduling (needs contributor, needs review, sprint).
> * PR status management with title prefixes [WIP], [MRG], [MRG+1], [MRG+2]
>
> Firstly, we might benefit from prefixing labels by category, e.g.
> difficulty:easy, so that complementary labels appear together.
>
> In truth, PRs have roughly these statuses:
> * WIP (not ready for review)
> * waiting for review
> * waiting for changes (with or without one of the following)
> * in dispute (i.e. fundamental doubts about the PR)
> * the above together with 1 or 2 "official" approvals
> * ready for merge (pending minor changes such as what's new documentation)
>
> New github features:
>
> * reviews with "approved" or "request changes". A list of approvers can be
> found in the merge/CI panel. We could replace the MRG+1 annotation with
> this and use it to track disputation too. I'm not sure how it works with
> changes that are added after approval. I think it would have avoided one
> improper merge by me... One downside is that there does not yet seem to be
> a way to search for PRs with a specified level of approval (while searching
> for "MRG+1" sort-of works).
> * Milestone prioritising: issues in a milestone, such as
> https://github.com/scikit-learn/scikit-learn/milestone/21, can be ranked
> with drag-and-drop. I think this could help with release scheduling as it
> would allow us to identify the top priorities for a release and see when
> enough of them are completed.
> * The Kanban-style workflow management of the new Projects tool
> https://github.com/scikit-learn/scikit-learn/projects is another way of
> managing status and, I think, priority, for a small set of related issues.
> This might be an alternative way of managing milestone scope, or of working
> towards big changes such as the one just completed for model selection, the
> proposed expansions to get_feature_names, estimator tags, or making
> utilities public/private...
>
> So with the goal of making it easier to track where attention is most
> needed, and when to move to release: What's worth trying?
>
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn