Github project management tools
One of the biggest issues with scikit-learn as a project is managing its backlog of issues; another is release scheduling. Some of this cannot be fixed as long as our model of voluntary contribution (with a couple of important exceptions) does not change. However, it may be worth considering the new project management features in GitHub.

At the moment we have the following management:

* labels corresponding to type (bug, enhancement, new feat, question), scope (API, Build/CI, Large Scale, Documentation), difficulty (easy, moderate), status/scheduling (needs contributor, needs review, sprint).
* PR status management with title prefixes [WIP], [MRG], [MRG+1], [MRG+2]

Firstly, we might benefit from prefixing labels by category, e.g. difficulty:easy, so that complementary labels appear together.

In truth, PRs have roughly these statuses:

* WIP (not ready for review)
* waiting for review
* waiting for changes (with or without one of the following)
* in dispute (i.e. fundamental doubts about the PR)
* the above together with 1 or 2 "official" approvals
* ready for merge (pending minor changes such as what's new documentation)

New GitHub features:

* Reviews with "approved" or "request changes". A list of approvers can be found in the merge/CI panel. We could replace the MRG+1 annotation with this and use it to track disputation too. I'm not sure how it works with changes that are added after approval. I think it would have avoided one improper merge by me... One downside is that there does not yet seem to be a way to search for PRs with a specified level of approval (while searching for "MRG+1" sort-of works).
* Milestone prioritising: issues in a milestone, such as https://github.com/scikit-learn/scikit-learn/milestone/21, can be ranked with drag-and-drop. I think this could help with release scheduling as it would allow us to identify the top priorities for a release and see when enough of them are completed.
* The Kanban-style workflow management of the new Projects tool https://github.com/scikit-learn/scikit-learn/projects is another way of managing status and, I think, priority, for a small set of related issues. This might be an alternative way of managing milestone scope, or of working towards big changes like the one just completed for model selection; like proposed expansions to get_feature_names; like estimator tags; making utilities public/private...

So with the goal of making it easier to track where attention is most needed, and when to move to release: What's worth trying?
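To make the two conventions in this message concrete (category-prefixed labels and the [WIP]/[MRG]/[MRG+N] title prefixes), here is a minimal Python sketch. It is only an illustration: the `LABEL_CATEGORIES` mapping, the category names, and the helper names `prefixed_label` and `pr_status` are hypothetical and not part of any scikit-learn or GitHub tooling.

```python
import re

# Hypothetical mapping from current label names to a category. The
# "category:label" scheme follows the suggestion above; the exact
# category names are an assumption made for this example.
LABEL_CATEGORIES = {
    "bug": "type",
    "enhancement": "type",
    "question": "type",
    "API": "scope",
    "Documentation": "scope",
    "easy": "difficulty",
    "moderate": "difficulty",
    "needs contributor": "status",
    "needs review": "status",
}


def prefixed_label(label):
    """Return 'category:label', or the label unchanged if uncategorised."""
    category = LABEL_CATEGORIES.get(label)
    return f"{category}:{label}" if category else label


# Matches a leading [WIP], [MRG], [MRG+1], [MRG+2], ... title prefix.
TITLE_PREFIX = re.compile(r"^\s*\[(?:(WIP)|(MRG)(?:\+(\d+))?)\]", re.IGNORECASE)


def pr_status(title):
    """Return (status, approvals) parsed from a PR title.

    status is 'WIP', 'MRG', or None when the title has no known prefix;
    approvals is the N in [MRG+N], or 0 when absent.
    """
    match = TITLE_PREFIX.match(title)
    if match is None:
        return None, 0
    status = (match.group(1) or match.group(2)).upper()
    approvals = int(match.group(3) or 0)
    return status, approvals
```

Sorting the prefixed names is what makes complementary labels appear together: `sorted(prefixed_label(l) for l in ["easy", "bug", "moderate", "API"])` groups the two difficulty labels next to each other.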
Hey Joel. Thanks for bringing this up. I have a really hard time keeping up with what's happening on the issue tracker and I have no idea how you manage.

The current tags are certainly not always helpful. Also, they are rarely updated.

I have been very frustrated by GitHub. I used email to track all issues, but their new "upgrade" made that impossible as issues are no longer email threads - each review is its own thread.

It might make sense to switch to something like reviewable or gerrit. These sit on top of GitHub, and people can interact with them without using them. I haven't really worked with either, but heard only good things about them.

Any way to prioritize issues and put them into the buckets that you listed would be a great step forward. That would require someone manually going through 470 PRs and 762 issues, though. I would be happy to do that if we actually stick to the system. A single person is not enough to keep the tags (or whatever we end up using) up to date, though.

Your statuses only apply to PRs, too, and we need to have something similar for issues, which have maybe these statuses:

* random idea / feature request
* feature request with consensus to implement
* possible bug
* confirmed bug
* feature request or bug with active PR
* feature request or bug with stale PR

One problem with these is that many feature requests never get any comments, and similarly for PRs. Is a PR without comment waiting for review? Or in dispute? A PR could be reviewed but dispute could happen later, as we don't always agree on what to do.

I agree that we should try to organize ourselves better. I'm doubtful the new GitHub features will help. They certainly already have tremendously hindered me in keeping up in the couple of hours they've been online.

There is still no way to mark a comment as addressed, and comments are still more or less randomly hidden (and links to them become dead). Both of these issues are fixed in the other review platforms.

I don't think we are the intended users of GitHub, though I'm not sure who is.

(In reply to Joel Nothman's message of 09/15/2016 07:14 PM.)
_______________________________________________ scikit-learn mailing list scikit-learn@python.org https://mail.python.org/mailman/listinfo/scikit-learn
I think we're quite close to the intended users of GitHub; they just started simple and, as all these more feature-complete competitors appear, are adding those features but haven't quite got it right yet. I'm not convinced that it's the perfect tool (although I haven't seen this threading problem; gmail seems to still be keeping one thread per PR?), but its simplicity and familiarity/popularity is a great advantage for handling new contributors. In terms of contributor familiarity, most of the projects that we integrate with use the same: numpy, scipy, cython (recently), pandas, matplotlib, ipython.

While I appreciate that we are somewhat arbitrarily supporting a near-monopoly, the case for moving away from, or even wrapping, GitHub seems poor to me.

Apart from distinguishing between possible bug, actual bug and other (which are fairly static categories), classifying issues by status is too hard to manage. What I'd like to suggest is that we choose a way to highlight high-priority issues for the next release, either through the milestone feature or the project feature. Other issues will still get attention by way of random traffic, but we care less about the timing of their resolution.

(I'm sure there must be a way using the API to find issues linked to by PRs or not, but I don't think that's available in the UI.)

(In reply to the message from Andreas Mueller <t3kcit@gmail.com> of 16 September 2016 at 09:35.)
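As a rough illustration of the "issues linked to by PRs or not" idea, the sketch below works on PR data that has already been fetched, for example from the GitHub REST API, whose pull request objects carry `number` and `body` fields. Fetching is omitted. It assumes links are expressed with GitHub's closing keywords ("Fixes #123" and similar); a PR that mentions an issue in any other way is missed. The helper names `linked_issues` and `issues_without_pr` are hypothetical.

```python
import re

# GitHub's closing keywords (close/closes/closed, fix/fixes/fixed,
# resolve/resolves/resolved), optionally followed by a colon, then "#N".
CLOSING_REF = re.compile(
    r"\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?):?\s+#(\d+)",
    re.IGNORECASE,
)


def linked_issues(pull_requests):
    """Map issue number -> sorted list of PR numbers that reference it.

    pull_requests is an iterable of dicts with 'number' and 'body' keys;
    'body' may be None, as it is for PRs without a description.
    """
    links = {}
    for pr in pull_requests:
        body = pr.get("body") or ""
        for issue in CLOSING_REF.findall(body):
            links.setdefault(int(issue), set()).add(pr["number"])
    return {issue: sorted(prs) for issue, prs in links.items()}


def issues_without_pr(issue_numbers, pull_requests):
    """Return the issue numbers that no PR body references, in input order."""
    linked = linked_issues(pull_requests)
    return [number for number in issue_numbers if number not in linked]
```

Matching on PR bodies only is a deliberate simplification; commit messages and review comments could carry the same keywords but are not inspected here.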
A form – with required, pre-defined fields – can help when people submit bugs, issues, or requests for new features. Perhaps creating an issue template for scikit-learn is a good first step.

https://help.github.com/articles/creating-an-issue-template-for-your-reposit...

Pull requests also have a template

https://help.github.com/articles/creating-a-pull-request-template-for-your-r...

I am not sure how these fit into the team’s review and release workflow.

If this doesn’t quite fit your needs, perhaps engaging Github Support will yield something interesting.

Dale Smith | Macy's Systems and Technology | IFS eCommerce | Data Science 770-658-5176 | 5985 State Bridge Road, Johns Creek, GA 30097 | dale.t.smith@macys.com

(In reply to Joel Nothman's message of Friday, September 16, 2016 1:15 AM.)
Scikit-learn’s GitHub repo already makes use of these templates. I think the issue is more a technical one arising from their latest “style” changes.
(In reply to the message from Dale T Smith <Dale.T.Smith@macys.com> of Sep 16, 2016, at 8:25 AM.)
> While I appreciate that we are somewhat arbitrarily supporting a near-monopoly, the case for moving away from, or even wrapping, github seems poor to me.

Yeah, I would say that moving away from GitHub probably involves too much hassle given the size of the project. Also, I don’t think there are any good alternatives besides BitBucket, which would also not be a good choice for such a big project due to its pricing structure (they do have a simple yet useful “priority” attribute for issues, though). Not sure, but it feels like GitHub is currently in a somewhat experimental stage regarding their web UI; it feels like they are changing a bit too much, too frequently. However, maybe (or hopefully) they'll address a few of the recent annoyances in the future in response to user feedback. Using a wrapper seems like a good idea right now, but who knows whether or not these wrappers will introduce changes as well in the near future.

> either through the milestone feature, the project feature

I think the milestone feature is pretty useful. I have seen this in several other projects (e.g., matplotlib). As a user and sometimes contributor, I think it would help with focussing on more important issues; I am sometimes a bit hesitant to submit/tackle pull requests or issues since I feel like they are somewhat distracting the core contributors from the more important stuff.

Best,
Sebastian
(Following up on the message from Sebastian Raschka <se.raschka@gmail.com> of Sep 16, 2016, at 9:11 AM.)
Scikit-learn’s GitHub repo already makes use of these templates. I think the issue is more a technical one arising from their latest “style” changes.
On Sep 16, 2016, at 8:25 AM, Dale T Smith <Dale.T.Smith@macys.com> wrote:
A form – with required, pre-defined fields – can help when people submit bugs, issues, or requests for new features. Perhaps creating an issue template for scikit-learn is a good first step.
https://help.github.com/articles/creating-an-issue-template-for-your-reposit...
Pull requests also have a template
https://help.github.com/articles/creating-a-pull-request-template-for-your-r...
I am not sure how these fit into the team’s review and release workflow.
If this doesn’t quite fit your needs, perhaps engaging Github Support will yield something interesting.
__________________________________________________________________________________________ Dale Smith | Macy's Systems and Technology | IFS eCommerce | Data Science 770-658-5176 | 5985 State Bridge Road, Johns Creek, GA 30097 | dale.t.smith@macys.com
From: scikit-learn [mailto:scikit-learn-bounces+dale.t.smith=macys.com@python.org] On Behalf Of Joel Nothman Sent: Friday, September 16, 2016 1:15 AM To: Scikit-learn user and developer mailing list Subject: Re: [scikit-learn] Github project management tools
I think we're quite close to the intended users of Github; they just started simple and, as all these more feature-complete competitors appear, are adding those features but haven't quite got it right yet. I'm not convinced that it's the perfect tool (although I haven't seen this threading problem; gmail seems to still be keeping one thread per PR?), but its simplicity and familiarity/popularity are a great advantage for handling new contributors. In terms of contributor familiarity, most of the projects that we integrate with use the same: numpy, scipy, cython (recently), pandas, matplotlib, ipython. While I appreciate that we are somewhat arbitrarily supporting a near-monopoly, the case for moving away from, or even wrapping, github seems poor to me.
Apart from distinguishing between possible bug, actual bug and other (which are fairly static categories), classifying issues by status is too hard to manage. What I'd like to suggest is that we choose a way to highlight high-priority issues for the next release, either through the milestone feature or the project feature. Other issues will still get attention by way of random traffic, but we care less about the timing of their resolution.
(I'm sure there must be a way using the API to find issues linked to by PRs or not, but I don't think that's available in the UI.)
On 16 September 2016 at 09:35, Andreas Mueller <t3kcit@gmail.com> wrote:
Hey Joel. Thanks for bringing this up. I have a really hard time keeping up with what's happening on the issue tracker and I have no idea how you manage.
The current tags are certainly not always helpful. Also, they are rarely updated.
I have been very frustrated by github. I used email to track all issues, but their new "upgrade" made that impossible as issues are no longer email threads - each review is its own thread.
It might make sense to switch to something like reviewable or gerrit. These sit on top of github, and people can interact with them without using them. I haven't really worked with either, but heard only good things about them.
Any way to prioritize issues and putting them into the buckets that you listed would be a great step forward. That would require someone manually going through 470 PRs and 762 issues, though. I would be happy to do that if we actually stick to the system. A single person is not enough to keep the tags (or whatever we end up using) up to date, though.
Your statuses only apply to PRs, too, and we need to have something similar for issues, which have maybe these statuses
* random idea / feature request
* feature request with consensus to implement
* possible bug
* confirmed bug
* feature request or bug with active PR
* feature request or bug with stale PR
One problem with these is that many feature requests never get any comments; the same goes for PRs. Is a PR without comment waiting for review? Or in dispute? A PR could be reviewed but dispute could happen later, as we don't always agree on what to do.
I agree that we should try to organize ourselves better. I'm doubtful the new github features will help. They certainly already have tremendously hindered me in keeping up in the couple of hours they've been online.
There is still no way to mark a comment as addressed, and comments are still more or less randomly hidden (and links to them become dead). Both of these issues are fixed in the other review platforms.
I don't think we are the intended users of github, though I'm not sure who is.
On 09/15/2016 07:14 PM, Joel Nothman wrote:
One of the biggest issues with scikit-learn as a project is managing its backlog of issues; another is release scheduling. Some of this cannot be fixed as long as our model of voluntary contribution (with a couple of important exceptions) does not change. However, it may be worth considering the new project management features in Github.
At the moment we have the following management:
* labels corresponding to type (bug, enhancement, new feat, question), scope (API, Build/CI, ?Large Scale, Documentation), difficulty (easy, moderate), status/scheduling (needs contributor, needs review, sprint).
* PR status management with title prefixes [WIP], [MRG], [MRG+1], [MRG+2]
Firstly, we might benefit from prefixing labels by category, e.g. difficulty:easy, so that complementary labels appear together.
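As a rough illustration of the prefixing idea, the sketch below groups labels by the part before the colon. The label names and the `category:name` convention are assumptions made up for this example, not an existing scikit-learn scheme.

```python
from collections import defaultdict


def group_labels(labels):
    """Group 'category:name' labels by category; unprefixed labels go under ''."""
    groups = defaultdict(list)
    for label in labels:
        category, sep, name = label.partition(":")
        if sep:
            groups[category.strip()].append(name.strip())
        else:
            groups[""].append(label.strip())
    # Sorting keeps the labels of one category next to each other.
    return {category: sorted(names) for category, names in sorted(groups.items())}


# Hypothetical label names following the proposed convention.
labels = ["difficulty:easy", "type:bug", "difficulty:moderate", "scope:API", "sprint"]
print(group_labels(labels))
```

With a convention like this, any alphabetical listing of labels would naturally place the difficulty labels together, the type labels together, and so on.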
In truth, PRs have roughly these statuses:
* WIP (not ready for review)
* waiting for review
* waiting for changes (with or without one of the following)
* in dispute (i.e. fundamental doubts about the PR)
* the above together with 1 or 2 "official" approvals
* ready for merge (pending minor changes such as what's new documentation)
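For reference, the title-prefix convention mentioned above ([WIP], [MRG], [MRG+1], [MRG+2]) is simple enough to parse mechanically. This is only a sketch; the function name and the tolerance for extra whitespace are assumptions, and it covers just the prefixes, not the finer statuses in the list.

```python
import re

# Matches the title prefixes [WIP], [MRG], [MRG+1], [MRG+2] at the start of a title.
PREFIX_RE = re.compile(r"^\s*\[(WIP|MRG)(?:\s*\+\s*(\d+))?\]", re.IGNORECASE)


def parse_pr_title(title):
    """Return (status, approvals) parsed from a PR title, or (None, 0)."""
    match = PREFIX_RE.match(title)
    if match is None:
        return None, 0
    status = match.group(1).upper()
    approvals = int(match.group(2)) if match.group(2) else 0
    return status, approvals


print(parse_pr_title("[MRG+1] Fix handling of sparse input"))
```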
New github features:
* reviews with "approved" or "request changes". A list of approvers can be found in the merge/CI panel. We could replace the MRG+1 annotation with this and use it to track disputation too. I'm not sure how it works with changes that are added after approval. I think it would have avoided one improper merge by me... One downside is that there does not yet seem to be a way to search for PRs with a specified level of approval (while searching for "MRG+1" sort-of works).
* Milestone prioritising: issues in a milestone, such as https://github.com/scikit-learn/scikit-learn/milestone/21, can be ranked with drag-and-drop. I think this could help with release scheduling as it would allow us to identify the top priorities for a release and see when enough of them are completed.
* The Kanban-style workflow management of the new Projects tool https://github.com/scikit-learn/scikit-learn/projects is another way of managing status and, I think, priority, for a small set of related issues. This might be an alternative way of managing milestone scope, or of working towards big changes like the one just completed for model selection; like the proposed expansion of get_feature_names; like estimator tags; making utilities public/private...
So with the goal of making it easier to track where attention is most needed, and when to move to release: What's worth trying?
_______________________________________________ scikit-learn mailing list scikit-learn@python.org https://mail.python.org/mailman/listinfo/scikit-learn
On 09/16/2016 01:14 AM, Joel Nothman wrote:
I think we're quite close to the intended users of Github; they just started simple and, as all these more feature-complete competitors appear, are adding those features but haven't quite got it right yet. I'm not convinced that it's the perfect tool (although I haven't seen this threading problem; gmail seems to still be keeping one thread per PR?), but its simplicity and familiarity/popularity are a great advantage for handling new contributors. In terms of contributor familiarity, most of the projects that we integrate with use the same: numpy, scipy, cython (recently), pandas, matplotlib, ipython. While I appreciate that we are somewhat arbitrarily supporting a near-monopoly, the case for moving away from, or even wrapping, github seems poor to me.
Actually, both of these services don't require everyone to be using them. The contributors could still be using github and get all the normal functionality. It would just give us a better way to track things.
Apart from distinguishing between possible bug, actual bug and other (which are fairly static categories), classifying issues by status is too hard to manage. What I'd like to suggest is that we choose a way to highlight high-priority issues for the next release, either through the milestone feature or the project feature. Other issues will still get attention by way of random traffic, but we care less about the timing of their resolution.
Yeah, actually updating statuses is a lot of work, as I found out with the "need contributor" tag. However, I think this might be the most helpful tag for people wanting to contribute. I'm happy with using the release tags more.
On Fri, Sep 16, 2016 at 09:14:12AM +1000, Joel Nothman wrote:
One downside is that there does not yet seem to be a way to search for PRs with a specified level of approval (while searching for "MRG+1" sort-of works).
Yes, I do that a lot. So this is not a great improvement for me. G
On 17 September 2016 at 01:21, Gael Varoquaux <gael.varoquaux@normalesup.org> wrote:
On Fri, Sep 16, 2016 at 09:14:12AM +1000, Joel Nothman wrote:
One downside is that there does not yet seem to be a way to search for PRs with a specified level of approval (while searching for "MRG+1" sort-of works).
Yes, I do that a lot. So this is not a great improvement for me.
A lot of the new features, including this, do not seem to have Github APIs (or at least documentation) yet. When we adopted title hacking, PRs could not receive labels. *Would labels be an improvement over title hacking for recording approval status?*
I think it would be worth trying to have a rough *priority ranking for things we'd like to see in 0.19*. However the Github Milestones feature is a bit crippled in UI: you can rank issues, but cannot filter by anything but open/closed, so for instance cannot see bugs and non-bugs separately. Perhaps Projects come to supersede that, although I think they work best for small-scale sprints rather than release-level milestones. And you cannot search sorted by milestone priority.
Apart from an interface for manual prioritising, I think we would benefit from *automatic labelling*:
* of issues to say when a PR mentioning the issue exists
* of PRs to say whether there's been 1 or 2 LGTMs by core devs
There are a number of issue labelling bots around -- https://github.com/botdylan/botdylan seems to be one of the more configurable -- but hosted solutions don't seem readily available.
Does anyone know of strong preferences for tracking + labelling bot solutions? waffle.io seems to go in this direction but is relatively inflexible.
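To make the second kind of automatic labelling concrete, here is a minimal sketch of counting LGTMs by core devs. It works on plain (author, body) pairs and does not call any GitHub API; the function names, the `approved:N` label names and the "comment contains LGTM" rule are all assumptions for illustration, not what any existing bot does.

```python
def count_core_approvals(comments, core_devs):
    """Count distinct core devs with at least one comment containing 'LGTM'."""
    approvers = {
        author
        for author, body in comments
        if author in core_devs and "lgtm" in body.lower()
    }
    return len(approvers)


def approval_label(comments, core_devs):
    """Map the approval count (capped at 2) to a hypothetical label name."""
    n = min(count_core_approvals(comments, core_devs), 2)
    return f"approved:{n}" if n else None


# Made-up comment data: only alice and bob are treated as core devs here.
comments = [("alice", "LGTM"), ("bob", "looks nice"), ("alice", "lgtm again"), ("carol", "LGTM!")]
print(approval_label(comments, core_devs={"alice", "bob"}))
```

A substring check like this would also count "not LGTM yet", so a real bot would likely need a stricter rule.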
On Mon, Sep 19, 2016 at 4:06 PM Joel Nothman <joel.nothman@gmail.com> wrote:
I think it would be worth trying to have a rough *priority ranking for things we'd like to see in 0.19*. However the Github Milestones feature is a bit crippled in UI: you can rank issues, but cannot filter by anything but open/closed, so for instance cannot see bugs and non-bugs separately. Perhaps Projects come to supersede that, although I think they work best for small-scale sprints rather than release-level milestones. And you cannot search sorted by milestone priority.
You can use the search text field to build complex queries like this:
https://github.com/scikit-learn/scikit-learn/issues?q=is%3Aissue+is%3Aopen+m...
I once had a nice set of example queries but seem to have lost it, so the best docs I can point to are:
https://help.github.com/articles/searching-issues/
HTH, T
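For anyone wanting to generate such links, a search URL can be assembled from qualifiers with the standard library. This is a sketch, not a reconstruction of the truncated link above; the milestone name and label used in the example are assumed values.

```python
from urllib.parse import urlencode

BASE = "https://github.com/scikit-learn/scikit-learn/issues"


def search_url(*qualifiers):
    """Build an issue-search URL from search qualifiers such as 'is:open'."""
    return BASE + "?" + urlencode({"q": " ".join(qualifiers)})


# The milestone and label values are illustrative assumptions.
print(search_url("is:issue", "is:open", 'milestone:"0.19"', "label:Bug"))
```

Qualifiers such as `is:`, `label:` and `milestone:` are described in the searching-issues help page linked above.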
Another bot-able tool might be pinging inactive PRs to ask if they're being worked on, and labelling "Needs contributor" if there's no reply within n days...!
Another bot-able tool might be pinging inactive PRs to ask if they're being worked on, and labelling "Needs contributor" if there's no reply within n days...!
If PRs are inactive, it might also be interesting to tag them as easy_fix when there is little to do.
On 09/19/2016 09:56 PM, Nelle Varoquaux wrote:
Another bot-able tool might be pinging inactive PRs to ask if they're being worked on, and labelling "Needs contributor" if there's no reply within n days...!
That kind of only works when the status is "waiting for changes", and not "waiting for reviews". I guess we could tag all old issues or use the new interface (though you said that's not scriptable yet?) So we would need to actually use the "needs reviews" tag and add a "waiting for changes" tag. And I guess the "waiting for changes" should be removed automatically when the author changed something and changed to "needs review"?
Is there an API to access the "fixes #ISSUE" thing for auto-closing? Just mentioning an issue doesn't mean it's a PR to solve the issue.
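Whether the API exposes that link is left open here. A rough alternative is to scan the PR description for GitHub's closing keywords (close, fix, resolve and their variants) followed by an issue number. The sketch below does only that text matching; the function name is made up, and it will miss references written in other forms, such as full issue URLs.

```python
import re

# GitHub's closing keywords, matched case-insensitively, followed by "#<number>".
CLOSING_RE = re.compile(
    r"\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s*:?\s+#(\d+)",
    re.IGNORECASE,
)


def closing_issue_numbers(pr_body):
    """Return the issue numbers that a PR description claims to close."""
    return sorted({int(n) for n in CLOSING_RE.findall(pr_body)})


print(closing_issue_numbers("Fixes #123 and closes #45. Also mentions #9."))
```

This distinguishes a PR that claims to solve an issue from one that merely mentions it, which is the distinction raised above.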
If PRs are inactive, it might also be interesting to tag them as easy_fix when there is little to do.
That's much harder to automate though. I know that I often misjudge the amount that is left to do in a PR, not sure if bots are better at that than humans yet. Are there bots with LSTM support yet? ;)
On 21 September 2016 at 22:13, Andreas Mueller <t3kcit@gmail.com> wrote:
If PRs are inactive, it might also be interesting to tag them as easy_fix when there is little to do.
That's much harder to automate though. I know that I often misjudge the amount that is left to do in a PR, not sure if bots are better at that than humans yet.
Bots wouldn't be able to do that, but I find that an hour now and then scrolling through old PRs works pretty well :)
So following up on this conversation, do we want to use status labels more consistently? And what should they be? Joel proposed for PRs:
* WIP (not ready for review)
* waiting for review [we have a tag for this]
* waiting for changes (with or without one of the following)
* in dispute (i.e. fundamental doubts about the PR)
* the above together with 1 or 2 "official" approvals
* ready for merge (pending minor changes such as what's new documentation)
We could at least add tags for "waiting for changes" and "in dispute", which are fairly clear categories.
For PRs we should probably add [bug - not confirmed] and [bug - confirmed]
Maybe something for "stalled" pull requests? e.g. if someone hasn't worked on their PR in say 30 days and it's tagged "waiting for changes", you could ping them and then put on the "stalled" label. If they don't respond in another 15 days / say they aren't working on it anymore, maybe it'd be good to change to "abandoned" or "need contributor" (and add "need contributor" to the linked issue, if applicable) to indicate that someone else can pick it up.
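The "stalled" rule suggested above (ping after 30 days of inactivity on a PR tagged "waiting for changes", then hand over after another 15 days of silence) could be sketched as a small pure function. The function name, its arguments and the choice to return None when nothing changes are assumptions for illustration; fetching the dates from GitHub is left out entirely.

```python
from datetime import date, timedelta

STALL_AFTER = timedelta(days=30)    # inactivity before pinging the author
ABANDON_AFTER = timedelta(days=15)  # further silence after the ping


def next_label(labels, last_activity, pinged_on, today):
    """Suggest a label for a PR, or None if nothing should change."""
    if "waiting for changes" not in labels:
        return None  # the rule only applies when the author owes changes
    if pinged_on is None:
        if today - last_activity >= STALL_AFTER:
            return "stalled"
        return None
    if last_activity > pinged_on:
        return None  # the author responded after the ping
    if today - pinged_on >= ABANDON_AFTER:
        return "need contributor"
    return "stalled"


print(next_label({"waiting for changes"}, date(2016, 8, 1), None, date(2016, 9, 28)))
```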
On 28 September 2016 at 10:09, Nelson Liu <nfliu@uw.edu> wrote:
Maybe something for "stalled" pull requests? e.g. if someone hasn't worked on their PR in say 30 days and it's tagged "waiting for changes", you could ping them and then put on the "stalled" label. If they don't respond in another 15 days / say they aren't working on it anymore, maybe it'd be good to change to "abandoned" or "need contributor" (and add "need contributor" to the linked issue, if applicable) to indicate that someone else can pick it up.
On Wed, Sep 28, 2016 at 10:01 AM, Andreas Mueller <t3kcit@gmail.com> wrote:
So following up on this conversation, do we want to use status labels more consistently? And what should they be? Joel Proposed for PRs:
* WIP (not ready for review) * waiting for review [we have a tag for this] * waiting for changes (with or without one of the following) * in dispute (i.e. fundamental doubts about the PR) * the above together with 1 or 2 "official" approvals * ready for merge (pending minor changes such as what's new documentation)
We could at least add tags for "waiting for changes" and "in dispute", which are fairly clear categories.
For PRs we should probably add [bug - not confirmed] and [bug - confirmed]
I think the only ones worth having are the ones that can be dealt with automatically and the ones that will not be used frequently:
- stalled after 30 days of inactivity [can be done automatically]
- in dispute [I don't expect it to be used often].
WIP, MRG, MRG+N in the title seem IMO a better way to do this. It is easier than doing it with tagging *and* the author of the PR can edit their own title (which is useful for the first two).
On matplotlib, we have tags for "need review" and "need change" and they are IMO useless. The "need review" tag is added automatically by a bot as soon as a PR is opened. It tags PRs that are WIP as needing review, and pretty much all of the PRs have it, as no one removes this tag. The "need change" tag is used very sparsely because no one ever bothers to add it (read: tagging through the UI is annoying).
On 09/28/2016 02:21 PM, Nelle Varoquaux wrote:
I think the only ones worth having are the ones that can be dealt with automatically and the ones that will not be used frequently:
- stalled after 30 days of inactivity [can be done automatically] - in dispute [I don't expect it to be used often].
I think "in dispute" is actually one of the most common statuses among PRs. Or maybe I have a skewed picture of things. Many PRs stalled because it is not clear whether the proposed solution is a good one.
It would be great to have some way to get through the backlog of 400 PRs and I think tagging them might be useful. We rarely reject PRs, we could also revisit that policy.
For the backlog, it's pretty unclear to me how many are waiting for reviews, how many are waiting for changes, and how many are disputed. Tagging these might help people who want to review to find things to review, and people who want to code to pick up stalled PRs.
On 28 September 2016 at 12:24, Andreas Mueller <t3kcit@gmail.com> wrote:
On 09/28/2016 02:21 PM, Nelle Varoquaux wrote:
I think the only ones worth having are the ones that can be dealt with automatically and the ones that will not be used frequently:
- stalled after 30 days of inactivity [can be done automatically] - in dispute [I don't expect it to be used often].
I think "in dispute" is actually one of the most common statuses among PRs. Or maybe I have a skewed picture of things. Many PRs stalled because it is not clear whether the proposed solution is a good one.
On the stalled ones, sure, but there are a lot of PRs being merged fairly quickly. So overall, I think it is quite rare. No?
It would be great to have some way to get through the backlog of 400 PRs and I think tagging them might be useful. We rarely reject PRs, we could also revisit that policy.
For the backlog, it's pretty unclear to me how many are waiting for reviews, how many are waiting for changes, and how many are disputed. Tagging these might help people who want to review to find things to review, and people who want to code to pick up stalled PRs.
That sounds like a great use of labels, though all of these need to be tagged manually.
So I made a project for 0.19:
https://github.com/scikit-learn/scikit-learn/projects/5
The idea would be to drag and drop issues and PRs so that the important ones are at the top. We could also add an "important" column, currently the scrolling is pretty annoying. Thoughts?
I agree that being able to identify which PRs are stalled on the contributor's part, which on reviewers' part, and since when, would be great. I'm not sure we've come up with a way that'll work though. In terms of backlog, I've wondered if just getting things into a spreadsheet would help: https://docs.google.com/spreadsheets/d/1LdzNxQbn7A0Ao8zlUBgnvT42929JpAe9958Y... What other features of an Issue / PR would be useful to sort/filter/pivottable on in a spreadsheet form like this? (It would be extra nice if we could modify titles and labels within the spreadsheet and have them update via the GitHub API, but I'm not sure I'll get around to making that feature :P) On 29 September 2016 at 23:45, Andreas Mueller <t3kcit@gmail.com> wrote:
So I made a project for 0.19:
https://github.com/scikit-learn/scikit-learn/projects/5
The idea would be to drag and drop issues and PRs so that the important ones are at the top. We could also add an "important" column, currently the scrolling is pretty annoying. Thoughts?
On 09/28/2016 03:29 PM, Nelle Varoquaux wrote:
On 28 September 2016 at 12:24, Andreas Mueller <t3kcit@gmail.com> wrote:
On 09/28/2016 02:21 PM, Nelle Varoquaux wrote:
I think the only ones worth having are the ones that can be dealt with automatically and the ones that will not be used frequently:
- stalled after 30 days of inactivity [can be done automatically] - in dispute [I don't expect it to be used often].
I think "in dispute" is actually one of the most common statuses among PRs. Or maybe I have a skewed picture of things. Many PRs stalled because it is not clear whether the proposed solution is a good one.
On the stalled one, sure, but there are a lot of PRs being merged fairly quickly. So over all, I think it is quite rare. No?
It would be great to have some way to get through the backlog of 400 PRs and I think tagging them might be useful. We rarely reject PRs, we could also revisit that policy.
For the backlog, it's pretty unclear to me how many are waiting for reviews, how many are waiting for changes, and how many are disputed. Tagging these might help people who want to review to find things to review, and people who want to code to pick up stalled PRs.
That sounds like a great use of labels, though all of these need to be tagged manually.
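As a rough illustration of the "stalled after 30 days of inactivity" rule discussed above, here is a minimal sketch in Python. It assumes each PR is available as a dict with an `updated_at` timestamp in the ISO 8601 UTC form that the GitHub API returns; the function names and the sample records are made up for illustration and are not an existing scikit-learn tool.

```python
from datetime import datetime, timedelta, timezone

# Threshold taken from the thread: 30 days without activity.
STALL_AFTER = timedelta(days=30)


def parse_github_time(stamp):
    """Parse a UTC timestamp such as '2016-09-28T18:21:00Z'."""
    parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def stalled_prs(prs, now=None):
    """Return the numbers of PRs with no activity for at least STALL_AFTER."""
    now = now or datetime.now(timezone.utc)
    return [
        pr["number"]
        for pr in prs
        if now - parse_github_time(pr["updated_at"]) >= STALL_AFTER
    ]


# Hypothetical sample records, not real scikit-learn PRs.
sample_prs = [
    {"number": 1, "updated_at": "2016-08-01T10:00:00Z"},
    {"number": 2, "updated_at": "2016-09-27T09:30:00Z"},
]
now = datetime(2016, 9, 28, tzinfo=timezone.utc)
print(stalled_prs(sample_prs, now))  # [1]
```

This only covers the status that can be derived mechanically. Whether a PR is waiting for review, waiting for changes, or in dispute would still need a manual label, as noted above.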
I hope this isn't out of place but I notice that https://github.com/scikit-learn/scikit-learn/pull/4899 is not in the list. It seems like a very worthwhile addition and the PR appears stalled at present.
Raphael
On 29 September 2016 at 15:05, Joel Nothman <joel.nothman@gmail.com> wrote:
My apologies, I see it is in the spreadsheet. It would be great to see this work finished for 0.19 if at all possible IMHO.
Raphael
On 29 September 2016 at 20:12, Raphael C <drraph@gmail.com> wrote:
The spreadsheet seems to have some duplications and presumably some missing rows, with apologies. I assume some is due to the github pagination, and some may be my error. Not a big enough error to fix up.
On 30 September 2016 at 05:15, Raphael C <drraph@gmail.com> wrote:
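On the duplicated spreadsheet rows: if pagination is indeed the cause, as guessed above, one simple workaround is to key the rows on the issue number while merging pages. The sketch below assumes each page is a list of dicts carrying a `number` field; `merge_pages` and the sample pages are hypothetical names for illustration, not the script that produced the spreadsheet.

```python
def merge_pages(pages):
    """Merge pages of issue records, keeping one row per issue number."""
    rows = {}
    for page in pages:
        for issue in page:
            # A repeated issue overwrites its earlier copy, so the most
            # recently fetched version wins.
            rows[issue["number"]] = issue
    return [rows[number] for number in sorted(rows)]


# Hypothetical pages in which issue 11 shows up twice because it moved
# across the page boundary between the two requests.
page1 = [{"number": 10, "title": "A"}, {"number": 11, "title": "B"}]
page2 = [{"number": 11, "title": "B"}, {"number": 12, "title": "C"}]

merged = merge_pages([page1, page2])
print([row["number"] for row in merged])  # [10, 11, 12]
```

This removes duplicates but cannot recover rows that were skipped. Requesting pages in an order that does not change while fetching (for example by creation date rather than by last update) would probably help with the missing rows.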
So did we ever decide on how to prioritize reviews? (I was still mentally / notification catching up after 0.18.1)
There are some really important issues to tackle, often with proposed solutions, but no reviews! It's hard for everybody to keep the big picture in mind with such a full issue tracker. I think it might be helpful if Joel and me prioritize issues. Obviously that will only make sense if the other team members check up on it when deciding what to review / work on.
Do we want to try to seriously use the project feature? https://github.com/scikit-learn/scikit-learn/projects/5
On my monitor I can fit four columns and the "add cards" tab. I tried using five columns (separating in-progress and stalled PRs) but then I couldn't access the right-most column when the "add cards" tab was open. The whole interface is a bit awkward but maybe the best we have (for example, moving something from the bottom to the top is easiest by moving it to a different column, then scrolling up, then moving it back).
wdyt?
Andy
On 09/29/2016 11:05 PM, Joel Nothman wrote:
Hello,
This seems a good moment to say that we will be starting a project at BIDS next semester to try to extract information from github and classify PRs into different categories (stalled, updated, needs review). Stéfan drafted a list of elements he would like to see for scikit-image, and I have been wanting something similar for matplotlib. I've got my hands full right now, but we are more than open to discussing with the wider community whether such a tool would be useful and which features are of interest.
Here are some examples of elements we'd like to be able to identify and sort:
- Most active pull requests (“hot topics”).
- The ones that "I" have commented on.
- PRs that haven’t seen any discussion.
- Stalled PRs.
- New issues without any comments.
- Old PRs that could be merged.
- Recently merged PRs that refer to a ticket but haven’t closed that ticket.
- Duplicate PRs (closing the same ticket).
- Tickets that are being referred to many times.
- Unmergeable PRs (that need to be rebased).
- PRs that passed the majority of tests.
- Issues that external projects refer to.
Do you think something like this could be interesting for sklearn? Also, if you have scripts that do similar things and that you would be willing to share, we would be very happy to see what already exists out there.
Cheers, N
On 2 December 2016 at 16:52, Andy <t3kcit@gmail.com> wrote:
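To make one item of the list above concrete, "Duplicate PRs (closing the same ticket)" could be approximated by scanning PR descriptions for GitHub's closing keywords (close, fix, resolve and their variants) followed by an issue number. This is only a sketch under assumptions: the PR records are plain dicts with `number` and `body` fields, the regular expression is a simplification of what GitHub actually recognises, and `duplicate_closers` is a name invented here, not part of the planned BIDS tool.

```python
import re
from collections import defaultdict

# Simplified pattern for closing keywords followed by "#<issue number>".
CLOSES = re.compile(
    r"\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s+#(\d+)",
    re.IGNORECASE,
)


def duplicate_closers(prs):
    """Map each issue number to the PRs claiming to close it, if more than one."""
    by_issue = defaultdict(list)
    for pr in prs:
        body = pr.get("body") or ""
        # set() avoids counting a PR twice if it mentions the issue repeatedly.
        for issue in set(CLOSES.findall(body)):
            by_issue[int(issue)].append(pr["number"])
    return {
        issue: sorted(numbers)
        for issue, numbers in by_issue.items()
        if len(numbers) > 1
    }


# Hypothetical PR descriptions, not real scikit-learn PRs.
sample_prs = [
    {"number": 101, "body": "Fixes #42 by adding a check."},
    {"number": 102, "body": "Alternative approach, closes #42."},
    {"number": 103, "body": "Resolves #7"},
]
print(duplicate_closers(sample_prs))  # {42: [101, 102]}
```

PRs that mention a ticket without a closing keyword would be missed by this, so it is best read as a first filter rather than a complete check.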
Hey Nelle. That sounds great. My main question is how you'd expose this to the user. Will it be a separate website? A bot? Emails? Greasemonkey on top of github?
Most of these could be implemented with tags that are automatically assigned by a bot, I guess. That would be quite a few tags, though, and wouldn't work well for filtering the ones I was active in. Tickets that are being referred to many times also sound more like a sorting of issues, not a tag.
And some of these are more of a "notification type": "this project has referred to this issue" is maybe something that I want to be made aware of, say by a comment on the issue (which triggers an email) or a direct email to me. Similarly I might be notified if someone forgot to close the ticket for a PR (so I can go and check whether to close it). I might want to be notified if any of my PRs become "unmergeable". A comment by a bot would alert everybody though, and an email to me would alert only me.
The "PRs that haven't seen any discussion" is actually implemented in github by sorting by comments, and I recently used that.
Also happy to (try to find time to) contribute code or discuss the project with you guys!
To summarize, I think there is some low-hanging fruit for automatic tagging and for sending emails with notifications, and possibly for bots commenting. I expect that doing anything that involves sorting (a subset of) issues probably requires much more effort.
Andy
On 12/02/2016 08:04 PM, Nelle Varoquaux wrote:
Hello,
This seems a good moment to say that we will be starting a project at BIDS next semester to try extract information from github and classify PRs into different categories (stalled, updated, needs review). Stéfan drafted a list of elements he would like to see for scikit-image, and I have been wanting something similar for matplotlib. I've got my hands full right now, but we are more than open to discuss with the wider community to see if such a tool would be useful and what features is of interest.
Here are some examples of elements we'd like to be able to identify and sort:
- Most active pull requests “hot topics” - The one where "I" have commented on. - PRs that haven’t seen any discussion. - Stalled PRs. - New issues without any comments. - See the old PRs that could be merged - Recently merged PR referring to a ticket but haven’t closed that ticket. - Duplicate PR (closing the same ticket). - Tickets that being referred to many times. - Unmergeable PRs (that need to be rebased). - PRs that passed the majority of tests. - Issues that external projects refer too.
Do you think something like this could be interesting for sklearn? Also, if you have scripts that similar things and that you would be willing to share, we would be very happy to see what exists already out there.
Cheers, N
On 2 December 2016 at 16:52, Andy <t3kcit@gmail.com> wrote:
So did we ever decide on how to prioritize reviews? (I was still mentally / notification catching up after 0.18.1)
There are some really important issues to tackle, often with proposed solutions, not no reviews! It's hard for everybody to keep the big picture in mind with such a full issue tracker. I think it might be helpful if Joel and me prioritize issues. Obviously that will only make sense if the other team members check up on it when deciding what to review / work on.
Do we want to try to seriously use the project feature? https://github.com/scikit-learn/scikit-learn/projects/5
On my monitor I can fit four columns and the "add cards" tab. I tried using five columns (separating in-progress and stalled PRs) but then I could access the right-most column when the "add cards" was open. The whole interface is a bit awkward but maybe the best we have (for example moving something from the bottom to the top is easiest by moving it to a different column, then scrolling up, then moving it back)
wdyt? Andy
On 09/29/2016 11:05 PM, Joel Nothman wrote:
The spreadsheet seems to have some duplications and presumably some missing rows, with apologies. I assume some is due to the github pagination, and some may be my error. Not a big enough error to fix up.
On 30 September 2016 at 05:15, Raphael C <drraph@gmail.com> wrote:
My apologies I see it is in the spreadsheet. It would be great to see this work finished for 0.19 if at all possible IMHO.
Raphael
On 29 September 2016 at 20:12, Raphael C <drraph@gmail.com> wrote:
I hope this isn't out of place but I notice that https://github.com/scikit-learn/scikit-learn/pull/4899 is not in the list. It seems like a very worthwhile addition and the PR appears stalled at present.
Raphael
On 29 September 2016 at 15:05, Joel Nothman <joel.nothman@gmail.com> wrote:
I agree that being able to identify which PRs are stalled on the contributor's part, which on reviewers' part, and since when, would be great. I'm not sure we've come up with a way that'll work though.
In terms of backlog, I've wondered if just getting things into a spreadsheet would help:
https://docs.google.com/spreadsheets/d/1LdzNxQbn7A0Ao8zlUBgnvT42929JpAe9958Y...
What other features of an Issue / PR would be useful to sort/filter/pivottable on in a spreadsheet form like this?
(It would be extra nice if we could modify titles and labels within the spreadsheet and have them update via the GitHub API, but I'm not sure I'll get around to making that feature :P)
On 29 September 2016 at 23:45, Andreas Mueller <t3kcit@gmail.com> wrote:
So I made a project for 0.19:
https://github.com/scikit-learn/scikit-learn/projects/5
The idea would be to drag and drop issues and PRs so that the important ones are at the top. We could also add an "important" column, currently the scrolling is pretty annoying. Thoughts?
On 09/28/2016 03:29 PM, Nelle Varoquaux wrote: > On 28 September 2016 at 12:24, Andreas Mueller <t3kcit@gmail.com> > wrote: >> >> On 09/28/2016 02:21 PM, Nelle Varoquaux wrote: >>> >>> I think the only ones worth having are the ones that can be dealt >>> with >>> automatically and the ones that will not be used frequently: >>> >>> - stalled after 30 days of inactivity [can be done automatically] >>> - in dispute [I don't expect it to be used often]. >> I think "in dispute" is actually one of the most common statuses >> among >> PRs. >> Or maybe I have a skewed picture of things. >> Many PRs stalled because it is not clear whether the proposed >> solution >> is a >> good one. > On the stalled one, sure, but there are a lot of PRs being merged > fairly quickly. So over all, I think it is quite rare. No? > >> It would be great to have some way to get through the backlog of 400 >> PRs >> and >> I think tagging them might be useful. >> We rarely reject PRs, we could also revisit that policy. >> >> For the backlog, it's pretty unclear to me how many are waiting for >> reviews, >> how many are waiting for changes, >> and how many are disputed. >> Tagging these might help people who want to review to find things to >> review, >> and people who want to code to pick >> up stalled PRs. > That sounds like a great use of labels, thought all of these need to > be tagged manually. > >> _______________________________________________ >> scikit-learn mailing list >> scikit-learn@python.org >> https://mail.python.org/mailman/listinfo/scikit-learn > _______________________________________________ > scikit-learn mailing list > scikit-learn@python.org > https://mail.python.org/mailman/listinfo/scikit-learn
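As a rough illustration of the "stalled after 30 days of inactivity [can be done automatically]" idea quoted above, here is a minimal sketch. It assumes PR records are already available as plain dictionaries with a `number` and an `updated_at` timestamp; those field names, the sample PR numbers and the helper names are hypothetical and not taken from any existing scikit-learn tooling.

```python
from datetime import datetime, timedelta, timezone

# Threshold taken from the "30 days of inactivity" suggestion in the thread.
STALLED_AFTER = timedelta(days=30)


def is_stalled(last_activity, now=None, threshold=STALLED_AFTER):
    """Return True if the last activity is strictly older than the threshold."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_activity > threshold


def stalled_pull_requests(pull_requests, now=None):
    """Return the numbers of PR records whose 'updated_at' marks them as stalled.

    `pull_requests` is a list of hypothetical dicts with the keys
    'number' (int) and 'updated_at' (timezone-aware datetime).
    """
    return [pr["number"] for pr in pull_requests if is_stalled(pr["updated_at"], now)]


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    reference = datetime(2016, 9, 28, tzinfo=timezone.utc)
    sample = [
        {"number": 101, "updated_at": datetime(2016, 8, 1, tzinfo=timezone.utc)},
        {"number": 102, "updated_at": datetime(2016, 9, 20, tzinfo=timezone.utc)},
    ]
    print(stalled_pull_requests(sample, now=reference))  # prints [101]
```

A script along these lines could feed a "stalled" label, whereas "in dispute" would still have to be assigned by hand, as noted above.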
Another fun shortcoming of the project interface: if a card is already present in your project, you cannot search for it (though you can use Ctrl+F).
On Fri, Dec 02, 2016 at 07:52:09PM -0500, Andy wrote:
So did we ever decide on how to prioritize reviews?
I don't know how to do this.
I think it might be helpful if Joel and me prioritize issues.
I think that it would be useful. Although of course different people will have different priorities (depending for instance on the type of data that we process). I guess that we can agree on a large part of the prioritization, and hence it will be useful.
Obviously that will only make sense if the other team members check up on it when deciding what to review / work on.
So, the big question is: how do we do this? Isn't there one of the many project-management extensions of github that enables this?
We could start with assigning priority labels like they use in numpy... That + milestones could help us prioritize?
On Sat, Dec 3, 2016 at 11:52 AM, Gael Varoquaux <gael.varoquaux@normalesup.org> wrote:
On Fri, Dec 02, 2016 at 07:52:09PM -0500, Andy wrote:
So did we ever decide on how to prioritize reviews?
I don't know how to do this.
I think it might be helpful if Joel and me prioritize issues.
I think that it would be useful. Although of course different people will have different priorities (depending for instance on the type of data that we process). I guess that we can agree on a large part of the prioritization, and hence it will be useful.
Obviously that will only make sense if the other team members check up on it when deciding what to review / work on.
So, the big question is: how do we do this? Isn't there one of the many project-management extensions of github that enables this?
-- Raghav RV https://github.com/raghavrv
On 12/03/2016 12:26 PM, Raghav R V wrote:
We could start with assigning priority labels like they use in numpy... That + milestones could help us prioritize?
I feel milestones are too coarse. Or I'm using them wrong. And priority labels only work if people don't use the "high priority" all the time. There is a lot of stuff labeled "bug", which I would interpret as "highest priority" that people don't look at at all.
On 3 December 2016 at 10:08, Andy <t3kcit@gmail.com> wrote:
I feel milestones are too coarse. Or I'm using them wrong. And priority labels only work if people don't use the "high priority" all the time. There is a lot of stuff labeled "bug", which I would interpret as "highest priority" that people don't look at at all.
Even milestones only work if people don't use the next milestone all the time. I think the only useful milestone is the one for release-critical bugs, for the next release. For example, on matplotlib, I am currently only reviewing and working on tickets for the 2.0 milestone, as we're hoping to get a new release candidate out this weekend.
On 12/03/2016 01:20 PM, Nelle Varoquaux wrote:
Even milestones only work if people don't use the next milestone all the time. I think the only useful milestone is the one for release-critical bugs, for the next release. For example, on matplotlib, I am currently only reviewing and working on tickets for the 2.0 milestone, as we're hoping to get a new release candidate out this weekend.
That's what I meant by "probably doing it wrong". I assign it too often. But actually I think people mostly ignore it anyhow ;)
What do you think about splitting MRG and MRG+1 into two different columns? The scrolling would get a little less annoying, and you would have an easier view of the MRG+1 ones to kick them out.
I've put a column for that status in.
Note: this has largely been generated with https://gist.github.com/jnothman/8eba0834acfd633c6d83b437f6f18c49
On 30 September 2016 at 00:16, Guillaume Lemaître <g.lemaitre58@gmail.com> wrote:
What do you think about splitting MRG and MRG+1 into two different columns? The scrolling would get a little less annoying, and you would have an easier view of the MRG+1 ones to kick them out.
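For readers who want a feel for how PRs might be sorted into separate MRG and MRG+1 columns from their title prefixes, here is a minimal sketch. It is not the code from the gist linked above; the function names, the `untagged` fallback column and the sample titles are assumptions made purely for illustration.

```python
import re
from collections import defaultdict

# Matches a leading status tag such as "[WIP]", "[MRG]", "[MRG+1]" or "[MRG + 2]".
STATUS_PREFIX = re.compile(r"^\s*\[\s*(WIP|MRG)\s*(?:\+\s*(\d+))?\s*\]", re.IGNORECASE)


def column_for_title(title):
    """Map a PR title to a board column name based on its status prefix."""
    match = STATUS_PREFIX.match(title)
    if match is None:
        return "untagged"  # hypothetical fallback column
    status = match.group(1).upper()
    approvals = match.group(2)
    if status == "WIP" or approvals is None:
        return status
    return f"{status}+{approvals}"


def group_by_column(titles):
    """Group PR titles into columns, keeping the input order within each column."""
    board = defaultdict(list)
    for title in titles:
        board[column_for_title(title)].append(title)
    return dict(board)


if __name__ == "__main__":
    # Made-up titles, for illustration only.
    sample_titles = [
        "[MRG+1] Add an example estimator",
        "[MRG] Fix a docstring",
        "[WIP] Experimental feature",
    ]
    for column, items in group_by_column(sample_titles).items():
        print(column, items)
```

Keeping MRG and MRG+1 as distinct keys is what allows them to be shown as two columns, so the PRs that only need one more approval are easy to spot.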
I have a question which may or may not be relevant here: why don't we assign issues to whoever has asked to take them? This would give us a better picture of the current state of each issue. We could ping that person directly and ask about their progress after a long period of inactivity.
Regards, Siddharth Gupta
On Thu, Sep 29, 2016 at 8:00 PM, Joel Nothman <joel.nothman@gmail.com> wrote:
I've put a column for that status in.
Note: this has largely been generated with https://gist.github.com/jnothman/8eba0834acfd633c6d83b437f6f18c49
I think it's a matter of two things -- one, you can't be assigned if you aren't a member of the organization on github. Two -- linking pull requests to issues is generally visible enough (hence why it's in the PR template). We don't have issues with figuring out who is working on an issue, but rather keeping track of all of them; I don't think that would solve that problem.
Nelson
On Thursday, September 29, 2016, Siddharth Gupta <siddharthgupta234@gmail.com> wrote:
I have a question which may or may not be relevant here: why don't we assign issues to whoever has asked to take them? This would give us a better picture of the current state of each issue. We could ping that person directly and ask about their progress after a long period of inactivity.
On 29 September 2016 at 10:41, Nelson Liu <nfliu@uw.edu> wrote:
I think it's a matter of two things -- one, you can't be assigned if you aren't a member of the organization on github. Two -- linking pull requests to issues is generally visible enough (hence why it's in the PR template). We don't have issues with figuring out who is working on an issue, but rather keeping track of all of them; I don't think that would solve that problem.
On the other hand, we could do this for PRs, in particular stalled PRs. That would be a core member "responsible" for making sure this PR moves forward, both in terms of reviewing and updating the code.
participants (13)
- Andreas Mueller
- Andy
- Dale T Smith
- Gael Varoquaux
- Guillaume Lemaître
- Joel Nothman
- Nelle Varoquaux
- Nelson Liu
- Raghav R V
- Raphael C
- Sebastian Raschka
- Siddharth Gupta
- Tim Head