My thinking about the development process
This is a bit long, as I wrote it as if it were a blog post to try and give background info on my thinking, etc. The TL;DR folks should start at the "Ideal Scenario" section and read to the end. P.S.: This is in Markdown and I have put it up at https://gist.github.com/brettcannon/a9c9a5989dc383ed73b4 if you want a nicer formatted version for reading.

# History lesson

Since I signed up for the python-dev mailing list way back in June 2002, there has been a cycle where we as a group come to the realization that our current software development process has not kept up with modern practices and could stand an update. For me this first showed itself when we moved from SourceForge to our own infrastructure, then again when we moved from Subversion to Mercurial (I led both of those initiatives, so it's somewhat a tradition/curse that I find myself in this position yet again). And so we again find ourselves at the point of realizing that we are not keeping up with current practices and thus need to evaluate how we can improve our situation.

# Where we are now

It should be recognized that we have two sets of users of our development process: contributors and core developers (the latter of whom can play both roles). A rough outline of our current, recommended process goes something like this:

1. Contributor clones a repository from hg.python.org
2. Contributor makes desired changes
3. Contributor generates a patch
4. Contributor creates an account on bugs.python.org and signs the [contributor agreement](https://www.python.org/psf/contrib/contrib-form/)
5. Contributor creates an issue on bugs.python.org (if one does not already exist) and uploads the patch
6. Core developer evaluates the patch, possibly leaving comments through our [custom version of Rietveld](http://bugs.python.org/review/)
7. Contributor revises the patch based on feedback and uploads a new patch
8. Core developer downloads the patch and applies it to a clean clone
9. Core developer runs the tests
10. Core developer does one last `hg pull -u` and then commits the changes to the various branches

I think we can all agree it works to some extent, but it isn't exactly smooth. There are multiple steps in there -- in full or in part -- that can be automated. There is room to improve everyone's lives.

And we can't forget the people who help keep all of this running as well. There are those who manage the SSH keys, the issue tracker, the review tool, hg.python.org, and the email system that lets us know when stuff happens on any of these other systems. The impact on them needs to be considered as well.

## Contributors

I see two scenarios for contributors to optimize for. There are the simple spelling-mistake patches, and then there are the code-change patches. The former is the kind of thing that you can do in a browser without much effort and should be a no-brainer commit/reject decision for a core developer. This is what the GitHub/Bitbucket camps have been promoting their solution for while leaving the cpython repo alone. Unfortunately the bulk of our documentation is in the Doc/ directory of cpython. While it's nice to think about moving the devguide, the peps, and even breaking out the tutorial to repos hosted on Bitbucket/GitHub, everything else is in Doc/ (language reference, howtos, stdlib, C API, etc.). So unless we want to completely break all of Doc/ out of the cpython repo and have core developers willing to edit two separate repos when making changes that impact code **and** docs, moving only a subset of the docs feels like a band-aid solution that ignores the elephant in the room: the cpython repo, where the bulk of patches are targeted.

For the code-change patches, contributors need an easy way to get hold of the code and get their changes to the core developers. After that it's things like letting contributors know that their patch doesn't apply cleanly, doesn't pass tests, etc.
As of right now, getting the patch into the issue tracker is a bit manual but nothing crazy. The real issue in this scenario is core developer response time.

## Core developers

There is a finite amount of time that core developers get to contribute to Python, and it fluctuates greatly. This means that if a process can be found which allows core developers to spend less time doing mechanical work and more time doing things that can't be automated -- namely code reviews -- then the throughput of patches being accepted/rejected will increase. This also affects any increased patch submission rate that comes from improving the situation for contributors, because if the throughput doesn't change then there will simply be more patches sitting in the issue tracker, and that doesn't benefit anyone.

# My ideal scenario

If I had an infinite amount of resources (money, volunteers, time, etc.), this would be my ideal scenario:

1. Contributor gets the code from wherever; easiest is to just say "fork on GitHub or Bitbucket", as those would be official mirrors of hg.python.org updated after every commit, but they could clone hg.python.org/cpython if they wanted
2. Contributor makes edits; if they forked on Bitbucket or GitHub then they have browser edit access already
3. Contributor creates an account at bugs.python.org, signs the CLA, and creates an issue at bugs.python.org (probably the one piece of infrastructure we all agree is better than the other options, although its workflow could use an update)
4. If the contributor used Bitbucket or GitHub, they send a pull request with the issue # in the PR message
5. bugs.python.org notices the PR, grabs a patch for it, and puts it on bugs.python.org for code review
6. CI runs on the patch based on what Python versions are specified in the issue tracker, letting everyone know whether it applied cleanly, whether it passed tests on the OSs that would be affected, and also producing a test coverage report
7. Core developer does a code review
8. Contributor updates their code based on the code review, the updated patch gets pulled by bugs.python.org automatically, and CI runs again
9. Once the patch is acceptable, and assuming it applies cleanly to all the versions to commit to, the core developer clicks a "Commit" button, fills in a commit message and NEWS entry, and everything gets committed (if the patch can't apply cleanly then the core developer does it the old-fashioned way, or maybe we auto-generate a new PR which can be manually touched up so it does apply cleanly?)

Basically the ideal scenario lets contributors use whatever tools and platforms they want and provides as much automated support as possible to make sure their code is tip-top before and during code review, while core developers can review and commit patches so easily that they can do their job from a beach with a tablet and some WiFi.

## Where the current proposed solutions seem to fall short

### GitHub/Bitbucket

Basically GitHub/Bitbucket is a win for contributors but doesn't buy core developers that much. GitHub/Bitbucket gives contributors the easy cloning, drive-by patches, CI, and PRs. Core developers get a code review tool -- I'm counting Rietveld as deprecated after Guido's comments about the code's maintenance issues -- and push-button commits **only for single-branch changes**. But for any patch that crosses branches we don't really gain anything. At best core developers tell a contributor "please send your PR against 3.4", push-button merge it, update a local clone, merge from 3.4 to default, do the usual stuff, commit, and then push; that still keeps me off the beach, though, so it doesn't get us the whole way. You could force people to submit two PRs, but I don't see that flying. Maybe some tool could be written that automatically handles the merge/commit across branches once the initial PR is in? Or automatically creates a PR that core developers can touch up as necessary and then accept as well?
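A hypothetical tool for the merge/commit-across-branches problem would at minimum need the ordering logic our workflow implies: commit to the oldest affected branch, then merge forward into each newer one. A minimal sketch of just that planning step (the function name, action tuples, and branch lists are illustrative assumptions, not an existing tool):

```python
# Sketch of the branch-ordering logic a cross-branch automation tool
# would need. Convention assumed here (mirroring our hg workflow):
# the patch lands on the oldest affected branch and is merged forward.

def merge_plan(all_branches, target_branches):
    """Return the ordered actions needed to land a patch on every
    branch in target_branches.

    all_branches must be ordered oldest to newest, e.g.
    ["2.7", "3.4", "default"]; merges only flow forward.
    """
    targets = [b for b in all_branches if b in target_branches]
    if not targets:
        return []
    # Initial commit goes to the oldest affected branch ...
    plan = [("commit", targets[0])]
    # ... and is then merged forward, branch by branch.
    previous = targets[0]
    for branch in targets[1:]:
        plan.append(("merge", previous, branch))
        previous = branch
    return plan

print(merge_plan(["2.7", "3.4", "default"], {"3.4", "default"}))
# [('commit', '3.4'), ('merge', '3.4', 'default')]
```

The planning is the easy part, of course; the hard part is doing the merges automatically and knowing when to bail out to a human because of conflicts.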
Regardless, some solution is necessary to handle branch-crossing PRs.

As for GitHub vs. Bitbucket, I personally don't care. I like GitHub's interface more, but that's personal taste. I like hg more than git, but that's also personal taste (and I consider a transition from hg to git a hassle, but not a deal-breaker and also not a win). It is unfortunate, though, that under this scenario we would have to choose only one platform. It's also unfortunate that both are closed-source, but that's not a deal-breaker, just a knock against them if the decision is close.

### Our own infrastructure

The shortcoming here is the need for developers, developers, developers! Everything outlined in the ideal scenario is totally doable on our own infrastructure with enough code and time (donated/paid-for infrastructure shouldn't be an issue). But historically that code and time have not materialized. Our code review tool is a fork that probably should be replaced, as only Martin von Löwis can maintain it. Basically Ezio Melotti alone maintains the issue tracker's code. We don't exactly have a ton of people constantly going "I'm so bored because everything for Python's development infrastructure gets sorted so quickly!" A perfect example is that R. David Murray came up with a nice update for our workflow after PyCon but ran out of time after mostly defining it, and nothing ever became of it (maybe we can rectify that at PyCon?). Eric Snow has pointed out that he has written similar code for pulling PRs from, I think, GitHub into another code review tool, but that doesn't magically make it work in our infrastructure or get someone to write it and help maintain it (no offense, Eric).

In other words, our infrastructure can do anything, but it can't run on hopes and dreams. Commitments from many people to make this happen by a certain deadline will be needed so as not to let it drag on forever. People would also have to commit to continued maintenance to make this viable long-term.
# Next steps

I'm thinking: first-draft PEPs by February 1 so we know who's all-in (8 weeks away); all details worked out in final PEPs, plus whatever is required to prove to me it will work, by the PyCon language summit (4 months away); I make a decision by May 1; and then implementation aims to be done by the time 3.5.0 is cut so we can switch over shortly thereafter (9 months away). Sound like a reasonable timeline?
On 6 December 2014 at 06:24, Donald Stufft <donald@stufft.io> wrote:
On Dec 5, 2014, at 3:04 PM, Brett Cannon <bcannon@gmail.com> wrote: <words>
This looks like a pretty good write up, seems to pretty fairly evaluate the various sides and the various concerns.
Agreed - thanks for taking this on, Brett! For my part, I realised that if I want my Kallithea-based proposal to work out, I actually need to *be* an upstream Kallithea contributor, so I posted to the Kallithea list laying out the kinds of features I'd be pushing for and why: http://lists.sfconservancy.org/pipermail/kallithea-general/2014q4/000060.htm...

I only posted that a few minutes ago, so we'll see what the existing Kallithea contributors think of the idea :)

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Fri Dec 05 2014 at 3:24:38 PM Donald Stufft <donald@stufft.io> wrote:
On Dec 5, 2014, at 3:04 PM, Brett Cannon <bcannon@gmail.com> wrote: <words>
This looks like a pretty good write up, seems to pretty fairly evaluate the various sides and the various concerns.
Thanks! It seems like I have gotten the point across that I don't care what the solution is as long as it's a good one and that we have to look at the whole process and not just a corner of it if we want big gains.
On Dec 6, 2014, at 8:45 AM, Brett Cannon <bcannon@gmail.com> wrote:
On Fri Dec 05 2014 at 3:24:38 PM Donald Stufft <donald@stufft.io> wrote:
On Dec 5, 2014, at 3:04 PM, Brett Cannon <bcannon@gmail.com> wrote: <words>
This looks like a pretty good write up, seems to pretty fairly evaluate the various sides and the various concerns.
Thanks! It seems like I have gotten the point across that I don't care what the solution is as long as it's a good one and that we have to look at the whole process and not just a corner of it if we want big gains.
One potential solution is Phabricator (http://phabricator.org), which is a Gerrit-like tool except it also works with Mercurial. It is a fully open source platform, though it works on a "patch" basis rather than a pull request basis. They are also coming out with hosting for it (http://phacility.com/), but that is "coming soon" and I'm not sure what the cost will be or whether they'd be willing to donate to an OSS project. It makes it easier to upload a patch using a command-line tool (like Gerrit does) called arc. Phabricator itself is OSS, and the coming-soon page for Phacility says that it's easy to migrate from a hosted to a self-hosted solution.

Phabricator supports hosting the repository itself, but as I understand it, it also supports hosting the repository elsewhere. So it could mean that we host the repository on a platform that supports pull requests (as you might expect, I'm a fan of GitHub here) and also deploy Phabricator on top of it. I haven't actually tried that, so I'd want to play around with it to make sure it works how I believe it does, but it may be a good way to enable both pull requests (and the web editors that tend to come with those workflows) for easier changes and a different tool for more invasive changes.

Terry spoke about CLAs, which is an interesting thing too, because Phabricator itself has some workflow around this, I believe; at least one of the examples in their tour is setting up some sort of notification about requiring a CLA. It even has a built-in thing for signing legal documents (although I'm not sure if that's acceptable to the PSF; we'd need to ask VanL, I suspect). Another neat feature, although I'm not sure we're actually set up to take advantage of it, is that if you run test coverage numbers you can report them directly inline with the review/diff to see which lines of the patch are being exercised by a test.

I'm not sure if it's actually workable for us, but it probably should be explored a little bit to see if it is and if it might be a good solution. They also have a copy of it running which they develop Phabricator itself on (https://secure.phabricator.com/), though they also accept pull requests on GitHub.

---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
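The inline coverage reporting described above boils down to intersecting the lines a patch touches with the lines the test run actually executed. A hypothetical sketch of that core step (function name and data shapes are made up, not any tool's API):

```python
# Sketch of the "coverage inline with the diff" idea: given the
# new-file line numbers a patch touches and the line numbers the
# test run executed, flag what the patch leaves unexercised.

def uncovered_changes(changed_lines, executed_lines):
    """Return the changed lines that no test executed, sorted."""
    return sorted(set(changed_lines) - set(executed_lines))

# e.g. the patch touched lines 10-14, the tests ran lines 10, 11, 13
print(uncovered_changes([10, 11, 12, 13, 14], {10, 11, 13}))  # [12, 14]
```

A review tool then only has to annotate those line numbers in the diff view; the hard part in practice is mapping coverage data (collected against the patched file) back onto the diff hunks.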
On Sat, Dec 6, 2014 at 8:01 AM, Donald Stufft <donald@stufft.io> wrote:
One potential solution is Phabricator (http://phabricator.org), which is a Gerrit-like tool except it also works with Mercurial. It is a fully open source platform, though it works on a "patch" basis rather than a pull request basis.
I've been pleasantly unsurprised with the ReviewBoard CLI tools (RBtools):

* https://www.reviewboard.org/docs/rbtools/dev/
* https://www.reviewboard.org/docs/codebase/dev/contributing-patches/
* https://www.reviewboard.org/docs/manual/2.0/users/

ReviewBoard supports Markdown, {Git, Mercurial, Subversion, ...}, and full-text search.

* https://wiki.jenkins-ci.org/display/JENKINS/Reviewboard+Plugin
* [ https://wiki.jenkins-ci.org/display/JENKINS/Selenium+Plugin ]
* https://github.com/saltstack/salt-testing/blob/develop/salttesting/jenkins.p...
  * GetPullRequestAction
* https://wiki.jenkins-ci.org/display/JENKINS/saltstack-plugin (spin up an instance)
  * https://github.com/saltstack-formulas/jenkins-formula
  * https://github.com/saltstack/salt-jenkins
Terry spoke about CLAs, which is an interesting thing too, because Phabricator itself has some workflow around this, I believe; at least one of the examples in their tour is setting up some sort of notification about requiring a CLA. It even has a built-in thing for signing legal documents (although I'm not sure if that's acceptable to the PSF; we'd need to ask VanL, I suspect). Another neat feature, although I'm not sure we're actually set up to take advantage of it, is that if you run test coverage numbers you can report them directly inline with the review/diff to see which lines of the patch are being exercised by a test.
AFAIU, these are not (yet) features of ReviewBoard (which is written in Python).
I'm not sure if it's actually workable for us, but it probably should be explored a little bit to see if it is and if it might be a good solution. They also have a copy of it running which they develop Phabricator itself on (https://secure.phabricator.com/), though they also accept pull requests on GitHub.
What a good looking service.
On Sat, Dec 6, 2014 at 7:23 PM, Wes Turner <wes.turner@gmail.com> wrote:
On Sat, Dec 6, 2014 at 8:01 AM, Donald Stufft <donald@stufft.io> wrote:
One potential solution is Phabricator (http://phabricator.org) which is a gerrit like tool except it also works with Mercurial. It is a fully open source platform though it works on a “patch” bases rather than a pull request basis.
I've been pleasantly unsurprised with the ReviewBoard CLI tools (RBtools):
* https://www.reviewboard.org/docs/rbtools/dev/ * https://www.reviewboard.org/docs/codebase/dev/contributing-patches/ * https://www.reviewboard.org/docs/manual/2.0/users/
ReviewBoard supports Markdown, {Git, Mercurial, Subversion, ... }, full-text search
https://www.reviewboard.org/docs/manual/dev/extending/

* "Writing Review Board Extensions <https://www.reviewboard.org/docs/manual/dev/extending/extensions/>"
* "Writing Authentication Backends <https://www.reviewboard.org/docs/manual/dev/extending/auth-backends/>"
Terry spoke about CLAs, which is an interesting thing too, because phabricator itself has some workflow around this I believe, at least one of the examples in their tour is setting up some sort of notification about requiring a CLA. It even has a built in thing for signing legal documents (although I’m not sure if that’s acceptable to the PSF, we’d need to ask VanL I suspect). Another neat feature, although I’m not sure we’re actually setup to take advantage of it, is that if you run test coverage numbers you can report that directly inline with the review / diff to see what lines of the patch are being exercised by a test or not.
AFAIU, these are not (yet) features of ReviewBoard (which is written in Python).
This lists the ReviewBoard workflow steps for a pre-commit workflow: https://www.reviewboard.org/docs/manual/dev/users/getting-started/workflow/
Very nice, Brett.

On Fri, Dec 5, 2014 at 1:04 PM, Brett Cannon <bcannon@gmail.com> wrote:
And we can't forget the people who help keep all of this running as well. There are those that manage the SSH keys, the issue tracker, the review tool, hg.python.org, and the email system that lets us know when stuff happens on any of these other systems. The impact on them needs to also be considered.
It sounds like Guido would rather have as much of this as possible done by a provider than rely on volunteers. That makes sense, though there are concerns about control of certain assets. However, that applies only to some of them, like hg.python.org.
## Contributors

I see two scenarios for contributors to optimize for. There's the simple spelling mistake patches and then there's the code change patches. The former is the kind of thing that you can do in a browser without much effort and should be a no-brainer commit/reject decision for a core developer. This is what the GitHub/Bitbucket camps have been promoting their solution for while leaving the cpython repo alone. Unfortunately the bulk of our documentation is in the Doc/ directory of cpython. While it's nice to think about moving the devguide, peps, and even breaking out the tutorial to repos hosted on Bitbucket/GitHub, everything else is in Doc/ (language reference, howtos, stdlib, C API, etc.). So unless we want to completely break all of Doc/ out of the cpython repo and have core developers willing to edit two separate repos when making changes that impact code **and** docs, moving only a subset of docs feels like a band-aid solution that ignores the elephant in the room: the cpython repo, where the bulk of patches are targeted.
With your ideal scenario this would be a moot point, right? There would be no need to split out doc-related repos.
For the code change patches, contributors need an easy way to get hold of the code and get their changes to the core developers. After that it's things like letting contributors know that their patch doesn't apply cleanly, doesn't pass tests, etc.
This is probably more work than it seems at first.
As of right now getting the patch into the issue tracker is a bit manual but nothing crazy. The real issue in this scenario is core developer response time.
## Core developers There is a finite amount of time that core developers get to contribute to Python and it fluctuates greatly. This means that if a process can be found which allows core developers to spend less time doing mechanical work and more time doing things that can't be automated -- namely code reviews -- then the throughput of patches being accepted/rejected will increase. This also impacts any increased patch submission rate that comes from improving the situation for contributors because if the throughput doesn't change then there will simply be more patches sitting in the issue tracker and that doesn't benefit anyone.
This is the key concern I have with only addressing the contributor side of things. I'm all for increasing contributions, but not if they are just going to rot on the tracker and we end up with disillusioned contributors.
# My ideal scenario If I had an infinite amount of resources (money, volunteers, time, etc.), this would be my ideal scenario:
1. Contributor gets code from wherever; easiest to just say "fork on GitHub or Bitbucket" as they would be official mirrors of hg.python.org and are updated after every commit, but could clone hg.python.org/cpython if they wanted
2. Contributor makes edits; if they cloned on Bitbucket or GitHub then they have browser edit access already
3. Contributor creates an account at bugs.python.org and signs the CLA
There's no real way around this, is there? I suppose account creation *could* be automated relative to a github or bitbucket user, though it probably isn't worth the effort. However, the CLA part is pretty unavoidable.
3. The contributor creates an issue at bugs.python.org (probably the one piece of infrastructure we all agree is better than the other options, although its workflow could use an update)
I wonder if issue creation from a PR (where no issue # is in the message) could be automated too without a lot of extra work.
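Detecting whether a PR message already carries an issue # (and falling back to auto-creating one when it doesn't) is cheap to prototype. A sketch under assumptions: the `#NNNNN` / `issue NNNNN` patterns and the `find_issue` helper are made up, not the tracker's actual conventions.

```python
import re

# Hypothetical: pull a bugs.python.org issue number out of a
# pull-request title/description. The patterns are assumptions,
# not the tracker's official syntax.
ISSUE_RE = re.compile(r"(?:#|\bissue\s*)(\d{4,6})\b", re.IGNORECASE)

def find_issue(pr_message):
    """Return the first referenced issue number, or None if the
    bridge should fall back to creating a new issue."""
    match = ISSUE_RE.search(pr_message)
    return int(match.group(1)) if match else None

print(find_issue("Fix docs typo (#22982)"))       # 22982
print(find_issue("See issue 12345 for details"))  # 12345
print(find_issue("Drive-by typo fix"))            # None
```

The `None` branch is exactly Eric's case: the bridge could open a new issue automatically, pre-filled from the PR title and body, rather than bouncing the PR back to the contributor.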
4. If the contributor used Bitbucket or GitHub, they send a pull request with the issue # in the PR message
5. bugs.python.org notices the PR, grabs a patch for it, and puts it on bugs.python.org for code review
6. CI runs on the patch based on what Python versions are specified in the issue tracker, letting everyone know if it applied cleanly, passed tests on the OSs that would be affected, and also got a test coverage report
7. Core developer does a code review
8. Contributor updates their code based on the code review and the updated patch gets pulled by bugs.python.org automatically and CI runs again
9. Once the patch is acceptable and assuming the patch applies cleanly to all versions to commit to, the core developer clicks a "Commit" button, fills in a commit message and NEWS entry, and everything gets committed (if the patch can't apply cleanly then the core developer does it the old-fashioned way, or maybe auto-generate a new PR which can be manually touched up so it does apply cleanly?)
6-9 sounds a lot like PEP 462. :) This seems like the part that would win us the most.
Basically the ideal scenario lets contributors use whatever tools and platforms that they want and provides as much automated support as possible to make sure their code is tip-top before and during code review while core developers can review and commit patches so easily that they can do their job from a beach with a tablet and some WiFi.
Sign me up!
## Where the current proposed solutions seem to fall short

### GitHub/Bitbucket

Basically GitHub/Bitbucket is a win for contributors but doesn't buy core developers that much. GitHub/Bitbucket gives contributors the easy cloning, drive-by patches, CI, and PRs. Core developers get a code review tool -- I'm counting Rietveld as deprecated after Guido's comments about the code's maintenance issues -- and push-button commits **only for single-branch changes**. But for any patch that crosses branches we don't really gain anything. At best core developers tell a contributor "please send your PR against 3.4", push-button merge it, update a local clone, merge from 3.4 to default, do the usual stuff, commit, and then push; that still keeps me off the beach, though, so that doesn't get us the whole way.
This will probably be one of the trickiest parts.
You could force people to submit two PRs, but I don't see that flying. Maybe some tool could be written that automatically handles the merge/commit across branches once the initial PR is in? Or automatically create a PR that core developers can touch up as necessary and then accept that as well? Regardless, some solution is necessary to handle branch-crossing PRs.
As for GitHub vs. Bitbucket, I personally don't care. I like GitHub's interface more, but that's personal taste. I like hg more than git, but that's also personal taste (and I consider a transition from hg to git a hassle, but not a deal-breaker and also not a win). It is unfortunate, though, that under this scenario we would have to choose only one platform.
It's also unfortunate both are closed-source, but that's not a deal-breaker, just a knock against if the decision is close.
### Our own infrastructure

The shortcoming here is the need for developers, developers, developers! Everything outlined in the ideal scenario is totally doable on our own infrastructure with enough code and time (donated/paid-for infrastructure shouldn't be an issue). But historically that code and time has not materialized. Our code review tool is a fork that probably should be replaced as only Martin von Löwis can maintain it. Basically Ezio Melotti maintains the issue tracker's code.
Doing something about those two tools is something to consider. Would it be out of scope for this discussion or any resulting PEPS? I have opinions here, but I'd rather not sidetrack the discussion.
We don't exactly have a ton of people constantly going "I'm so bored because everything for Python's development infrastructure gets sorted so quickly!" A perfect example is that R. David Murray came up with a nice update for our workflow after PyCon but then ran out of time after mostly defining it and nothing ever became of it (maybe we can rectify that at PyCon?). Eric Snow has pointed out that he has written similar code for pulling PRs from, I think, GitHub into another code review tool, but that doesn't magically make it work in our infrastructure or get someone to write it and help maintain it (no offense, Eric).
None taken. I was thinking the same thing when I wrote that. :)
IOW our infrastructure can do anything, but it can't run on hopes and dreams. Commitments from many people to making this happen by a certain deadline will be needed so as to not allow it to drag on forever. People would also have to commit to continued maintenance to make this viable long-term.
# Next steps

I'm thinking first draft PEPs by February 1 to know who's all-in (8 weeks away), all details worked out in final PEPs and whatever is required to prove to me it will work by the PyCon language summit (4 months away). I make a decision by May 1, and then implementation aims to be done by the time 3.5.0 is cut so we can switch over shortly thereafter (9 months away). Sound like a reasonable timeline?
Sounds reasonable to me, but I don't have plans to champion a PEP. :) I could probably help with the tooling between GitHub/Bitbucket though. -eric
Hi all,

I just want to put my two cents into this. This would definitely be a great step to take. I have been discussing PEP 462 with Nick, and the automation was definitely something that would be great to have -- I was submitting a simple documentation patch for building CPython on Windows, and it took several weeks for the patch to be accepted, then a couple of months for the patch to actually be merged in. As mentioned, automated testing to ensure that tests pass, along with easier committing of documentation patches, would obviously be a great way to start decreasing this turnaround.

Has there been any thought on what sort of infrastructure we could use for this? Obviously GitHub/Bitbucket could be used for repo management, as mentioned by others, but a lot of thought would have to go into the decisions regarding CI tools. I think it would also be a good time to address the issues with the current bug tracker -- although it works, it is hardly as usable as some of the other ones. As for the argument that we should use open source tools to ensure that the owners of these tools aren't able to cause us problems in the future -- both Hadoop and Cassandra, along with a lot of other Apache projects, seem to be using JIRA just fine.

Thanks
Shorya Raj

On Sat, Dec 6, 2014 at 11:17 AM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
Very nice, Brett.
On Fri, Dec 5, 2014 at 1:04 PM, Brett Cannon <bcannon@gmail.com> wrote:
And we can't forget the people who help keep all of this running as well. There are those that manage the SSH keys, the issue tracker, the review tool, hg.python.org, and the email system that let's use know when stuff happens on any of these other systems. The impact on them needs to also be considered.
It sounds like Guido would rather as much of this was done by a provider rather than relying on volunteers. That makes sense though there are concerns about control of certain assents. However, that applies only to some, like hg.python.org.
## Contributors

I see two scenarios for contributors to optimize for. There's the simple spelling mistake patches and then there's the code change patches. The former is the kind of thing that you can do in a browser without much effort and should be a no-brainer commit/reject decision for a core developer. This is what the GitHub/Bitbucket camps have been promoting their solution for solving while leaving the cpython repo alone. Unfortunately the bulk of our documentation is in the Doc/ directory of cpython. While it's nice to think about moving the devguide, peps, and even breaking out the tutorial to repos hosted on Bitbucket/GitHub, everything else is in Doc/ (language reference, howtos, stdlib, C API, etc.). So unless we want to completely break all of Doc/ out of the cpython repo and have core developers willing to edit two separate repos when making changes that impact code **and** docs, moving only a subset of docs feels like a band-aid solution that ignores the elephant in the room: the cpython repo, where the bulk of patches are targeting.
With your ideal scenario this would be a moot point, right? There would be no need to split out doc-related repos.
For the code change patches, contributors need an easy way to get a hold of the code and get their changes to the core developers. After that it's things like letting contributors know that their patch doesn't apply cleanly, doesn't pass tests, etc.
This is probably more work than it seems at first.
As of right now getting the patch into the issue tracker is a bit manual but nothing crazy. The real issue in this scenario is core developer response time.
## Core developers

There is a finite amount of time that core developers get to contribute to Python and it fluctuates greatly. This means that if a process can be found which allows core developers to spend less time doing mechanical work and more time doing things that can't be automated -- namely code reviews -- then the throughput of patches being accepted/rejected will increase. This also impacts any increased patch submission rate that comes from improving the situation for contributors because if the throughput doesn't change then there will simply be more patches sitting in the issue tracker and that doesn't benefit anyone.
This is the key concern I have with only addressing the contributor side of things. I'm all for increasing contributions, but not if they are just going to rot on the tracker and we end up with disillusioned contributors.
# My ideal scenario

If I had an infinite amount of resources (money, volunteers, time, etc.), this would be my ideal scenario:
1. Contributor gets code from wherever; easiest to just say "fork on GitHub or Bitbucket" as they would be official mirrors of hg.python.org and are updated after every commit, but could clone hg.python.org/cpython if they wanted
2. Contributor makes edits; if they cloned on Bitbucket or GitHub then they have browser edit access already
3. Contributor creates an account at bugs.python.org and signs the CLA
There's no real way around this, is there? I suppose account creation *could* be automated relative to a github or bitbucket user, though it probably isn't worth the effort. However, the CLA part is pretty unavoidable.
3. The contributor creates an issue at bugs.python.org (probably the one piece of infrastructure we all agree is better than the other options, although its workflow could use an update)
I wonder if issue creation from a PR (where no issue # is in the message) could be automated too without a lot of extra work.
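Eric's automation idea looks tractable: a webhook listener only needs to scan the PR message for an issue reference and fall back to creating one. A minimal sketch, where the accepted reference patterns and the `create_issue` callback are assumptions for illustration, not any agreed-upon convention:

```python
import re

# Patterns like "#22524", "issue 22524", or "bpo-22524" in a PR title/body.
# These spellings are a guess at what contributors would naturally write.
ISSUE_RE = re.compile(r'(?:\bissue\s*#?|\bbpo-|#)(\d{3,6})\b', re.IGNORECASE)

def find_issue_number(pr_text):
    """Return the first referenced tracker issue number, or None."""
    m = ISSUE_RE.search(pr_text)
    return int(m.group(1)) if m else None

def issue_for_pr(pr_title, pr_body, create_issue):
    """Find the issue a PR is linked to, creating one (via the supplied
    callback, e.g. a POST to the tracker) when none is mentioned."""
    num = find_issue_number(pr_title) or find_issue_number(pr_body or "")
    if num is None:
        num = create_issue(pr_title)
    return num
```

Passing `create_issue` in as a callback keeps the tracker-specific part (the one unknown here) out of the parsing logic.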
4. If the contributor used Bitbucket or GitHub, they send a pull request with the issue # in the PR message
5. bugs.python.org notices the PR, grabs a patch for it, and puts it on bugs.python.org for code review
6. CI runs on the patch based on what Python versions are specified in the issue tracker, letting everyone know if it applied cleanly, passed tests on the OSs that would be affected, and also got a test coverage report
7. Core developer does a code review
8. Contributor updates their code based on the code review and the updated patch gets pulled by bugs.python.org automatically and CI runs again
9. Once the patch is acceptable and assuming the patch applies cleanly to all versions to commit to, the core developer clicks a "Commit" button, fills in a commit message and NEWS entry, and everything gets committed (if the patch can't apply cleanly then the core developer does it the old-fashioned way, or maybe auto-generate a new PR which can be manually touched up so it does apply cleanly?)
6-9 sounds a lot like PEP 462. :) This seems like the part that would win us the most.
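Step 5 (the tracker grabbing a patch for the PR) may be simpler than it sounds, since GitHub already serves any pull request as a plain-text patch at a predictable URL. A rough sketch; the tracker-upload half is deliberately left out, as the tracker's API is the unknown here:

```python
import urllib.request

def pr_patch_url(owner, repo, number):
    """GitHub exposes every pull request as a plain-text patch at a
    stable URL; Bitbucket offers a comparable diff endpoint."""
    return "https://github.com/%s/%s/pull/%d.patch" % (owner, repo, number)

def fetch_pr_patch(owner, repo, number):
    # bugs.python.org could poll or react to a webhook, then attach the
    # fetched patch to the linked issue for review.
    with urllib.request.urlopen(pr_patch_url(owner, repo, number)) as resp:
        return resp.read().decode("utf-8")
```

With that in place, the same hook that notices the PR could drop the patch straight into the review queue.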
Basically the ideal scenario lets contributors use whatever tools and platforms that they want and provides as much automated support as possible to make sure their code is tip-top before and during code review while core developers can review and commit patches so easily that they can do their job from a beach with a tablet and some WiFi.
Sign me up!
## Where the current proposed solutions seem to fall short

### GitHub/Bitbucket

Basically GitHub/Bitbucket is a win for contributors but doesn't buy core developers that much. GitHub/Bitbucket gives contributors the easy cloning, drive-by patches, CI, and PRs. Core developers get a code review tool -- I'm counting Rietveld as deprecated after Guido's comments about the code's maintenance issues -- and push-button commits **only for single branch changes**. But for any patch that crosses branches we don't really gain anything. At best core developers tell a contributor "please send your PR against 3.4", push-button merge it, update a local clone, merge from 3.4 to default, do the usual stuff, commit, and then push; that still keeps me off the beach, though, so that doesn't get us the whole way.
This will probably be one of the trickiest parts.
You could force people to submit two PRs, but I don't see that flying. Maybe some tool could be written that automatically handles the merge/commit across branches once the initial PR is in? Or automatically create a PR that core developers can touch up as necessary and then accept that as well? Regardless, some solution is necessary to handle branch-crossing PRs.
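The mechanical part of the cross-branch dance in Mercurial is scriptable, so a bot could attempt the conflict-free case automatically and punt to a human otherwise. A sketch under that assumption; the commands mirror the usual 3.4-to-default forward-port, and conflict handling is omitted entirely:

```python
import subprocess

def merge_commands(src_branch, dst_branch, message):
    """The hg commands a bot could run to forward-port a change that
    already landed on src_branch (e.g. "3.4") to dst_branch (e.g.
    "default"). If "hg merge" hits conflicts the run would abort and a
    core developer would finish it by hand."""
    return [
        ["hg", "update", dst_branch],
        ["hg", "merge", src_branch],
        ["hg", "commit", "-m", message],
    ]

def forward_port(src_branch, dst_branch, message, run=subprocess.check_call):
    # 'run' is injectable so the command plan can be tested without a repo.
    for cmd in merge_commands(src_branch, dst_branch, message):
        run(cmd)
```

The auto-generated-PR variant Eric mentions would just replace the `hg commit` step with pushing the merged result to a branch and opening a PR from it.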
As for GitHub vs. Bitbucket, I personally don't care. I like GitHub's interface more, but that's personal taste. I like hg more than git, but that's also personal taste (and I consider a transition from hg to git a hassle but not a deal-breaker but also not a win). It is unfortunate, though, that under this scenario we would have to choose only one platform.
It's also unfortunate that both are closed-source, but that's not a deal-breaker, just a knock against them if the decision is close.
### Our own infrastructure

The shortcoming here is the need for developers, developers, developers! Everything outlined in the ideal scenario is totally doable on our own infrastructure with enough code and time (donated/paid-for infrastructure shouldn't be an issue). But historically that code and time has not materialized. Our code review tool is a fork that probably should be replaced as only Martin von Löwis can maintain it. Basically Ezio Melotti maintains the issue tracker's code.
Doing something about those two tools is something to consider. Would it be out of scope for this discussion or any resulting PEPs? I have opinions here, but I'd rather not sidetrack the discussion.
We don't exactly have a ton of people constantly going "I'm so bored because everything for Python's development infrastructure gets sorted so quickly!" A perfect example is that R. David Murray came up with a nice update for our workflow after PyCon but then ran out of time after mostly defining it and nothing ever became of it (maybe we can rectify that at PyCon?). Eric Snow has pointed out how he has written similar code for pulling PRs from I think GitHub to another code review tool, but that doesn't magically make it work in our infrastructure or get someone to write it and help maintain it (no offense, Eric).
None taken. I was thinking the same thing when I wrote that. :)
IOW our infrastructure can do anything, but it can't run on hopes and dreams. Commitments from many people to making this happen by a certain deadline will be needed so as to not allow it to drag on forever. People would also have to commit to continued maintenance to make this viable long-term.
# Next steps

I'm thinking first draft PEPs by February 1 to know who's all-in (8 weeks away), all details worked out in final PEPs and whatever is required to prove to me it will work by the PyCon language summit (4 months away). I make a decision by May 1, and then implementation aims to be done by the time 3.5.0 is cut so we can switch over shortly thereafter (9 months away). Sound like a reasonable timeline?
Sounds reasonable to me, but I don't have plans to champion a PEP. :) I could probably help with the tooling between GitHub/Bitbucket though.
-eric
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/rajshorya%40gmail.com
On Dec 5, 2014 4:18 PM, "Eric Snow" <ericsnowcurrently@gmail.com> wrote:
Sounds reasonable to me, but I don't have plans to champion a PEP. :) I could probably help with the tooling between GitHub/Bitbucket though.
I have extensive experience with the GitHub API and some with BitBucket. I'm willing to help out with the tooling as well.
On Fri, 05 Dec 2014 15:17:35 -0700, Eric Snow <ericsnowcurrently@gmail.com> wrote:
On Fri, Dec 5, 2014 at 1:04 PM, Brett Cannon <bcannon@gmail.com> wrote:
We don't exactly have a ton of people constantly going "I'm so bored because everything for Python's development infrastructure gets sorted so quickly!" A perfect example is that R. David Murray came up with a nice update for our workflow after PyCon but then ran out of time after mostly defining it and nothing ever became of it (maybe we can rectify that at PyCon?). Eric Snow has pointed out how he has written similar code for pulling PRs from I think GitHub to another code review tool, but that doesn't magically make it work in our infrastructure or get someone to write it and help maintain it (no offense, Eric).
None taken. I was thinking the same thing when I wrote that. :)
IOW our infrastructure can do anything, but it can't run on hopes and dreams. Commitments from many people to making this happen by a certain deadline will be needed so as to not allow it to drag on forever. People would also have to commit to continued maintenance to make this viable long-term.
The biggest blocker to my actually working the proposal I made was that people wanted to see it in action first, which means I needed to spin up a test instance of the tracker and do the work there. That barrier to getting started was enough to keep me from getting started...even though the barrier isn't *that* high (I've done it before, and it is easier now than it was when I first did it), it is still a *lot* higher than checking out CPython and working on a patch.

That's probably the biggest issue with *anyone* contributing to tracker maintenance, and if we could solve that, I think we could get more people interested in helping maintain it. We need the equivalent of dev-in-a-box for setting up for testing proposed changes to bugs.python.org, but including some standard way to get it deployed so others can look at a live system running the change in order to review the patch.

Maybe our infrastructure folks will have a thought or two about this? I'm willing to put some work into this if we can figure out what direction to head in. It could well be tied in to moving bugs.python.org in with the rest of our infrastructure, something I know Donald has been noodling with off and on; and I'm willing to help with that as well.

It sounds like being able to propose and test changes to our Roundup instance (and test other services talking to Roundup, before deploying them for real) is going to be critical to improving our workflow no matter what other decisions are made, so we need to make it easier to do.

In other words, it seems like the key to improving the productivity of our CPython patch workflow is to improve the productivity of the patch workflow for our key workflow resource, bugs.python.org.

--David
On Dec 5, 2014, at 8:26 PM, R. David Murray <rdmurray@bitdance.com> wrote:
Maybe our infrastructure folks will have a thought or two about this? I'm willing to put some work into this if we can figure out what direction to head in. It could well be tied in to moving bugs.python.org in with the rest of our infrastructure, something I know Donald has been noodling with off and on; and I'm willing to help with that as well.
Theoretically you could create a dev environment with the psf-salt stuff once it’s actually done. It won’t be the most efficient use of your computer resources because it’d expect to run several vagrant VMs locally but it would also match “production” (in a salt-ified world) better. It wouldn’t be as good as a dedicated dev setup for it, but it would probably be better than a sort of “yea here’s a bunch of steps that sort of get you close YOLO”.
--- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
On 6 December 2014 at 11:39, Donald Stufft <donald@stufft.io> wrote:
Theoretically you could create a dev environment with the psf-salt stuff once it’s actually done. It won’t be the most efficient use of your computer resources because it’d expect to run several vagrant VMs locally but it would also match “production” (in a salt-ified world) better. It wouldn’t be as good as a dedicated dev setup for it, but it would probably be better than a sort of “yea here’s a bunch of steps that sort of get you close YOLO”.
For demonstrating UI changes, either a single-VM Vagrant setup specifically for testing, or else something that works in the free tier of a public PaaS, may be a better option. The advantage of those two approaches is that they'd be potentially acceptable as contributions to the upstream Roundup project, rather than needing to be CPython specific.

Cheers,
Nick

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Fri Dec 05 2014 at 8:31:27 PM R. David Murray <rdmurray@bitdance.com> wrote:
That's probably the biggest issue with *anyone* contributing to tracker maintenance, and if we could solve that, I think we could get more people interested in helping maintain it. We need the equivalent of dev-in-a-box for setting up for testing proposed changes to bugs.python.org, but including some standard way to get it deployed so others can look at a live system running the change in order to review the patch.
Maybe it's just me and all the Docker/Rocket hoopla that's occurred over the past week, but this just screams "container" to me which would make getting a test instance set up dead simple.
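Since Roundup itself is pip-installable, a throwaway tracker container doesn't seem far-fetched. A hypothetical Dockerfile sketch -- the base image, tracker checkout path, and port are all assumptions, not the real bugs.python.org setup:

```dockerfile
# Hypothetical throwaway instance of a Roundup tracker for reviewing
# proposed changes; nothing here reflects the actual b.p.o deployment.
FROM python:2.7
RUN pip install roundup
# A checkout of the tracker configuration under review:
COPY tracker/ /tracker/
EXPOSE 8080
CMD ["roundup-server", "-p", "8080", "bugs=/tracker"]
```

Spinning one of these up per proposed change would give reviewers the live instance David asked for, then it gets thrown away.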
In other words, it seems like the key to improving the productivity of our CPython patch workflow is to improve the productivity of the patch workflow for our key workflow resource, bugs.python.org.
Quite possible and since no one is suggesting we drop bugs.python.org it's a worthy goal to have regardless of what PEP gets accepted.
On Dec 6, 2014, at 9:11 AM, Brett Cannon <brett@python.org> wrote:
On Fri Dec 05 2014 at 8:31:27 PM R. David Murray <rdmurray@bitdance.com <mailto:rdmurray@bitdance.com>> wrote: On Fri, 05 Dec 2014 15:17:35 -0700, Eric Snow <ericsnowcurrently@gmail.com <mailto:ericsnowcurrently@gmail.com>> wrote:
On Fri, Dec 5, 2014 at 1:04 PM, Brett Cannon <bcannon@gmail.com <mailto:bcannon@gmail.com>> wrote:
We don't exactly have a ton of people constantly going "I'm so bored because everything for Python's development infrastructure gets sorted so quickly!" A perfect example is that R. David Murray came up with a nice update for our workflow after PyCon but then ran out of time after mostly defining it and nothing ever became of it (maybe we can rectify that at PyCon?). Eric Snow has pointed out how he has written similar code for pulling PRs from I think GitHub to another code review tool, but that doesn't magically make it work in our infrastructure or get someone to write it and help maintain it (no offense, Eric).
None taken. I was thinking the same thing when I wrote that. :)
IOW our infrastructure can do anything, but it can't run on hopes and dreams. Commitments from many people to making this happen by a certain deadline will be needed so as to not allow it to drag on forever. People would also have to commit to continued maintenance to make this viable long-term.
The biggest blocker to my actually working the proposal I made was that people wanted to see it in action first, which means I needed to spin up a test instance of the tracker and do the work there. That barrier to getting started was enough to keep me from getting started...even though the barrier isn't *that* high (I've done it before, and it is easier now than it was when I first did it), it is still a *lot* higher than checking out CPython and working on a patch.
That's probably the biggest issue with *anyone* contributing to tracker maintenance, and if we could solve that, I think we could get more people interested in helping maintain it. We need the equivalent of dev-in-a-box for setting up for testing proposed changes to bugs.python.org, but including some standard way to get it deployed so others can look at a live system running the change in order to review the patch.
Maybe it's just me and all the Docker/Rocket hoopla that's occurred over the past week, but this just screams "container" to me which would make getting a test instance set up dead simple.
Heh, one of my thoughts on deploying the bug tracker into production was via a container, especially since we have multiple instances of it. I got side tracked on getting the rest of the infrastructure readier for a web application and some improvements there as well as getting a big postgresql database cluster set up (2x 15GB RAM servers running in Primary/Replica mode). The downside of course to this is that afaik Docker is a lot harder to use on Windows and to some degree OS X than linux. However if the tracker could be deployed as a docker image that would make the infrastructure side a ton easier. I also have control over the python/ organization on Docker Hub too for whatever uses we have for it.

Unrelated to the tracker:

Something that any PEP should consider is security, particularly that of running the tests. Currently we have a buildbot fleet that checks out the code and executes the test suite (aka code). A problem that any pre-merge test runner needs to solve is that unlike a post-merge runner, which will only run code that has been committed by a committer, a pre-merge runner will run code that _anybody_ has submitted. This means that it’s not merely enough to simply trigger a build in our buildbot fleet prior to the merge happening as that would allow anyone to execute arbitrary code there. As far as I’m aware there are two solutions to this problem in common use, either use throw away environments/machines/containers that isolate the running code and then get destroyed after each test run, or don’t run the pre-merge tests immediately unless it’s from a “trusted” person and for “untrusted” or “unknown” people require a “trusted” person to give the OK for each test run.

The throw away machine solution is obviously much nicer experience for the “untrusted” or “unknown” users since they don’t require any intervention to get their tests run which means that they can see if their tests pass, fix things, and then see if that fixes it much quicker.
The obvious downside here is that it’s more effort to do that and the availability of throw away environments for all the systems we support. Linux, most (all?) of the BSDs, and Windows are pretty easy here since there are cloud offerings for them that can be used to spin up a temporary environment, run tests, and then delete it. OS X is a problem because afaik you can only virtualize OS X on Apple hardware and I’m not aware of any cloud provider that offers metered access to OS X hosts. The more esoteric systems like AIX and what not are likely an even bigger problem in this regard since I’m unsure of the ability to get virtualized instances of these at all. It may be possible to build our own images of these on a cloud provider assuming that their licenses allow that.

The other solution would work easier with our current buildbot fleet since you’d just tell it to run some tests but you’d wait until a “trusted” person gave the OK before you did that.

A likely solution is to use a pre-merge test runner for the systems that we can isolate which will give a decent indication if the tests are going to pass across the entire supported matrix or not and then continue to use the current post-merge test runner to handle testing the esoteric systems that we can’t work into the pre-merge testing.

---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
On Sat Dec 06 2014 at 10:07:50 AM Donald Stufft <donald@stufft.io> wrote:
On Dec 6, 2014, at 9:11 AM, Brett Cannon <brett@python.org> wrote:
On Fri Dec 05 2014 at 8:31:27 PM R. David Murray <rdmurray@bitdance.com> wrote:
On Fri, Dec 5, 2014 at 1:04 PM, Brett Cannon <bcannon@gmail.com> wrote:
We don't exactly have a ton of people constantly going "I'm so bored because everything for Python's development infrastructure gets sorted so quickly!" A perfect example is that R. David Murray came up with a nice update for our workflow after PyCon but then ran out of time after mostly defining it and nothing ever became of it (maybe we can rectify that at PyCon?). Eric Snow has pointed out how he has written similar code for pulling PRs from I think GitHub to another code review tool, but that doesn't magically make it work in our infrastructure or get someone to write it and help maintain it (no offense, Eric).
On Fri, 05 Dec 2014 15:17:35 -0700, Eric Snow <ericsnowcurrently@gmail.com> wrote:
None taken. I was thinking the same thing when I wrote that. :)
IOW our infrastructure can do anything, but it can't run on hopes and dreams. Commitments from many people to making this happen by a certain deadline will be needed so as to not allow it to drag on forever. People would also have to commit to continued maintenance to make this viable long-term.
The biggest blocker to my actually working the proposal I made was that people wanted to see it in action first, which means I needed to spin up a test instance of the tracker and do the work there. That barrier to getting started was enough to keep me from getting started...even though the barrier isn't *that* high (I've done it before, and it is easier now than it was when I first did it), it is still a *lot* higher than checking out CPython and working on a patch.
That's probably the biggest issue with *anyone* contributing to tracker maintenance, and if we could solve that, I think we could get more people interested in helping maintain it. We need the equivalent of dev-in-a-box for setting up for testing proposed changes to bugs.python.org, but including some standard way to get it deployed so others can look at a live system running the change in order to review the patch.
Maybe it's just me and all the Docker/Rocket hoopla that's occurred over the past week, but this just screams "container" to me which would make getting a test instance set up dead simple.
Heh, one of my thoughts on deploying the bug tracker into production was via a container, especially since we have multiple instances of it. I got side tracked on getting the rest of the infrastructure readier for a web application and some improvements there as well as getting a big postgresql database cluster set up (2x 15GB RAM servers running in Primary/Replica mode). The downside of course to this is that afaik Docker is a lot harder to use on Windows and to some degree OS X than linux. However if the tracker could be deployed as a docker image that would make the infrastructure side a ton easier. I also have control over the python/ organization on Docker Hub too for whatever uses we have for it.
I think it's something worth thinking about, but like you I don't know if the containers work on OS X or Windows (I don't work with containers personally).
Unrelated to the tracker:
Something that any PEP should consider is security, particularly that of running the tests. Currently we have a buildbot fleet that checks out the code and executes the test suite (aka code). A problem that any pre-merge test runner needs to solve is that unlike a post-merge runner, which will only run code that has been committed by a committer, a pre-merge runner will run code that _anybody_ has submitted. This means that it’s not merely enough to simply trigger a build in our buildbot fleet prior to the merge happening as that would allow anyone to execute arbitrary code there. As far as I’m aware there are two solutions to this problem in common use, either use throw away environments/machines/containers that isolate the running code and then get destroyed after each test run, or don’t run the pre-merge tests immediately unless it’s from a “trusted” person and for “untrusted” or “unknown” people require a “trusted” person to give the OK for each test run.
The throw away machine solution is obviously much nicer experience for the “untrusted” or “unknown” users since they don’t require any intervention to get their tests run which means that they can see if their tests pass, fix things, and then see if that fixes it much quicker. The obvious downside here is that it’s more effort to do that and the availability of throw away environments for all the systems we support. Linux, most (all?) of the BSDs, and Windows are pretty easy here since there are cloud offerings for them that can be used to spin up a temporary environment, run tests, and then delete it. OS X is a problem because afaik you can only virtualize OS X on Apple hardware and I’m not aware of any cloud provider that offers metered access to OS X hosts. The more esoteric systems like AIX and what not are likely an even bigger problem in this regard since I’m unsure of the ability to get virtualized instances of these at all. It may be possible to build our own images of these on a cloud provider assuming that their licenses allow that.
The other solution would work easier with our current buildbot fleet since you’d just tell it to run some tests but you’d wait until a “trusted” person gave the OK before you did that.
A likely solution is to use a pre-merge test runner for the systems that we can isolate which will give a decent indication if the tests are going to pass across the entire supported matrix or not and then continue to use the current post-merge test runner to handle testing the esoteric systems that we can’t work into the pre-merge testing.
Security is definitely something to consider, and what you mentioned above is all reasonable for CI of submitted patches. This is also a reason to consider CI services like Travis, Codeship, Drone, etc., as they are already set up for this kind of thing: we could simply use them for the pre-commit checks and then rely on the buildbots for post-commit verification that we didn't break some specific platform.
On Sat, 06 Dec 2014 15:21:46 +0000, Brett Cannon <brett@python.org> wrote:
On Sat Dec 06 2014 at 10:07:50 AM Donald Stufft <donald@stufft.io> wrote:
On Dec 6, 2014, at 9:11 AM, Brett Cannon <brett@python.org> wrote:
On Fri Dec 05 2014 at 8:31:27 PM R. David Murray <rdmurray@bitdance.com> wrote:
That's probably the biggest issue with *anyone* contributing to tracker maintenance, and if we could solve that, I think we could get more people interested in helping maintain it. We need the equivalent of dev-in-a-box for setting up for testing proposed changes to bugs.python.org, but including some standard way to get it deployed so others can look at a live system running the change in order to review the patch.
Maybe it's just me and all the Docker/Rocket hoopla that's occurred over the past week, but this just screams "container" to me which would make getting a test instance set up dead simple.
Heh, one of my thoughts on deploying the bug tracker into production was via a container, especially since we have multiple instances of it. I got side tracked on getting the rest of the infrastructure readier for a web application and some improvements there as well as getting a big postgresql database cluster set up (2x 15GB RAM servers running in Primary/Replica mode). The downside of course to this is that afaik Docker is a lot harder to use on Windows and to some degree OS X than linux. However if the tracker could be deployed as a docker image that would make the infrastructure side a ton easier. I also have control over the python/ organization on Docker Hub too for whatever uses we have for it.
I think it's something worth thinking about, but like you I don't know if the containers work on OS X or Windows (I don't work with containers personally).
(Had to fix the quoting there, somebody's email program got it wrong.)

For the tracker, being unable to run a test instance on Windows would likely not be a severe limitation. Given how few Windows people we get making contributions to CPython, I'd really rather encourage them to work there, rather than on the tracker. OS/X is a bit more problematic, but it sounds like it is also a bit more doable.

On the other hand, what's the overhead on setting up to use Docker? If that task is non-trivial, we're back to having a higher barrier to entry than running a dev-in-a-box script...

Note also in thinking about setting up a test tracker instance we have an additional concern: it requires postgres, and needs either a copy of the full data set (which includes account data/passwords which would need to be creatively sanitized) or a fairly large test data set. I'd prefer a sanitized copy of the real data.

--David
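The "creatively sanitized" dump mentioned above could look something like this minimal sketch. The row layout and field names here are hypothetical; a real Roundup dump would need schema-specific handling.

```python
# Hypothetical sketch of sanitizing tracker account rows before sharing
# a data dump for test instances: keep ids stable so issue references
# survive, but replace anything identifying or secret.
import hashlib


def sanitize_user(user):
    """Replace account data with stable but non-identifying values."""
    token = hashlib.sha256(user["email"].encode("utf-8")).hexdigest()[:8]
    return {
        "id": user["id"],                       # keep issue references intact
        "username": "user_%s" % token,          # stable pseudonym
        "email": "user_%s@example.com" % token,
        "password_hash": "!disabled!",          # never ship real credentials
    }


row = {"id": 42, "username": "brett", "email": "brett@python.org",
       "password_hash": "pbkdf2$..."}
clean = sanitize_user(row)
assert clean["id"] == 42
assert clean["password_hash"] == "!disabled!"
assert "brett" not in clean["email"]
```

Deriving the pseudonym from a hash keeps the mapping deterministic, so relationships between users across tables stay consistent without keeping a lookup table of real identities.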
On 7 December 2014 at 02:11, R. David Murray <rdmurray@bitdance.com> wrote:
For the tracker, being unable to run a test instance on Windows would likely not be a severe limitation. Given how few Windows people we get making contributions to CPython, I'd really rather encourage them to work there, rather than on the tracker. OS/X is a bit more problematic, but it sounds like it is also a bit more doable.
On the other hand, what's the overhead on setting up to use Docker? If that task is non-trivial, we're back to having a higher barrier to entry than running a dev-in-a-box script...
Note also in thinking about setting up a test tracker instance we have an additional concern: it requires postgres, and needs either a copy of the full data set (which includes account data/passwords which would need to be creatively sanitized) or a fairly large test data set. I'd prefer a sanitized copy of the real data.
If you're OK with git as an entry requirement, then something like the OpenShift free tier may be a better place for test instances, rather than local hosting - with an appropriate quickstart, creating your own tracker instance can be a single click operation on a normal hyperlink. That also has the advantage of making it easy to share changes to demonstrate UI updates.

(OpenShift doesn't support running containers directly yet, but that capability is being worked on in the upstream OpenShift Origin open source project)

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Sat, Dec 6, 2014 at 10:11 AM, R. David Murray <rdmurray@bitdance.com> wrote:
On Sat Dec 06 2014 at 10:07:50 AM Donald Stufft <donald@stufft.io> wrote:
On Dec 6, 2014, at 9:11 AM, Brett Cannon <brett@python.org> wrote:
On Fri Dec 05 2014 at 8:31:27 PM R. David Murray < rdmurray@bitdance.com> wrote:
That's probably the biggest issue with *anyone* contributing to tracker maintenance, and if we could solve that, I think we could get more people interested in helping maintain it. We need the equivalent of dev-in-a-box for setting up for testing proposed changes to bugs.python.org, but including some standard way to get it deployed so others can look at a live system running the change in order to review the patch.
Maybe it's just me and all the Docker/Rocket hoopla that's occurred over the past week, but this just screams "container" to me which would make getting a test instance set up dead simple.
Heh, one of my thoughts on deploying the bug tracker into production was via a container, especially since we have multiple instances of it. I got side tracked on getting the rest of the infrastructure readier for a web application and some improvements there as well as getting a big postgresql database cluster set up (2x 15GB RAM servers running in Primary/Replica mode). The downside of course to this is that afaik Docker is a lot harder to use on Windows and to some degree OS X than linux. However if the tracker could be deployed as a docker image that would make the infrastructure side a ton easier. I also have control over the python/ organization on Docker Hub too for whatever uses we have for it.
On Sat, 06 Dec 2014 15:21:46 +0000, Brett Cannon <brett@python.org> wrote:
I think it's something worth thinking about, but like you I don't know if the containers work on OS X or Windows (I don't work with containers personally).
(Had to fix the quoting there, somebody's email program got it wrong.)
For the tracker, being unable to run a test instance on Windows would likely not be a severe limitation. Given how few Windows people we get making contributions to CPython, I'd really rather encourage them to work there, rather than on the tracker. OS/X is a bit more problematic, but it sounds like it is also a bit more doable.
On the other hand, what's the overhead on setting up to use Docker? If that task is non-trivial, we're back to having a higher barrier to entry than running a dev-in-a-box script...
Note also in thinking about setting up a test tracker instance we have an additional concern: it requires postgres, and needs either a copy of the full data set (which includes account data/passwords which would need to be creatively sanitized) or a fairly large test data set. I'd prefer a sanitized copy of the real data.
FactoryBoy would make generating issue tracker test fixtures fairly simple: http://factoryboy.readthedocs.org/en/latest/introduction.html#lazyattribute

There are probably lots of instances of free-form usernames in issue tickets, which some people may or may not be comfortable with, considering that the data is and has always been public.
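For illustration, here is a pure-stdlib sketch of the lazy-attribute fixture pattern that FactoryBoy provides: fields whose defaults are computed from other fields at build time. This deliberately does not use FactoryBoy's actual API; `make_issue` and its fields are invented for this example.

```python
# Minimal stand-in for the fixture-factory idea: each call builds a
# plausible tracker issue, with a derived field ("nosy") computed from
# another field, and any default overridable per test.
import itertools

_counter = itertools.count(1)


def make_issue(**overrides):
    """Build a test-fixture issue with sequential id and derived fields."""
    n = next(_counter)
    issue = {
        "id": n,
        "title": "Test issue %d" % n,
        "creator": "user%d" % n,
    }
    # Lazily derived attribute: depends on the creator field above.
    issue["nosy"] = [issue["creator"]]
    issue.update(overrides)
    return issue


first = make_issue()
second = make_issue(title="Crash in Lib/test")
assert first["id"] != second["id"]
assert first["nosy"] == [first["creator"]]
assert second["title"] == "Crash in Lib/test"
```

A real FactoryBoy factory would express the same thing declaratively with `factory.Sequence` and `factory.LazyAttribute`, which is why it would suit generating a large synthetic tracker data set.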
On 7 December 2014 at 01:07, Donald Stufft <donald@stufft.io> wrote:
A likely solution is to use a pre-merge test runner for the systems that we can isolate which will give a decent indication if the tests are going to pass across the entire supported matrix or not and then continue to use the current post-merge test runner to handle testing the esoteric systems that we can’t work into the pre-merge testing.
Yep, that's exactly the approach I had in mind for this problem.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Dec 6, 2014, at 10:26 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 7 December 2014 at 01:07, Donald Stufft <donald@stufft.io> wrote:
A likely solution is to use a pre-merge test runner for the systems that we can isolate which will give a decent indication if the tests are going to pass across the entire supported matrix or not and then continue to use the current post-merge test runner to handle testing the esoteric systems that we can’t work into the pre-merge testing.
Yep, that's exactly the approach I had in mind for this problem.
I’m coming around to the idea for pip too, though I’ve been trying to figure out a way to do pre-merge testing using isolated environments for even the esoteric platforms. One thing that I’d personally greatly appreciate is if this whole process made it possible for selected external projects to re-use the infrastructure for the harder to get platforms. Pip and setuptools in particular would make good candidates for this I think.

---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
On 12/6/2014 10:26 AM, Nick Coghlan wrote:
On 7 December 2014 at 01:07, Donald Stufft <donald@stufft.io> wrote:
A likely solution is to use a pre-merge test runner for the systems that we can isolate which will give a decent indication if the tests are going to pass across the entire supported matrix or not and then continue to use the current post-merge test runner to handle testing the esoteric systems that we can’t work into the pre-merge testing.
Yep, that's exactly the approach I had in mind for this problem.
Most patches are tested on just one (major) system before being committed. The buildbots confirm that there is no oddball failure elsewhere, and there usually is not. Testing user submissions on one system should usually be enough. Committers should generally have an idea when wider testing is needed, and indeed it would be nice to be able to get wider testing on occasion *before* making a commit, without begging on the tracker.

What would be *REALLY* helpful for Idle development (and tkinter, turtle, and turtle demo testing) would be if there were a test.support.screenshot function that would take a screenshot and email it to the tracker or developer. There would also need to be at least one (stable) *nix test machine that actually runs tkinter code, and the ability to test on OSX with its different graphics options. Properly testing Idle tkinter code that affects what users see is a real bottleneck.

--
Terry Jan Reedy
On Sat, Dec 6, 2014 at 9:07 AM, Donald Stufft <donald@stufft.io> wrote:
Heh, one of my thoughts on deploying the bug tracker into production was via a container, especially since we have multiple instances of it. I got side tracked on getting the rest of the infrastructure readier for a web application and some improvements there as well as getting a big postgresql database cluster set up (2x 15GB RAM servers running in Primary/Replica mode). The downside of course to this is that afaik Docker is a lot harder to use on Windows and to some degree OS X than linux. However if the tracker could be deployed as a docker image that would make the infrastructure side a ton easier. I also have control over the python/ organization on Docker Hub too for whatever uses we have for it.
Are you referring to https://registry.hub.docker.com/repos/python/ ?

IPython / Jupyter have some useful Docker images:

* https://registry.hub.docker.com/repos/ipython/
* https://registry.hub.docker.com/repos/jupyter/

CI integration with roundup seems to be the major gap here:

* https://wiki.jenkins-ci.org/display/JENKINS/Docker+Plugin
* https://wiki.jenkins-ci.org/display/JENKINS/saltstack-plugin
* https://github.com/saltstack-formulas/docker-formula
Unrelated to the tracker:
Something that any PEP should consider is security, particularly that of running the tests. Currently we have a buildbot fleet that checks out the code and executes the test suite (aka code). A problem that any pre-merge test runner needs to solve is that unlike a post-merge runner, which will only run code that has been committed by a committer, a pre-merge runner will run code that _anybody_ has submitted. This means that it’s not merely enough to simply trigger a build in our buildbot fleet prior to the merge happening as that would allow anyone to execute arbitrary code there. As far as I’m aware there are two solutions to this problem in common use, either use throw away environments/machines/containers that isolate the running code and then get destroyed after each test run, or don’t run the pre-merge tests immediately unless it’s from a “trusted” person and for “untrusted” or “unknown” people require a “trusted” person to give the OK for each test run.
The throw away machine solution is obviously much nicer experience for the “untrusted” or “unknown” users since they don’t require any intervention to get their tests run which means that they can see if their tests pass, fix things, and then see if that fixes it much quicker. The obvious downside here is that it’s more effort to do that and the availability of throw away environments for all the systems we support. Linux, most (all?) of the BSDs, and Windows are pretty easy here since there are cloud offerings for them that can be used to spin up a temporary environment, run tests, and then delete it. OS X is a problem because afaik you can only virtualize OS X on Apple hardware and I’m not aware of any cloud provider that offers metered access to OS X hosts. The more esoteric systems like AIX and what not are likely an even bigger problem in this regard since I’m unsure of the ability to get virtualized instances of these at all. It may be possible to build our own images of these on a cloud provider assuming that their licenses allow that.
The other solution would work easier with our current buildbot fleet since you’d just tell it to run some tests but you’d wait until a “trusted” person gave the OK before you did that.
A likely solution is to use a pre-merge test runner for the systems that we can isolate which will give a decent indication if the tests are going to pass across the entire supported matrix or not and then continue to use the current post-merge test runner to handle testing the esoteric systems that we can’t work into the pre-merge testing.
---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
On Sat, Dec 6, 2014 at 7:27 PM, Wes Turner <wes.turner@gmail.com> wrote:
On Sat, Dec 6, 2014 at 9:07 AM, Donald Stufft <donald@stufft.io> wrote:
Heh, one of my thoughts on deploying the bug tracker into production was via a container, especially since we have multiple instances of it. I got side tracked on getting the rest of the infrastructure readier for a web application and some improvements there as well as getting a big postgresql database cluster set up (2x 15GB RAM servers running in Primary/Replica mode). The downside of course to this is that afaik Docker is a lot harder to use on Windows and to some degree OS X than linux. However if the tracker could be deployed as a docker image that would make the infrastructure side a ton easier. I also have control over the python/ organization on Docker Hub too for whatever uses we have for it.
Are you referring to https://registry.hub.docker.com/repos/python/ ?
IPython / Jupyter have some useful Docker images:
* https://registry.hub.docker.com/repos/ipython/ * https://registry.hub.docker.com/repos/jupyter/
CI integration with roundup seems to be the major gap here:
* https://wiki.jenkins-ci.org/display/JENKINS/Docker+Plugin * https://wiki.jenkins-ci.org/display/JENKINS/saltstack-plugin * https://github.com/saltstack-formulas/docker-formula
ShiningPanda supports virtualenv and tox, but I don't know how well suited it would be for fail-fast CPython testing across a grid/graph:

* https://wiki.jenkins-ci.org/display/JENKINS/ShiningPanda+Plugin
* https://wiki.jenkins-ci.org/display/JENKINS/Matrix+Project+Plugin

The branch merging workflows of https://datasift.github.io/gitflow/IntroducingGitFlow.html (hotfix/name, feature/name, release/name) are surely portable across VCS systems.
On 7 December 2014 at 00:11, Brett Cannon <brett@python.org> wrote:
On Fri Dec 05 2014 at 8:31:27 PM R. David Murray <rdmurray@bitdance.com> wrote:
That's probably the biggest issue with *anyone* contributing to tracker maintenance, and if we could solve that, I think we could get more people interested in helping maintain it. We need the equivalent of dev-in-a-box for setting up for testing proposed changes to bugs.python.org, but including some standard way to get it deployed so others can look at a live system running the change in order to review the patch.
Maybe it's just me and all the Docker/Rocket hoopla that's occurred over the past week, but this just screams "container" to me which would make getting a test instance set up dead simple.
It's not just you (and Graham Dumpleton has even been working on reference images for Apache/mod_wsgi hosting of Python web services: http://blog.dscpl.com.au/2014/12/hosting-python-wsgi-applications-using.html)

You still end up with Vagrant as a required element for Windows and Mac OS X, but that's pretty much a given for a lot of web service development these days.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Sat Dec 06 2014 at 10:30:54 AM Nick Coghlan <ncoghlan@gmail.com> wrote:
On Fri Dec 05 2014 at 8:31:27 PM R. David Murray <rdmurray@bitdance.com> wrote:
That's probably the biggest issue with *anyone* contributing to tracker maintenance, and if we could solve that, I think we could get more people interested in helping maintain it. We need the equivalent of dev-in-a-box for setting up for testing proposed changes to bugs.python.org, but including some standard way to get it deployed so others can look at a live system running the change in order to review the patch.
On 7 December 2014 at 00:11, Brett Cannon <brett@python.org> wrote:
Maybe it's just me and all the Docker/Rocket hoopla that's occurred over the past week, but this just screams "container" to me which would make getting a test instance set up dead simple.
It's not just you (and Graham Dumpleton has even been working on reference images for Apache/mod_wsgi hosting of Python web services: http://blog.dscpl.com.au/2014/12/hosting-python-wsgi-applications-using.html)
You still end up with Vagrant as a required element for Windows and Mac OS X, but that's pretty much a given for a lot of web service development these days.
If we need a testbed then we could try it out with a devinabox and see how it works with new contributors at PyCon. Would be nice to just have Clang, all the extras for the stdlib, etc. already pulled together for people to work from.
On Fri Dec 05 2014 at 5:17:35 PM Eric Snow <ericsnowcurrently@gmail.com> wrote:
Very nice, Brett.
Thanks!
On Fri, Dec 5, 2014 at 1:04 PM, Brett Cannon <bcannon@gmail.com> wrote:
And we can't forget the people who help keep all of this running as well. There are those that manage the SSH keys, the issue tracker, the review tool, hg.python.org, and the email system that lets us know when stuff happens on any of these other systems. The impact on them needs to also be considered.
It sounds like Guido would rather have as much of this as possible done by a provider rather than relying on volunteers. That makes sense, though there are concerns about control of certain assets. However, that applies only to some of them, like hg.python.org.
Sure, but that's also the reason Guido stuck me with the job of being the Great Decider on this. =) I have a gut feeling of how much support would need to be committed in order to consider things covered well enough (I can't give a number because it will vary depending on who steps forward; someone who I know and trust to stick around is worth more than someone who kindly steps forward and has never volunteered, but that's just because I don't know the stranger and not because I don't want people who are unknown on python-dev to step forward innately).
## Contributors

I see two scenarios for contributors to optimize for. There's the simple spelling mistake patches and then there's the code change patches. The former is the kind of thing that you can do in a browser without much effort and should be a no-brainer commit/reject decision for a core developer. This is what the GitHub/Bitbucket camps have been promoting their solution for solving while leaving the cpython repo alone. Unfortunately the bulk of our documentation is in the Doc/ directory of cpython. While it's nice to think about moving the devguide, peps, and even breaking out the tutorial to repos hosted on Bitbucket/GitHub, everything else is in Doc/ (language reference, howtos, stdlib, C API, etc.). So unless we want to completely break all of Doc/ out of the cpython repo and have core developers willing to edit two separate repos when making changes that impact code **and** docs, moving only a subset of docs feels like a band-aid solution that ignores the big, white elephant in the room: the cpython repo, where the bulk of patches are targeting.
With your ideal scenario this would be a moot point, right? There would be no need to split out doc-related repos.
Exactly, which is why I stressed we can't simply ignore the cpython repo. If someone is bored they could run an analysis on the various repos, calculate the number of contributions from outsiders -- maybe check the logs for the use of the word "Thank" since we typically say "Thanks to ..." -- and see how many external contributions we got in all the repos and also a detailed breakdown for Doc/.
For the code change patches, contributors need an easy way to get a hold of the code and get their changes to the core developers. After that it's things like letting contributors know that their patch doesn't apply cleanly, doesn't pass tests, etc.
This is probably more work than it seems at first.
Maybe, maybe not. Depends on what external services someone wants to rely on. E.g., could a webhook with some CI company be used so that it's more "grab the patch from here and run the tests" vs. us having to manage the whole CI infrastructure? Just because the home-grown solution requires developers and maintenance doesn't mean that the maintenance is more than maintaining the code to interface with an external service provider instead of providing the service ourselves from scratch. And don't forget companies will quite possibly donate services if you ask or the PSF could pay for some things.
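As a sketch of how thin that glue could be, here is a handler for the JSON body of a pull-request webhook event that just derives the URL of the raw diff for a CI service to fetch and test; the field names mimic GitHub's payload but are assumptions here:

```python
import json

def patch_url_from_pr_event(payload):
    """Given the JSON body of a pull-request webhook event, return the
    URL of the plain-text diff a CI service could download and apply."""
    event = json.loads(payload)
    pr = event["pull_request"]
    # GitHub serves a unified diff for any PR at "<PR page URL>.diff".
    return pr["html_url"] + ".diff"
```

The CI side then only has to fetch that URL, apply the diff to a clean clone, and report back, which is exactly the "grab the patch from here and run the tests" division of labor.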
As of right now getting the patch into the issue tracker is a bit manual but nothing crazy. The real issue in this scenario is core developer response time.
## Core developers

There is a finite amount of time that core developers get to contribute to Python and it fluctuates greatly. This means that if a process can be found which allows core developers to spend less time doing mechanical work and more time doing things that can't be automated -- namely code reviews -- then the throughput of patches being accepted/rejected will increase. This also impacts any increased patch submission rate that comes from improving the situation for contributors because if the throughput doesn't change then there will simply be more patches sitting in the issue tracker and that doesn't benefit anyone.
This is the key concern I have with only addressing the contributor side of things. I'm all for increasing contributions, but not if they are just going to rot on the tracker and we end up with disillusioned contributors.
Yep, which is why I'm saying we need a complete solution to our entire development process.
# My ideal scenario

If I had an infinite amount of resources (money, volunteers, time, etc.), this would be my ideal scenario:
1. Contributor gets code from wherever; easiest to just say "fork on GitHub or Bitbucket" as they would be official mirrors of hg.python.org and are updated after every commit, but they could clone hg.python.org/cpython if they wanted
2. Contributor makes edits; if they cloned on Bitbucket or GitHub then they have browser edit access already
3. Contributor creates an account at bugs.python.org and signs the CLA
There's no real way around this, is there? I suppose account creation *could* be automated relative to a github or bitbucket user, though it probably isn't worth the effort. However, the CLA part is pretty unavoidable.
Account creation is not that heavy. We could make it so that if you create an account from e.g. a GitHub account we extract some of the details using OAuth from GitHub automatically. Once again, it's just a matter of effort.
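A sketch of that OAuth-assisted signup: once the OAuth dance has produced a token, the details from GitHub's `/user` endpoint could be mapped straight onto a new tracker account. The tracker field names below are made up, and the token exchange and actual Roundup account-creation call are omitted:

```python
def tracker_account_from_github(user):
    """Map a decoded GitHub /user JSON object onto the fields a new
    bugs.python.org account would need (field names are hypothetical)."""
    return {
        "username": user["login"],
        # GitHub users may not set a display name or public email.
        "realname": user.get("name") or user["login"],
        "email": user.get("email") or "",
    }
```

The CLA signature would still be a separate, manual step; this only removes the form-filling part of account creation.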
3. The contributor creates an issue at bugs.python.org (probably the one piece of infrastructure we all agree is better than the other options, although its workflow could use an update)
I wonder if issue creation from a PR (where no issue # is in the message) could be automated too without a lot of extra work.
I'm sure it's possible. You can tell me in a PEP. =)
4. If the contributor used Bitbucket or GitHub, they send a pull request with the issue # in the PR message
5. bugs.python.org notices the PR, grabs a patch for it, and puts it on bugs.python.org for code review
6. CI runs on the patch based on what Python versions are specified in the issue tracker, letting everyone know if it applied cleanly, passed tests on the OSs that would be affected, and also got a test coverage report
7. Core developer does a code review
8. Contributor updates their code based on the code review and the updated patch gets pulled by bugs.python.org automatically and CI runs again
9. Once the patch is acceptable and assuming the patch applies cleanly to all versions to commit to, the core developer clicks a "Commit" button, fills in a commit message and NEWS entry, and everything gets committed (if the patch can't apply cleanly then the core developer does it the old-fashioned way, or maybe auto-generate a new PR which can be manually touched up so it does apply cleanly?)
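For step 6, the mapping from the tracker's Versions field to the branches CI should exercise is simple enough to sketch (the branch names reflect the layout at the time of writing; the mapping itself is illustrative):

```python
def branches_to_test(issue_versions):
    """Translate the issue tracker's Versions selections into the hg
    branches a CI run should cover; unknown selections are ignored."""
    mapping = {
        "Python 2.7": "2.7",
        "Python 3.4": "3.4",
        "Python 3.5": "default",  # the in-development branch at the time
    }
    return [mapping[v] for v in issue_versions if v in mapping]
```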
6-9 sounds a lot like PEP 462. :) This seems like the part that would win us the most.
I have stated publicly multiple times that I really wanted Nick's workflow to happen, but since it is dependent on volunteers it didn't materialize. I mean this is also a lot like the GitHub+Travis or Bitbucket+drone.io/Codeship.io workflow most other projects use -- my personal ones included -- and it's great. We just like to complicate things with 18 month release cycles and bugfix releases. =)
Basically the ideal scenario lets contributors use whatever tools and platforms that they want and provides as much automated support as possible to make sure their code is tip-top before and during code review while core developers can review and commit patches so easily that they can do their job from a beach with a tablet and some WiFi.
Sign me up!
Do the PEP and the work and I will! =)
## Where the current proposed solutions seem to fall short

### GitHub/Bitbucket

Basically GitHub/Bitbucket is a win for contributors but doesn't buy core developers that much. GitHub/Bitbucket gives contributors the easy cloning, drive-by patches, CI, and PRs. Core developers get a code review tool -- I'm counting Rietveld as deprecated after Guido's comments about the code's maintenance issues -- and push-button commits **only for single branch changes**. But for any patch that crosses branches we don't really gain anything. At best core developers tell a contributor "please send your PR against 3.4", push-button merge it, update a local clone, merge from 3.4 to default, do the usual stuff, commit, and then push; that still keeps me off the beach, though, so that doesn't get us the whole way.
This will probably be one of the trickiest parts.
Yes, but I know for me personally and I would wager for most other core developers it's the branch merging work that is the biggest blocker from wanting to put the time in to accept a patch. And then on top of that it's simply having access to a checkout (if I could accept simple patches through a browser I could do it on my lunch break at work 5 days a week; heck I would probably make it a personal goal to try and accept a patch a day if it was simply a button press).
You could force people to submit two PRs, but I don't see that flying. Maybe some tool could be written that automatically handles the merge/commit across branches once the initial PR is in? Or automatically create a PR that core developers can touch up as necessary and then accept that as well? Regardless, some solution is necessary to handle branch-crossing PRs.
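Such a tool could start as little more than a planner that, given the changeset accepted on the oldest applicable branch, emits the `hg` commands to graft it forward; a driver would run them in order and hand control back to a core developer the moment a graft reports a conflict. This sketch only builds the command list rather than running anything:

```python
def graft_plan(changeset, later_branches):
    """Return the sequence of hg commands that would forward-port an
    accepted changeset across the remaining branches, oldest first."""
    plan = []
    for branch in later_branches:
        plan.append(["hg", "update", branch])   # switch to the target branch
        plan.append(["hg", "graft", changeset])  # cherry-pick the fix onto it
    return plan
```

Whether `graft` or a true branch merge is the right primitive is itself a policy question, so this is just one possible shape for the automation.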
As for GitHub vs. Bitbucket, I personally don't care. I like GitHub's interface more, but that's personal taste. I like hg more than git, but that's also personal taste (and I consider a transition from hg to git a hassle but not a deal-breaker but also not a win). It is unfortunate, though, that under this scenario we would have to choose only one platform.
It's also unfortunate both are closed-source, but that's not a deal-breaker, just a knock against if the decision is close.
### Our own infrastructure

The shortcoming here is the need for developers, developers, developers! Everything outlined in the ideal scenario is totally doable on our own infrastructure with enough code and time (donated/paid-for infrastructure shouldn't be an issue). But historically that code and time has not materialized. Our code review tool is a fork that probably should be replaced as only Martin von Löwis can maintain it. Basically Ezio Melotti maintains the issue tracker's code.
Doing something about those two tools is something to consider. Would it be out of scope for this discussion or any resulting PEPS? I have opinions here, but I'd rather not sidetrack the discussion.
I would be very happy if someone wrote up a PEP saying "we don't need to do a complete overhaul and toss everything out, we just need to tweak this stuff", or a "here is a fallback PEP to update some things if none of the proposals can solve the cpython problem", so that we basically have a PEP for considering risk mitigation. So think of this PEP as saying "we can switch to X for a review tool, we can add a GitHub/Bitbucket button for pulling from a fork by doing Y, we can use service Z as a CI service without issue through webhooks" but not necessarily worrying about issues created from PRs, etc. that might be a bit tricky; IOW the least drastic PEP that still nabs us some wins.
We don't exactly have a ton of people constantly going "I'm so bored because everything for Python's development infrastructure gets sorted so quickly!" A perfect example is that R. David Murray came up with a nice update for our workflow after PyCon but then ran out of time after mostly defining it and nothing ever became of it (maybe we can rectify that at PyCon?). Eric Snow has pointed out how he has written similar code for pulling PRs from I think GitHub to another code review tool, but that doesn't magically make it work in our infrastructure or get someone to write it and help maintain it (no offense, Eric).
None taken. I was thinking the same thing when I wrote that. :)
IOW our infrastructure can do anything, but it can't run on hopes and dreams. Commitments from many people to making this happen by a certain deadline will be needed so as to not allow it to drag on forever. People would also have to commit to continued maintenance to make this viable long-term.
# Next steps

I'm thinking first draft PEPs by February 1 to know who's all-in (8 weeks away), all details worked out in final PEPs and whatever is required to prove to me it will work by the PyCon language summit (4 months away). I make a decision by May 1, and then implementation aims to be done by the time 3.5.0 is cut so we can switch over shortly thereafter (9 months away). Sound like a reasonable timeline?
Sounds reasonable to me, but I don't have plans to champion a PEP. :) I could probably help with the tooling between GitHub/Bitbucket though.
And Ian Cordasco also said he could help, but I still need a PEP to work from.
Eric Snow <ericsnowcurrently@gmail.com> writes:
There's no real way around this, is there? […] the CLA part is pretty unavoidable.
The PSF presently mandates that any contributor to Python sign the “Contributor Agreement” <URL:http://legacy.python.org/psf/contrib/contrib-form/contributor-agreement.pdf>. This is a unilateral grant from the contributor to the PSF, and is unequal because the PSF does not grant these same powers to the recipients of Python.

I raise this, not to start another disagreement about whether this is desirable; I understand that many within the PSF regard it as an unfortunate barrier to entry, even if it is necessary. Rather, I'm asking what, specifically, necessitates this situation.

What would need to change, for the PSF to accept contributions to the Python copyrighted works, without requiring the contributor to do anything but license the work under Apache 2.0 license? Is it specific code within the Python code base which somehow creates this need? How much, and how would the PSF view work to re-implement that code for contribution under Apache 2.0 license? Is it some other dependency? What, specifically; and what can be done to remove that dependency?

My goal is to see the PSF reach a state where the licensing situation is an equal-footing “inbound = outbound” like most free software projects; where the PSF can happily receive from a contributor only the exact same license the PSF grants to any recipient of Python. For that to happen, we need to know the specific barriers to such a goal. What are they?

--
“A computer once beat me at chess, but it was no match for me at kick boxing.” —Emo Philips

Ben Finney
On 12/08/2014 02:31 PM, Ben Finney wrote:
Eric Snow <ericsnowcurrently@gmail.com> writes:
There's no real way around this, is there? […] the CLA part is pretty unavoidable.
The PSF presently mandates that any contributor to Python sign the “Contributor Agreement” <URL:http://legacy.python.org/psf/contrib/contrib-form/contributor-agreement.pdf>. This is a unilateral grant from the contributor to the PSF, and is unequal because the PSF does not grant these same powers to the recipients of Python.
I raise this, not to start another disagreement about whether this is desirable; I understand that many within the PSF regard it as an unfortunate barrier to entry, even if it is necessary.
Rather, I'm asking what, specifically, necessitates this situation.
What would need to change, for the PSF to accept contributions to the Python copyrighted works, without requiring the contributor to do anything but license the work under Apache 2.0 license?
Is it specific code within the Python code base which somehow creates this need? How much, and how would the PSF view work to re-implement that code for contribution under Apache 2.0 license?
Is it some other dependency? What, specifically; and what can be done to remove that dependency?
My goal is to see the PSF reach a state where the licensing situation is an equal-footing “inbound = outbound” like most free software projects; where the PSF can happily receive from a contributor only the exact same license the PSF grants to any recipient of Python.
For that to happen, we need to know the specific barriers to such a goal. What are they?
Well, this is the wrong mailing list for those questions. Maybe one of these would work instead?

About Python-legal-sig (https://mail.python.org/mailman/listinfo/python-legal-sig)

This list is for the discussion of Python Legal/Compliance issues. Its focus should be on questions regarding compliance, copyrights on core python, etc. Actual legal decisions, legal counsel questions, or alterations to the Contributor License Agreement for Python the language should be sent to psf@python.org. Python/PSF trademark questions should be sent to psf-trademarks@python.org.

Please note: legal decisions affecting the IP, Python license stack, etc. *must* be approved by Python Software Foundation legal counsel and the board of directors: psf@python.org
Ethan Furman <ethan@stoneleaf.us> writes:
Well, this is the wrong mailing list for those questions.
Thanks. I addressed the claim here where it was made; but you're right that a different forum is better for an ongoing discussion about this topic.

Barry Warsaw <barry@python.org> writes:
My understanding is that the PSF needs the ability to relicense the contribution under the standard PSF license, and it is the contributor agreement that gives the PSF the legal right to do this.
Okay, that's been raised before. If anyone can cite other specific dependencies that would necessitate a CLA for Python, please contact me off-list, and/or in the Python legal-sig <URL:https://mail.python.org/mailman/listinfo/python-legal-sig>.
Many organizations, both for- and non-profit have this legal requirement, and there are many avenues for satisfying these needs, mostly based on different legal and business interpretations.
And many do not. It would be good to shift the PSF into the larger set of organisations that do not require a CLA for accepting contributions.

Thanks, all. Sorry to bring the topic up again here.

--
“When I was born I was so surprised I couldn't talk for a year and a half.” —Gracie Allen

Ben Finney
On Dec 09, 2014, at 09:31 AM, Ben Finney wrote:
Rather, I'm asking what, specifically, necessitates this situation.
What would need to change, for the PSF to accept contributions to the Python copyrighted works, without requiring the contributor to do anything but license the work under Apache 2.0 license?
My understanding is that the PSF needs the ability to relicense the contribution under the standard PSF license, and it is the contributor agreement that gives the PSF the legal right to do this. Many organizations, both for- and non-profit have this legal requirement, and there are many avenues for satisfying these needs, mostly based on different legal and business interpretations. In the scheme of such things, and IMHO, the PSF CLA is quite reasonable and lightweight, both in what it requires a contributor to provide, and in the value, rights, and guarantees it extends to the contributor. Cheers, -Barry
On 9 Dec 2014 08:47, "Barry Warsaw" <barry@python.org> wrote:
On Dec 09, 2014, at 09:31 AM, Ben Finney wrote:
Rather, I'm asking what, specifically, necessitates this situation.
What would need to change, for the PSF to accept contributions to the Python copyrighted works, without requiring the contributor to do anything but license the work under Apache 2.0 license?
My understanding is that the PSF needs the ability to relicense the contribution under the standard PSF license, and it is the contributor agreement that gives the PSF the legal right to do this.
This matches my understanding as well. The problem is that the PSF licence itself isn't suitable as "licence in", and changing the "licence out" could have a broad ripple effect on downstream consumers (especially since the early history means "just change the outgoing license to the Apache License" isn't an available option, at least as far as I am aware). A more restricted CLA that limited the PSF's outgoing licence choices to OSI approved open source licenses might address some of the concerns without causing problems elsewhere, but the combination of being both interested in core development and having a philosophical or personal objection to signing the CLA seems to be genuinely rare. Cheers, Nick.
Many organizations, both for- and non-profit have this legal requirement,
and
there are many avenues for satisfying these needs, mostly based on different legal and business interpretations. In the scheme of such things, and IMHO, the PSF CLA is quite reasonable and lightweight, both in what it requires a contributor to provide, and in the value, rights, and guarantees it extends to the contributor.
Cheers,
-Barry

_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com
On Dec 09, 2014, at 07:42 PM, Nick Coghlan wrote:
A more restricted CLA that limited the PSF's outgoing licence choices to OSI approved open source licenses might address some of the concerns without causing problems elsewhere, but the combination of being both interested in core development and having a philosophical or personal objection to signing the CLA seems to be genuinely rare.
The CLA does explicitly say "Contributor understands and agrees that PSF shall have the irrevocable and perpetual right to make and distribute copies of any Contribution, as well as to create and distribute collective works and derivative works of any Contribution, under the Initial License or under any other open source license approved by a unanimous vote of the PSF board." So while not explicitly limited to an OSI approved license, it must still be "open source", at least in the view of the entire (unanimous) PSF board. "OSI approved" would probably be the least controversial definition of "open source" that the PSF could adopt. Cheers, -Barry
On 12/5/2014 3:04 PM, Brett Cannon wrote:
1. Contributor clones a repository from hg.python.org <http://hg.python.org> 2. Contributor makes desired changes 3. Contributor generates a patch 4. Contributor creates account on bugs.python.org <http://bugs.python.org> and signs the [contributor agreement](https://www.python.org/psf/contrib/contrib-form/)
I would like to have the process of requesting and enforcing the signing of CAs automated.
4. Contributor creates an issue on bugs.python.org <http://bugs.python.org> (if one does not already exist) and uploads a patch
I would like to have patches rejected, or at least held up, until a CA is registered. For this to work, a signed CA should be immediately registered on the tracker, at least as 'pending'. It now can take a week or more to go through human processing.
5. Core developer evaluates patch, possibly leaving comments through our [custom version of Rietveld](http://bugs.python.org/review/) 6. Contributor revises patch based on feedback and uploads new patch 7. Core developer downloads patch and applies it to a clean clone 8. Core developer runs the tests 9. Core developer does one last `hg pull -u` and then commits the changes to various branches
-- Terry Jan Reedy
On Sat Dec 06 2014 at 2:53:43 AM Terry Reedy <tjreedy@udel.edu> wrote:
On 12/5/2014 3:04 PM, Brett Cannon wrote:
1. Contributor clones a repository from hg.python.org < http://hg.python.org> 2. Contributor makes desired changes 3. Contributor generates a patch 4. Contributor creates account on bugs.python.org <http://bugs.python.org> and signs the [contributor agreement](https://www.python.org/psf/contrib/contrib-form/)
I would like to have the process of requesting and enforcing the signing of CAs automated.
So would I.
4. Contributor creates an issue on bugs.python.org <http://bugs.python.org> (if one does not already exist) and uploads a patch
I would like to have patches rejected, or at least held up, until a CA is registered. For this to work, a signed CA should be immediately registered on the tracker, at least as 'pending'. It now can take a week or more to go through human processing.
This is one of the reasons I didn't want to create an issue magically from PRs initially. I think it's totally doable with some coding. -Brett
5. Core developer evaluates patch, possibly leaving comments through our [custom version of Rietveld](http://bugs.python.org/review/) 6. Contributor revises patch based on feedback and uploads new patch 7. Core developer downloads patch and applies it to a clean clone 8. Core developer runs the tests 9. Core developer does one last `hg pull -u` and then commits the changes to various branches
-- Terry Jan Reedy
Brett Cannon wrote:
4. Contributor creates account on bugs.python.org and signs the [contributor agreement](https://www.python.org/psf/contrib/contrib-form/)
Is there an expiration on such forms? If there doesn't need to be (and one form is good for multiple tickets), is there an objection (besides "not done yet") to making "signed the form" part of the bug reporter account, and required to submit to the CI process? (An "I can't sign yet, bug me later" option would allow the current workflow without the "this isn't technically a patch" workaround for "small enough" patches from those with slow-moving employers.)
There's the simple spelling mistake patches and then there's the code change patches.
There are a fair number of one-liner code patches; ideally, they could also be handled quickly.
For the code change patches, contributors need an easy way to get a hold of the code and get their changes to the core developers.
For a fair number of patches, the same workflow as spelling errors is appropriate, except that it would be useful to have an automated state saying "yes, this currently merges fine", so that committers can focus only on patches that are (still) at least that ready.
At best core developers tell a contributor "please send your PR against 3.4", push-button merge it, update a local clone, merge from 3.4 to default, do the usual stuff, commit, and then push;
Is it common for a patch that should apply to multiple branches to fail on some but not all of them? In other words, is there any reason beyond "not done yet" that submitting a patch (or pull request) shouldn't automatically create a patch per branch, with pushbuttons to test/reject/commit?
Our code review tool is a fork that probably should be replaced as only Martin von Loewis can maintain it.
Only he knows the innards, or only he is authorized, or only he knows where the code currently is/how to deploy an update?

I know that there were times in the (not-so-recent) past when I had time and willingness to help with some part of the infrastructure, but didn't know where the code was, and didn't feel right making a blind offer.

-jJ

--
If there are still threading problems with my replies, please email me with details, so that I can try to resolve them. -jJ
On Mon Dec 08 2014 at 3:27:43 PM Jim J. Jewett <jimjjewett@gmail.com> wrote:
Brett Cannon wrote:
4. Contributor creates account on bugs.python.org and signs the [contributor agreement](https://www.python. org/psf/contrib/contrib-form/)
Is there an expiration on such forms? If there doesn't need to be (and one form is good for multiple tickets), is there an objection (besides "not done yet") to making "signed the form" part of the bug reporter account, and required to submit to the CI process? (An "I can't sign yet, bug me later" option would allow the current workflow without the "this isn't technically a patch" workaround for "small enough" patches from those with slow-moving employers.)
IANAL but I believe that as long as you didn't sign on behalf of work for your employer it's good for life.
There's the simple spelling mistake patches and then there's the code change patches.
There are a fair number of one-liner code patches; ideally, they could also be handled quickly.
Depends on the change. Syntactic typos could still get through. But yes, they are also a possibility for a quick submission.
For the code change patches, contributors need an easy way to get a hold of the code and get their changes to the core developers.
For a fair number of patches, the same workflow as spelling errors is appropriate, except that it would be useful to have an automated state saying "yes, this currently merges fine", so that committers can focus only on patches that are (still) at least that ready.
At best core developers tell a contributor "please send your PR against 3.4", push-button merge it, update a local clone, merge from 3.4 to default, do the usual stuff, commit, and then push;
Is it common for a patch that should apply to multiple branches to fail on some but not all of them?
Going from 3.4 -> 3.5 is almost always clean sans NEWS, but from 2.7 it is nowhere near as guaranteed.
In other words, is there any reason beyond "not done yet" that submitting a patch (or pull request) shouldn't automatically create a patch per branch, with pushbuttons to test/reject/commit?
Assuming that you specify which branches, then not really. But if it is blindly then yes as that's unnecessary noise and could lead to arguments over whether something should (not) be applied to some specific version.
Our code review tool is a fork that probably should be replaced as only Martin von Loewis can maintain it.
Only he knows the innards, or only he is authorized, or only he knows where the code currently is/how to deploy an update?
Innards. -Brett
I know that there were times in the (not-so-recent) past when I had time and willingness to help with some part of the infrastructure, but didn't know where the code was, and didn't feel right making a blind offer.
-jJ
On Mon, 08 Dec 2014 12:27:23 -0800, "Jim J. Jewett" <jimjjewett@gmail.com> wrote:
Brett Cannon wrote:
4. Contributor creates account on bugs.python.org and signs the [contributor agreement](https://www.python.org/psf/contrib/contrib-form/)
Is there an expiration on such forms? If there doesn't need to be (and one form is good for multiple tickets), is there an objection (besides "not done yet") to making "signed the form" part of the bug reporter account, and required to submit to the CI process? (An "I can't sign yet, bug me later" option would allow the current workflow without the "this isn't technically a patch" workaround for "small enough" patches from those with slow-moving employers.)
No expiration. Whether or not we have a CLA from a given tracker id is recorded in the tracker. People also get reminded to submit a CLA if they haven't yet but have submitted a patch.
At best core developers tell a contributor "please send your PR against 3.4", push-button merge it, update a local clone, merge from 3.4 to default, do the usual stuff, commit, and then push;
Is it common for a patch that should apply to multiple branches to fail on some but not all of them?
Currently? Yes when 2.7 is involved. If we fix NEWS, then it won't be *common* for maint->default, but it will happen.
In other words, is there any reason beyond "not done yet" that submitting a patch (or pull request) shouldn't automatically create a patch per branch, with pushbuttons to test/reject/commit?
Not Done Yet (by any of the tools we know about) is the only reason I'm aware of.
Our code review tool is a fork that probably should be replaced as only Martin von Loewis can maintain it.
Only he knows the innards, or only he is authorized, or only he knows where the code currently is/how to deploy an update?
Only he knows the innards. (Although Ezio has made at least one patch to it.) I think Guido's point was that we (the community) shouldn't be maintaining this private fork of a project that has moved on well beyond us; instead we should be using an active project and leveraging its community with our own contributions (like we do with Roundup).
I know that there were times in the (not-so-recent) past when I had time and willingness to help with some part of the infrastructure, but didn't know where the code was, and didn't feel right making a blind offer.
Yeah, that's something that's been getting better lately (thanks, infrastructure team), but where to get the info is still not as clear as would be optimal. --David
As I didn't hear any objections, I'm officially stating that I expect initial draft PEPs to be in by February 1 to know who is in the running to focus discussion. I then expect complete PEPs by April 1 so I can read them before PyCon and have informed discussions while I'm there. I will then plan to make a final decision by May 1 so that we can try to have the changes ready for Python 3.6 development (currently scheduled for Sep 2015). On Fri Dec 05 2014 at 3:04:48 PM Brett Cannon <bcannon@gmail.com> wrote:
This is a bit long as I espoused as if this was a blog post to try and give background info on my thinking, etc. The TL;DR folks should start at the "Ideal Scenario" section and read to the end.
P.S.: This is in Markdown and I have put it up at https://gist.github.com/brettcannon/a9c9a5989dc383ed73b4 if you want a nicer formatted version for reading.
# History lesson

Since I signed up for the python-dev mailing list way back in June 2002, there seems to be a cycle where we as a group come to a realization that our current software development process has not kept up with modern practices and could stand for an update. For me this was first shown when we moved from SourceForge to our own infrastructure, then again when we moved from Subversion to Mercurial (I led both of these initiatives, so it's somewhat a tradition/curse I find myself in this position yet again). And so we again find ourselves at the point of realizing that we are not keeping up with current practices and thus need to evaluate how we can improve our situation.
# Where we are now

It should be realized that we have two sets of users of our development process: contributors and core developers (the latter of whom can play both roles). A rough outline of our current, recommended process goes something like this:
1. Contributor clones a repository from hg.python.org
2. Contributor makes desired changes
3. Contributor generates a patch
4. Contributor creates an account on bugs.python.org and signs the [contributor agreement](https://www.python.org/psf/contrib/contrib-form/)
5. Contributor creates an issue on bugs.python.org (if one does not already exist) and uploads the patch
6. Core developer evaluates the patch, possibly leaving comments through our [custom version of Rietveld](http://bugs.python.org/review/)
7. Contributor revises the patch based on feedback and uploads a new patch
8. Core developer downloads the patch and applies it to a clean clone
9. Core developer runs the tests
10. Core developer does one last `hg pull -u` and then commits the changes to the various branches
I think we can all agree it works to some extent, but it isn't exactly smooth. There are multiple steps in there -- some in full, some in part -- that could be automated. There is room to improve everyone's lives.
And we can't forget the people who help keep all of this running as well. There are those who manage the SSH keys, the issue tracker, the review tool, hg.python.org, and the email system that lets us know when stuff happens on any of these other systems. The impact on them also needs to be considered.
## Contributors

I see two scenarios for contributors to optimize for: the simple spelling-mistake patches, and the code-change patches. The former is the kind of thing you can do in a browser without much effort, and it should be a no-brainer commit/reject decision for a core developer. This is what the GitHub/Bitbucket camps have been promoting their solutions for, while leaving the cpython repo alone. Unfortunately the bulk of our documentation is in the Doc/ directory of cpython. While it's nice to think about moving the devguide, peps, and even breaking out the tutorial into repos hosted on Bitbucket/GitHub, everything else is in Doc/ (language reference, howtos, stdlib, C API, etc.). So unless we want to completely break all of Doc/ out of the cpython repo and have core developers willing to edit two separate repos when making changes that impact code **and** docs, moving only a subset of the docs feels like a band-aid solution that ignores the elephant in the room: the cpython repo, where the bulk of patches are targeted.
For the code-change patches, contributors need an easy way to get hold of the code and get their changes to the core developers. After that it's things like letting contributors know that their patch doesn't apply cleanly, doesn't pass tests, etc. As of right now, getting the patch into the issue tracker is a bit manual but nothing crazy. The real issue in this scenario is core developer response time.
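That "doesn't apply cleanly" feedback is exactly the kind of check a bot could run the moment a patch is uploaded. As a rough illustration (a hypothetical helper, not anything in the current tracker), here is the core of such a check in pure Python: verify that each hunk's "before" lines still match the target file:

```python
import re

# Matches a unified-diff hunk header like "@@ -12,3 +12,4 @@".
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,\d+)? \+\d+(?:,\d+)? @@")

def patch_applies_cleanly(file_lines, diff_text):
    """Return True if every hunk's expected original lines still match.

    file_lines: the target file as a list of lines (no newlines).
    diff_text: a unified diff against that single file.
    """
    lines = diff_text.splitlines()
    i = 0
    while i < len(lines):
        header = HUNK_RE.match(lines[i])
        if header is None:
            i += 1
            continue
        start = int(header.group(1)) - 1  # hunk start line, 0-based
        i += 1
        expected = []  # lines the file must currently contain here
        while i < len(lines) and lines[i][:1] in (" ", "-", "+"):
            if lines[i][:1] in (" ", "-"):
                expected.append(lines[i][1:])
            i += 1
        if file_lines[start:start + len(expected)] != expected:
            return False
    return True

diff = (
    "--- a/spam.txt\n"
    "+++ b/spam.txt\n"
    "@@ -1,2 +1,2 @@\n"
    " spam\n"
    "-eggs\n"
    "+ham\n"
)
print(patch_applies_cleanly(["spam", "eggs"], diff))   # True
print(patch_applies_cleanly(["spam", "bacon"], diff))  # False
```

A real check would also have to handle renames, fuzz, and multi-file diffs; in practice the CI would more likely just try `patch --dry-run` or `hg import --no-commit` in a throwaway clone.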
## Core developers

There is a finite amount of time that core developers get to contribute to Python, and it fluctuates greatly. This means that if a process can be found which lets core developers spend less time on mechanical work and more time on things that can't be automated -- namely code reviews -- then the throughput of patches being accepted/rejected will increase. This also matters for any rise in patch submissions that comes from improving the situation for contributors: if throughput doesn't change, more patches will simply sit in the issue tracker, and that doesn't benefit anyone.
# My ideal scenario

If I had an infinite amount of resources (money, volunteers, time, etc.), this would be my ideal scenario:
1. Contributor gets the code from wherever; easiest is to just say "fork on GitHub or Bitbucket", as they would be official mirrors of hg.python.org updated after every commit, but they could clone hg.python.org/cpython if they wanted
2. Contributor makes edits; if they cloned on Bitbucket or GitHub then they already have browser edit access
3. Contributor creates an account at bugs.python.org and signs the CLA
4. Contributor creates an issue at bugs.python.org (probably the one piece of infrastructure we all agree is better than the other options, although its workflow could use an update)
5. If the contributor used Bitbucket or GitHub, they send a pull request with the issue # in the PR message
6. bugs.python.org notices the PR, grabs a patch for it, and puts it on bugs.python.org for code review
7. CI runs on the patch based on what Python versions are specified in the issue tracker, letting everyone know if it applied cleanly, passed tests on the OSs that would be affected, and also got a test coverage report
8. Core developer does a code review
9. Contributor updates their code based on the code review; the updated patch gets pulled by bugs.python.org automatically and CI runs again
10. Once the patch is acceptable, and assuming it applies cleanly to all the versions to commit to, the core developer clicks a "Commit" button, fills in a commit message and NEWS entry, and everything gets committed (if the patch can't apply cleanly then the core developer does it the old-fashioned way, or maybe we auto-generate a new PR which can be manually touched up so it does apply cleanly?)
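The "bugs.python.org notices the PR" glue is small but illustrative: a webhook handler mostly just has to fish the issue number out of the PR message. A sketch of that matching logic (the accepted spellings and the function name are my own assumptions, not an existing API):

```python
import re

# Accept "#23157", "issue 23157", or "bpo-23157" anywhere in the message.
ISSUE_RE = re.compile(r"(?:#|issue\s*|bpo-)(\d+)", re.IGNORECASE)

def issue_from_pr_message(message):
    """Return the bugs.python.org issue number referenced in a pull
    request's title/body, or None if no issue is mentioned."""
    match = ISSUE_RE.search(message)
    return int(match.group(1)) if match else None

print(issue_from_pr_message("Fix soft line break handling (#23157)"))  # 23157
print(issue_from_pr_message("Typo fix, no tracker reference"))         # None
```

PRs whose message yields None would simply be bounced back to the contributor with a request to reference an issue.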
Basically the ideal scenario lets contributors use whatever tools and platforms that they want and provides as much automated support as possible to make sure their code is tip-top before and during code review while core developers can review and commit patches so easily that they can do their job from a beach with a tablet and some WiFi.
## Where the current proposed solutions seem to fall short

### GitHub/Bitbucket

Basically GitHub/Bitbucket is a win for contributors but doesn't buy core developers that much. GitHub/Bitbucket gives contributors the easy cloning, drive-by patches, CI, and PRs. Core developers get a code review tool -- I'm counting Rietveld as deprecated after Guido's comments about the code's maintenance issues -- and push-button commits **only for single-branch changes**. But for any patch that crosses branches we don't really gain anything. At best core developers tell a contributor "please send your PR against 3.4", push-button merge it, update a local clone, merge from 3.4 to default, do the usual stuff, commit, and then push; that still keeps me off the beach, though, so it doesn't get us the whole way. You could force people to submit two PRs, but I don't see that flying. Maybe some tool could be written that automatically handles the merge/commit across branches once the initial PR is in? Or automatically creates a PR that core developers can touch up as necessary and then accept as well? Regardless, some solution is necessary to handle branch-crossing PRs.
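Whatever tool ends up handling branch-crossing PRs needs, at minimum, the forward-merge order baked in: fixes land on the oldest affected maintenance branch and get merged forward into default, with 2.7 handled on its own since it never merges into the 3.x line. A toy sketch of just that ordering decision (the function and the branch tuple are illustrative only):

```python
def commit_plan(affected, branch_order=("3.4", "default")):
    """Split the branches an issue affects into the standalone 2.7 side
    and the 3.x forward-merge chain, oldest first."""
    legacy = ["2.7"] if "2.7" in affected else []
    chain = [b for b in branch_order if b in affected]
    return legacy, chain

print(commit_plan({"3.4", "default"}))         # ([], ['3.4', 'default'])
print(commit_plan({"2.7", "3.4", "default"}))  # (['2.7'], ['3.4', 'default'])
```

The automation described above would then walk the returned chain, committing to the first branch and merging forward into each subsequent one, falling back to a human whenever a merge doesn't apply cleanly.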
As for GitHub vs. Bitbucket, I personally don't care. I like GitHub's interface more, but that's personal taste. I like hg more than git, but that's also personal taste (and I consider a transition from hg to git a hassle but not a deal-breaker but also not a win). It is unfortunate, though, that under this scenario we would have to choose only one platform.
It's also unfortunate that both are closed-source, but that's not a deal-breaker, just a knock against them if the decision is close.
### Our own infrastructure

The shortcoming here is the need for developers, developers, developers! Everything outlined in the ideal scenario is totally doable on our own infrastructure with enough code and time (donated/paid-for infrastructure shouldn't be an issue). But historically that code and time have not materialized. Our code review tool is a fork that probably should be replaced, as only Martin von Löwis can maintain it. Basically Ezio Melotti maintains the issue tracker's code. We don't exactly have a ton of people constantly going "I'm so bored because everything for Python's development infrastructure gets sorted so quickly!" A perfect example is that R. David Murray came up with a nice update for our workflow after PyCon but then ran out of time after mostly defining it, and nothing ever became of it (maybe we can rectify that at PyCon?). Eric Snow has pointed out how he has written similar code for pulling PRs from (I think) GitHub into another code review tool, but that doesn't magically make it work in our infrastructure or get someone to write it and help maintain it (no offense, Eric).
IOW our infrastructure can do anything, but it can't run on hopes and dreams. Commitments from many people to making this happen by a certain deadline will be needed so as to not allow it to drag on forever. People would also have to commit to continued maintenance to make this viable long-term.
# Next steps

I'm thinking first draft PEPs by February 1 to know who's all-in (8 weeks away), all details worked out in final PEPs and whatever is required to prove to me it will work by the PyCon language summit (4 months away). I make a decision by May 1, and then implementation aims to be done by the time 3.5.0 is cut so we can switch over shortly thereafter (9 months away). Sound like a reasonable timeline?
On Dec 11, 2014, at 9:59 AM, Brett Cannon <bcannon@gmail.com> wrote:
As I didn't hear any objections, I'm officially stating that I expect initial draft PEPs to be in by February 1 to know who is in the running to focus discussion. I then expect complete PEPs by April 1 so I can read them before PyCon and have informed discussions while I'm there. I will then plan to make a final decision by May 1 so that we can try to have the changes ready for Python 3.6 development (currently scheduled for Sep 2015).
Is it OK to adapt my current PEP or should I create a whole new one?

---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
Just adapt your current PEP.
On 6 December 2014 at 06:04, Brett Cannon <bcannon@gmail.com> wrote:
# Next steps

I'm thinking first draft PEPs by February 1 to know who's all-in (8 weeks away), all details worked out in final PEPs and whatever is required to prove to me it will work by the PyCon language summit (4 months away). I make a decision by May 1, and then implementation aims to be done by the time 3.5.0 is cut so we can switch over shortly thereafter (9 months away). Sound like a reasonable timeline?
I've now updated PEP 474 to cover my current proposal for the support repositories, as well as some of the preparatory work that is already being undertaken: https://www.python.org/dev/peps/pep-0474/

By the end of the month, I'll also aim to have an updated version of PEP 462 published that considers how the forge.python.org service could potentially be extended to handle CPython itself, rather than attempting to build those flows directly into the existing Roundup and Rietveld based approach.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
I did a pull request with current progress: https://github.com/python/psf-salt/pull/25

Any feedback is appreciated. Btw: Donald is very patient and helpful. :)

On Thu Jan 08 2015 at 8:00:59 AM Nick Coghlan <ncoghlan@gmail.com> wrote:
On 6 December 2014 at 06:04, Brett Cannon <bcannon@gmail.com> wrote:
# Next steps

I'm thinking first draft PEPs by February 1 to know who's all-in (8 weeks away), all details worked out in final PEPs and whatever is required to prove to me it will work by the PyCon language summit (4 months away). I make a decision by May 1, and then implementation aims to be done by the time 3.5.0 is cut so we can switch over shortly thereafter (9 months away). Sound like a reasonable timeline?
I've now updated PEP 474 to cover my current proposal for the support repositories, as well as some of the preparatory work that is already being undertaken: https://www.python.org/dev/peps/pep-0474/
By the end of the month, I'll also aim to have an updated version of PEP 462 published that considers how the forge.python.org service could potentially be extended to handle CPython itself, rather than attempting to build those flows directly into the existing Roundup and Rietveld based approach.
Cheers, Nick.
--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Jan 8, 2015, at 4:26 AM, Tymoteusz Jankowski <tymoteusz.jankowski@gmail.com> wrote:
I did a pull-request with current progress: https://github.com/python/psf-salt/pull/25 Any feedback is appreciated. Btw: Donald is very patient and helpful. :)
Ah oops, I forgot to review that. *goes to do so now*

---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
On Thu, Jan 8, 2015 at 12:59 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 6 December 2014 at 06:04, Brett Cannon <bcannon@gmail.com> wrote:
# Next steps

I'm thinking first draft PEPs by February 1 to know who's all-in (8 weeks away), all details worked out in final PEPs and whatever is required to prove to me it will work by the PyCon language summit (4 months away). I make a decision by May 1, and then implementation aims to be done by the time 3.5.0 is cut so we can switch over shortly thereafter (9 months away). Sound like a reasonable timeline?
I've now updated PEP 474 to cover my current proposal for the support repositories, as well as some of the preparatory work that is already being undertaken: https://www.python.org/dev/peps/pep-0474/
By the end of the month, I'll also aim to have an updated version of PEP 462 published that considers how the forge.python.org service could potentially be extended to handle CPython itself, rather than attempting to build those flows directly into the existing Roundup and Rietveld based approach.
There could be a JSON[-LD] schema to describe resources with attributes: https://github.com/westurner/wiki/wiki/ideas#open-source-mailing-list-extrac...
There could be configurable per-list link heuristics:
- http[s]
- Issue: https://bugs.python.org/issue(\d+)
- Src: https://hg.python.org/<repo>/<path>
- Src: https://github.com/<org>/<project>/<path>
- Src: https://bitbucket.org/<org>/<project>/<path>
- Patch/Attachment: http[s]://bugs.python.org/(file\d+)/<filename(.diff)>
- Doc: https://docs.python.org/<ver>/<path>
- Wiki: https://wiki.python.org/moin/<path>
- Homepage: https://www.python.org/<path>
- PyPI pkg: https://pypi.python.org/pypi/<path>
- Warehouse pkg: https://warehouse.python.org/project/<path>
- Wikipedia: https://[lang].wikipedia.org/wiki/<page> --> (dbpedia:<page>)
- Build: http://buildbot.python.org/all/builders/AMD64%20Ubuntu%20LTS%203.4/builds/77...
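The heuristics listed above could be expressed as an ordered list of regexes with first-match-wins semantics; a minimal sketch (patterns abridged from the list, and the names are hypothetical):

```python
import re

# Ordered (pattern, resource type) pairs; the first match wins.
LINK_HEURISTICS = [
    (re.compile(r"https?://bugs\.python\.org/issue\d+"), "Issue"),
    (re.compile(r"https?://bugs\.python\.org/file\d+/"), "Patch"),
    (re.compile(r"https?://hg\.python\.org/"), "Src"),
    (re.compile(r"https?://(github\.com|bitbucket\.org)/"), "Src"),
    (re.compile(r"https?://docs\.python\.org/"), "Doc"),
    (re.compile(r"https?://wiki\.python\.org/moin/"), "Wiki"),
    (re.compile(r"https?://pypi\.python\.org/pypi/"), "PyPI pkg"),
]

def classify_link(url):
    """Return the resource type for a URL, or None if nothing matches."""
    for pattern, kind in LINK_HEURISTICS:
        if pattern.match(url):
            return kind
    return None

print(classify_link("https://bugs.python.org/issue23157"))  # Issue
print(classify_link("https://docs.python.org/3/library/"))  # Doc
print(classify_link("https://example.com/unrelated"))       # None
```

A per-list configuration would simply swap in a different LINK_HEURISTICS table, with unmatched URLs falling through to a generic "Link" type.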
participants (15)

- Barry Warsaw
- Ben Finney
- Brett Cannon
- Brett Cannon
- Donald Stufft
- Eric Snow
- Ethan Furman
- Ian Cordasco
- Jim J. Jewett
- Nick Coghlan
- R. David Murray
- Shorya Raj
- Terry Reedy
- Tymoteusz Jankowski
- Wes Turner