[Python-Dev] My thinking about the development process
Shorya Raj
rajshorya at gmail.com
Sat Dec 6 00:15:09 CET 2014
Hi All
I just want to put my two cents into this.
This would definitely be a great step to take. I have been discussing PEP
462 with Nick, and the automation is definitely something that would be
great to have. For example, I submitted a simple documentation patch for
building CPython on Windows, and it took several weeks for the patch to be
accepted, then a couple of months for it to actually be merged. As
mentioned, automated testing to ensure that tests pass, along with easier
committing of documentation patches, would obviously be a great way to
start decreasing this turnaround.
Has there been any thought on what sort of infrastructure we could use for
this? Obviously GitHub / Bitbucket could be used as mentioned by others for
repo management, but a lot of thought would have to go into the decisions
regarding CI tools. I think it would also be a good time to address the
issues with the current bug tracker - although it works, it is hardly as
usable as some of the other ones. As for the argument that we should use
open source tools to ensure that the owners of these tools aren't able to
cause us problems in the future - both Hadoop and Cassandra, along with a
lot of other Apache projects, seem to be using JIRA just fine.
Thanks
Shorya Raj
On Sat, Dec 6, 2014 at 11:17 AM, Eric Snow <ericsnowcurrently at gmail.com>
wrote:
> Very nice, Brett.
>
> On Fri, Dec 5, 2014 at 1:04 PM, Brett Cannon <bcannon at gmail.com> wrote:
> > And we can't forget the people who help keep all of this running as well.
> > There are those that manage the SSH keys, the issue tracker, the review
> > tool, hg.python.org, and the email system that lets us know when stuff
> > happens on any of these other systems. The impact on them needs to also
> > be considered.
>
> It sounds like Guido would prefer that as much of this as possible be
> done by a provider rather than by volunteers. That makes sense, though
> there are concerns about control of certain assets. However, that
> applies only to some, like hg.python.org.
>
> >
> > ## Contributors
> > I see two scenarios for contributors to optimize for. There's the simple
> > spelling mistake patches and then there's the code change patches. The
> > former is the kind of thing that you can do in a browser without much
> > effort and should be a no-brainer commit/reject decision for a core
> > developer. This is what the GitHub/Bitbucket camps have been promoting
> > their solution for solving while leaving the cpython repo alone.
> > Unfortunately the bulk of our documentation is in the Doc/ directory of
> > cpython. While it's nice to think about moving the devguide, peps, and
> > even breaking out the tutorial to repos hosted on Bitbucket/GitHub,
> > everything else is in Doc/ (language reference, howtos, stdlib, C API,
> > etc.). So unless we want to completely break all of Doc/ out of the
> > cpython repo and have core developers willing to edit two separate repos
> > when making changes that impact code **and** docs, moving only a subset
> > of docs feels like a band-aid solution that ignores the big, white
> > elephant in the room: the cpython repo, which the bulk of patches
> > target.
>
> With your ideal scenario this would be a moot point, right? There
> would be no need to split out doc-related repos.
>
> >
> > For the code change patches, contributors need an easy way to get a hold
> > of the code and get their changes to the core developers. After that it's
> > things like letting contributors know that their patch doesn't apply
> > cleanly, doesn't pass tests, etc.
>
> This is probably more work than it seems at first.
>
> > As of right now getting the patch into the
> > issue tracker is a bit manual but nothing crazy. The real issue in this
> > scenario is core developer response time.
> >
> > ## Core developers
> > There is a finite amount of time that core developers get to contribute
> > to Python and it fluctuates greatly. This means that if a process can be
> > found which allows core developers to spend less time doing mechanical
> > work and more time doing things that can't be automated -- namely code
> > reviews -- then the throughput of patches being accepted/rejected will
> > increase. This also impacts any increased patch submission rate that
> > comes from improving the situation for contributors because if the
> > throughput doesn't change then there will simply be more patches sitting
> > in the issue tracker and that doesn't benefit anyone.
>
> This is the key concern I have with only addressing the contributor
> side of things. I'm all for increasing contributions, but not if they
> are just going to rot on the tracker and we end up with disillusioned
> contributors.
>
> >
> > # My ideal scenario
> > If I had an infinite amount of resources (money, volunteers, time, etc.),
> > this would be my ideal scenario:
> >
> > 1. Contributor gets code from wherever; easiest to just say "fork on
> > GitHub or Bitbucket" as they would be official mirrors of hg.python.org
> > and are updated after every commit, but could clone
> > hg.python.org/cpython if they wanted
> > 2. Contributor makes edits; if they cloned on Bitbucket or GitHub then
> > they have browser edit access already
> > 3. Contributor creates an account at bugs.python.org and signs the CLA
>
> There's no real way around this, is there? I suppose account creation
> *could* be automated relative to a github or bitbucket user, though it
> probably isn't worth the effort. However, the CLA part is pretty
> unavoidable.
>
> > 3. The contributor creates an issue at bugs.python.org (probably the one
> > piece of infrastructure we all agree is better than the other options,
> > although its workflow could use an update)
>
> I wonder if issue creation from a PR (where no issue # is in the
> message) could be automated too without a lot of extra work.
>
> > 4. If the contributor used Bitbucket or GitHub, they send a pull request
> > with the issue # in the PR message
> > 5. bugs.python.org notices the PR, grabs a patch for it, and puts it on
> > bugs.python.org for code review
> > 6. CI runs on the patch based on what Python versions are specified in
> > the issue tracker, letting everyone know if it applied cleanly, passed
> > tests on the OSs that would be affected, and also got a test coverage
> > report
> > 7. Core developer does a code review
> > 8. Contributor updates their code based on the code review and the
> > updated patch gets pulled by bugs.python.org automatically and CI runs
> > again
> > 9. Once the patch is acceptable and assuming the patch applies cleanly to
> > all versions to commit to, the core developer clicks a "Commit" button,
> > fills in a commit message and NEWS entry, and everything gets committed
> > (if the patch can't apply cleanly then the core developer does it the
> > old-fashioned way, or maybe auto-generate a new PR which can be manually
> > touched up so it does apply cleanly?)
>
> 6-9 sounds a lot like PEP 462. :) This seems like the part that would
> win us the most.
>
> >
> > Basically the ideal scenario lets contributors use whatever tools and
> > platforms that they want and provides as much automated support as
> > possible to make sure their code is tip-top before and during code review
> > while core developers can review and commit patches so easily that they
> > can do their job from a beach with a tablet and some WiFi.
>
> Sign me up!
>
> >
> > ## Where the current proposed solutions seem to fall short
> > ### GitHub/Bitbucket
> > Basically GitHub/Bitbucket is a win for contributors but doesn't buy core
> > developers that much. GitHub/Bitbucket gives contributors the easy
> > cloning, drive-by patches, CI, and PRs. Core developers get a code review
> > tool -- I'm counting Rietveld as deprecated after Guido's comments about
> > the code's maintenance issues -- and push-button commits **only for
> > single branch changes**. But for any patch that crosses branches we don't
> > really gain anything. At best core developers tell a contributor "please
> > send your PR against 3.4", push-button merge it, update a local clone,
> > merge from 3.4 to default, do the usual stuff, commit, and then push;
> > that still keeps me off the beach, though, so that doesn't get us the
> > whole way.
>
> This will probably be one of the trickiest parts.
>
> > You could force people to submit two PRs, but I don't see that flying.
> > Maybe some tool could be written that automatically handles the
> > merge/commit across branches once the initial PR is in? Or automatically
> > create a PR that core developers can touch up as necessary and then
> > accept that as well? Regardless, some solution is necessary to handle
> > branch-crossing PRs.
> >
> > As for GitHub vs. Bitbucket, I personally don't care. I like GitHub's
> > interface more, but that's personal taste. I like hg more than git, but
> > that's also personal taste (and I consider a transition from hg to git a
> > hassle but not a deal-breaker but also not a win). It is unfortunate,
> > though, that under this scenario we would have to choose only one
> > platform.
> >
> > It's also unfortunate both are closed-source, but that's not a
> > deal-breaker, just a knock against if the decision is close.
> >
> > ### Our own infrastructure
> > The shortcoming here is the need for developers, developers, developers!
> > Everything outlined in the ideal scenario is totally doable on our own
> > infrastructure with enough code and time (donated/paid-for infrastructure
> > shouldn't be an issue). But historically that code and time has not
> > materialized. Our code review tool is a fork that probably should be
> > replaced as only Martin von Löwis can maintain it. Basically Ezio Melotti
> > maintains the issue tracker's code.
>
> Doing something about those two tools is something to consider. Would
> it be out of scope for this discussion or any resulting PEPs? I have
> opinions here, but I'd rather not sidetrack the discussion.
>
> > We don't exactly have a ton of people
> > constantly going "I'm so bored because everything for Python's
> > development infrastructure gets sorted so quickly!" A perfect example is
> > that R. David Murray came up with a nice update for our workflow after
> > PyCon but then ran out of time after mostly defining it and nothing ever
> > became of it (maybe we can rectify that at PyCon?). Eric Snow has pointed
> > out how he has written similar code for pulling PRs from I think GitHub
> > to another code review tool, but that doesn't magically make it work in
> > our infrastructure or get someone to write it and help maintain it (no
> > offense, Eric).
>
> None taken. I was thinking the same thing when I wrote that. :)
>
> >
> > IOW our infrastructure can do anything, but it can't run on hopes and
> > dreams. Commitments from many people to making this happen by a certain
> > deadline will be needed so as to not allow it to drag on forever. People
> > would also have to commit to continued maintenance to make this viable
> > long-term.
> >
> > # Next steps
> > I'm thinking first draft PEPs by February 1 to know who's all-in (8 weeks
> > away), all details worked out in final PEPs and whatever is required to
> > prove to me it will work by the PyCon language summit (4 months away). I
> > make a decision by May 1, and
> > then implementation aims to be done by the time 3.5.0 is cut so we can
> > switch over shortly thereafter (9 months away). Sound like a reasonable
> > timeline?
>
> Sounds reasonable to me, but I don't have plans to champion a PEP. :)
> I could probably help with the tooling between GitHub/Bitbucket
> though.
>
> -eric