[Python-Dev] Move selected documentation repos to PSF BitBucket account?

Georg Brandl g.brandl at gmx.net
Sun Nov 23 21:03:41 CET 2014


On 11/23/2014 05:55 PM, Brett Cannon wrote:

> I guess my question is who and what is going to be disrupted if we go with
> Guido's suggestion of switching to GitHub for code hosting? Contributors won't
> be disrupted at all since most people are more familiar with GitHub vs.
> Bitbucket (how many times have we all heard the fact someone has even learned
> Mercurial just to contribute to Python?). Core developers might be based on some
> learned workflow, but I'm willing to bet we all know git at this point (and for
> those of us who still don't like it, myself included, there are GUI apps to
> paper over it or hg-git for those that prefer a CLI). Our infrastructure will
> need to be updated, but how much of it is that hg-specific short of the command
> to check out the repo? Obviously Bitbucket is much more minor by simply
> updating just a URL, but changing `hg clone` to `git clone` isn't crazy either.
> Georg, Antoine, or Benjamin can point out if I'm wrong on this, maybe Donald or
> someone in the infrastructure committee.

Well, since "most people" already know git, this part is probably not a big deal,
although we'd have to consider alienating core developers who are not git-savvy.

> Probably the biggest thing I can think of that would need updating is our commit
> hooks. Once again Georg, Antoine, or Benjamin could say how difficult it would
> be to update those hooks.

There are two categories of hooks we use: hooks that run before a push is
accepted, and hooks that run afterwards.  The latter are not a concern, since
anything that GH/BB doesn't support can be run as a web service on python.org
infrastructure, which then gets POST requests from the hosting platforms.  These
are

* tracker update
* IRC notification
* sending email to python-checkins
* triggering buildbot
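
A rough sketch of what such a web service could look like, using only the
standard library.  Everything named here is hypothetical: the four action
functions are placeholders for the real jobs, and the payload is assumed to be
JSON with a "commits" list (as in GitHub's push event; other platforms may
shape it differently).

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


# Hypothetical placeholders for the four post-push jobs listed above.
def update_tracker(commits):
    return "tracker: %d commit(s)" % len(commits)


def notify_irc(commits):
    return "irc: %d commit(s)" % len(commits)


def mail_checkins(commits):
    return "email: %d commit(s)" % len(commits)


def trigger_buildbot(commits):
    return "buildbot: %d commit(s)" % len(commits)


POST_PUSH_ACTIONS = [update_tracker, notify_irc, mail_checkins, trigger_buildbot]


def handle_push(payload):
    """Run every post-push action on the commits in one push payload."""
    commits = payload.get("commits", [])  # assumed payload layout
    return [action(commits) for action in POST_PUSH_ACTIONS]


class PushHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length).decode("utf-8"))
        except ValueError:
            payload = None
        if not isinstance(payload, dict):
            # Reject anything that is not a JSON object.
            self.send_response(400)
            self.end_headers()
            return
        body = json.dumps(handle_push(payload)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, accepting POST requests from the hosting platform."""
    HTTPServer(("", port), PushHandler).serve_forever()
```

A real deployment would also have to authenticate the sender, which this
sketch leaves out.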

The more problematic category is pre-push hooks.  We use them for checking
and rejecting commits with

* disallowed branches
* non-conformant whitespace
* wrong EOL style
* multiple heads per named branch
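
To give an idea of the kind of logic involved, here is a minimal sketch of
three of these checks as plain functions.  This is not the code of the actual
hooks: the function names are made up, "non-conformant whitespace" is assumed
to mean trailing spaces or tabs, and "wrong EOL style" is assumed to mean CRLF
line endings.

```python
def check_file(name, data):
    """Return a list of problems found in the bytes of one file."""
    problems = []
    if b"\r\n" in data:
        problems.append("%s: CRLF line endings" % name)
    # bytes.splitlines() drops the line terminators, so only spaces and
    # tabs at the end of a line are counted as trailing whitespace.
    for lineno, line in enumerate(data.splitlines(), 1):
        if line != line.rstrip(b" \t"):
            problems.append("%s:%d: trailing whitespace" % (name, lineno))
    return problems


def check_branches(branches, allowed):
    """Return a problem for every pushed branch that is not allowed."""
    return ["%s: branch not allowed" % b
            for b in sorted(branches) if b not in allowed]


def check_heads(heads_by_branch):
    """Return a problem for every named branch with more than one head."""
    return ["%s: %d heads" % (branch, len(heads))
            for branch, heads in sorted(heads_by_branch.items())
            if len(heads) > 1]
```

A pre-push hook would reject the whole push if any of these lists is
non-empty; the hard part is having a place to run it before the push is
accepted.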

As far as I know, neither GH nor BB supports such hooks, since they are highly
project-specific.  However, they are only used in cpython and related
repositories, so that doesn't concern migration of doc-only repos.

Sure, you can let the CI run the checks, but that doesn't prohibit merging
and is circumvented by direct pushes to the repository that don't go through
the PR system.  (And please don't make me as a coredev open a PR for every
change.)

>     From my perspective, swapping out Mercurial for git achieves exactly nothing
>     in terms of alleviating the review bottleneck (since the core developers
>     that strongly prefer the git UI will already be using an adapter), and is in
>     fact likely to make it worse by putting the greatest burden in adapting to
>     the change on the folks that are already under the greatest time pressure.
> 
> 
> That's not entirely true. If you are pushing a PR shift in our patch acceptance
> workflow then Bitbucket vs. GitHub isn't fundamentally any different in terms of
> benefit, and I would honestly argue that GitHub's PR experience is better. IOW
> either platform is of equal benefit.

In my opinion, scattering repos over github, bitbucket and hg.python.org is
even less friendly to contributors than a centralized place.  (We are already
approaching this, with pydotorg and infrastructure repos on github.)  So I'm
going to add some thoughts here about migrating the main CPython to git+hub.

We have to consider how well our branch workflow works with the PR
workflow.  There's no gain in the ability to easily merge a PR to one branch
via github when the subsequent merge of 3.x to default/master requires a local
pull/push cycle, as well as the 2.x backport.

As far as I know, you'd have to open a pull/merge request yourself and instantly
merge it, except if there are conflicts between branches, in which case you
are again forced to go local.  I don't need to mention that this doesn't work
well when someone makes a concurrent commit to the source branch in the
meantime.

And I don't think you'd want to force people to submit 3 pull requests for
the individual branches.

The next point is that there is no easy way to change the target branch of
a pull request (on github or bitbucket).  People will usually make patches
against the master branch unless told differently explicitly, which means
that the pull request will also be against the master branch.  Which means,
close the PR, ask submitter to resubmit against 3.x branch, or do it
yourself.

>     It's also worth keeping in mind that changing the underlying VCS means
>     changing *all* the automation scripts, rather than just updating the
>     configuration settings to reflect a new hosting URL.
> 
> 
> What automation scripts are there that would require updating? I would
> like to see a list and to have the difficulty of moving them mentioned to know what
> the impact would be.

Compiling that list is likely the most difficult step of updating them.

> So here is what I want to know to focus this discussion:

[...]

> Third, do our release managers care about hg vs. git strongly? They probably use
> the DVCS the most directly and at a lower level by necessity compared to anyone
> else.

I wouldn't care about git or hg.  What I *might* care about is the low-level
access to the repository on hg.python.org, mainly because of the hooks.

> Fourth, do any core developers feel strongly about not using GitHub? Now please
> notice I said "GitHub" and not "git"; I think the proper way to frame this whole
> discussion is we are deciding if we want to switch to Bitbucket or GitHub who
> provide a low-level API for their version control storage service through hg or
> git, respectively. I personally dislike git, but I really like GitHub and I
> don't even notice git there since I use GitHub's OS X app; as I said, I view
> this as choosing a platform and not the underlying DVCS as I have happily chosen
> to access the GitHub hosting service through an app that is not git (it's like
> accessing a web app through its web page or its REST API).

Github is a nice platform, but many of its features would not be relevant for
the cpython repo (e.g. issue tracker or wiki).  The usefulness of pull requests
is dubious, see above.

The "but it is much easier to contribute simple changes" argument is a bit
simplified: for any nontrivial patch, the time spent on working out the code
should outweigh time spent with "hg diff" or "click on pull request".  And
while Travis CI is nice, running relevant tests locally is *much* quicker than
waiting for a full test suite run on a virtualized testing machine.

As for typo fixes, the world does not end when some typos aren't fixed.
Anyway, for the docs we have an explicit offer to send anything, patch or
just suggestion, to docs at python.org, and people do make use of it.  No
github account even required.

> At least for me, until we get a clear understanding of what workflow changes we
> are asking for both contributors and core developers and exactly what work would
> be necessary to update our infrastructure for either Bitbucket or GitHub we
> can't really have a reasonable discussion that isn't going to be full of guessing.

As with svn->hg, a PEP and a champion would be necessary, this time with the
possibility of being rejected.

> And I'm still in support no matter what of breaking out the HOWTOs and the
> tutorial into their own repos for easier updating (having to update the Python
> porting HOWTO in three branches is a pain when it should be consistent across
> Python releases).

I see no problem with that, provided there's a cronjob that syncs the version
in Doc/ to the external version reasonably often.
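
The sync step itself could be quite small.  A minimal sketch, assuming the
external repo has already been updated to a local checkout and that a plain
one-way file copy into Doc/ is good enough (the function name and both paths
are placeholders, and files deleted in the external repo are not removed):

```python
import filecmp
import os
import shutil


def sync_tree(src, dst):
    """Copy files under src that are missing or different in dst.

    Returns the relative paths of the files that were copied.
    """
    copied = []
    for root, dirs, files in os.walk(src):
        # Skip VCS metadata directories such as .hg or .git.
        dirs[:] = [d for d in dirs if not d.startswith(".")]
        rel = os.path.relpath(root, src)
        target_dir = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in sorted(files):
            source = os.path.join(root, name)
            target = os.path.join(target_dir, name)
            if not os.path.exists(target) or not filecmp.cmp(
                    source, target, shallow=False):
                shutil.copy2(source, target)
                copied.append(os.path.normpath(os.path.join(rel, name)))
    return copied
```

The cronjob would call this after pulling the external repo and commit only
if the returned list is non-empty.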

cheers,
Georg


