[core-workflow] My initial thoughts on the steps/blockers of the transition

Brett Cannon brett at python.org
Tue Jan 5 13:03:18 EST 2016


On Mon, 4 Jan 2016 at 21:54 Ezio Melotti <ezio.melotti at gmail.com> wrote:

> On Tue, Jan 5, 2016 at 2:42 AM, Brett Cannon <brett at python.org> wrote:
> > So consider this the starting discussion of the PEP that will be the
> > hg.python.org -> GitHub transition PEP that I will be in charge of.
> Once we
> > have general agreement on the steps necessary I will start the actual PEP
> > and check it in, but I figure there's no point in having a skeleton PEP if
> we
> > can't agree on the skeleton. :) While I list steps influencing all the
> > repos, I want to focus on the ones stopping any repo from moving over for
> > now, expanding what we worry about to the cpython repo as we knock
> blockers
> > down until we move everything over and start adding GitHub perks.
> >
> > The way I see it, we have 5 repos to move: devinabox, benchmarks, peps,
> > devguide, and cpython.
>
> On top of this, there is also the test repo
> (https://hg.python.org/test) and all the tracker repos
> (https://hg.python.org/tracker/).
> I think it would be useful to port the former since it will provide a
> place for devs to try things out and experiment (a new test repo could
> also be created though).
> It would be nice to port the tracker repos too and be consistent with
> the others, but it's not a priority.  When we switched to HG they kept
> being on SVN until I ported them, so I guess the same thing can be
> done (unless R. David or Martin prefer to stick to HG).
>
> > I also think that's essentially the order we should
> > migrate them over. Some things will need to be universally handled
> before we
> > transition a single repo, while other things are only a blocker for some
> of
> > the repos.
> >
> > Universal blockers
> > ==============
> > There are four blockers that must be resolved before we even consider
> moving
> > a repo over. They can be solved in parallel, but they all need to have a
> > selected solution before we can move even the devinabox repo.
> >
> > First, we need to decide how we are going to handle adding all the core
> devs
> > to GitHub. Are we simply going to add all of them to the python
> > organization, or do we want something like a specific python-dev team
> > that gets added to all of the relevant repos? Basically I'm not sure how
> > picky we want to be about the people who already have organization
> access on
> > GitHub about them implicitly getting access to the cpython repo at the
> end
> > of the day (and the same goes for any of the other repos in the python
> > organization). For tracking names, I figure we will create a file in the
> > devguide where people can check in their GitHub usernames and I can
> manually
> > add people as people add themselves to the file.
> >
>
> I think the current list of core-devs should be converted to a group
> and given access to the same repos they have access to now (i.e.
> cpython/devguide/peps and possibly others).  Then additional
> repo-specific groups can be created in case we want to let specific
> contributors work on peps or the devguide.
>

This seems to be the general consensus, so we will create a python-dev team
under the python org and add the core devs there.


>
> > Second, CLA enforcement. As of right now people go to
> > https://www.python.org/psf/contrib/contrib-form/, fill in the form, and
> then
> > Ewa gets an email where she manually flips a flag in Roundup. If we want
> to
> > use a web hook to verify someone has signed a CLA then we need to decide
> > where the ground truth for CLAs is. Do we want to keep using Roundup to
> > manage CLA agreements and thus add a GitHub field in bugs.python.org for
> > people's profile and a web hook or bot that will signal if someone has
> the
> > flag flipped on bugs.python.org?
>
> This can be done.  We can add a "GitHub" username field to Roundup
> users so that we can link the two.
>

OK, so it sounds like we will stick with our current CLA signing flow and
write our own CLA bot that will query Roundup as to whether someone has
signed the CLA or not and then throw up a banner signalling if someone has
(not) signed and an appropriate link to the CLA. That will require some
Roundup work and the creation of the bot.

I should also mention, any bot creations we do should abstract out the code
review tool so that when we change providers again in the future it will be
more straightforward to just update some select APIs rather than rewrite
every bot we create.
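
To make that abstraction a bit more concrete, here is a rough Python sketch of how the CLA bot could be split up. Every name in it (CodeReviewTool, check_cla, FakeTool, the banner wording) is hypothetical, and the Roundup query is stood in for by a plain callable, since that API does not exist yet. Only the class implementing CodeReviewTool would need to change if we switch providers.

```python
from typing import Callable, Protocol

# The existing CLA form mentioned earlier in this thread.
CLA_URL = "https://www.python.org/psf/contrib/contrib-form/"


class CodeReviewTool(Protocol):
    """Hypothetical interface: the only part that knows about the provider."""

    def pr_author(self, pr_id: int) -> str: ...

    def set_banner(self, pr_id: int, message: str) -> None: ...


def check_cla(tool: CodeReviewTool, pr_id: int,
              has_signed_cla: Callable[[str], bool]) -> bool:
    """Mark a PR as CLA signed or not signed and return the result.

    has_signed_cla is an assumed stand-in for whatever ends up
    querying Roundup for the CLA flag.
    """
    author = tool.pr_author(pr_id)
    signed = has_signed_cla(author)
    if signed:
        tool.set_banner(pr_id, "CLA signed")
    else:
        tool.set_banner(pr_id, f"CLA not signed; see {CLA_URL}")
    return signed


class FakeTool:
    """In-memory provider used only to exercise the sketch."""

    def __init__(self, authors):
        self.authors = authors
        self.banners = {}

    def pr_author(self, pr_id):
        return self.authors[pr_id]

    def set_banner(self, pr_id, message):
        self.banners[pr_id] = message
```

A GitHub-backed class would implement the same two methods, and check_cla() would not need to know the difference.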


>
>
> > Or is there some prepackaged service that
> > we can use that will keep track of this which would cause us to not use
> > Roundup (which might be easier, but depending on the service require
> > everyone to re-sign)? There's also the issue of supporting people who
> want
> > to submit code by uploading a patch to bugs.python.org but not use
> GitHub.
> > Either way I don't want to have to ask everyone who submits a PR what
> their
> > bugs.python.org username is and then go check that manually.
> >
>
> This also brings up another problem.
> Since the discussion about an issue happens on b.p.o and the PRs are
> submitted on GitHub, this means that:
> 1) users with only a GitHub account have to create a b.p.o account if
> they want to comment on the issue (excluding review comments);
> 2) users with only a b.p.o account have to create a GitHub account if
> they want to review a PR;
> 3) users with both can comment on b.p.o and review on GitHub, but they
> might need to login twice.
>
> It would be better if users didn't need to create and use two separate
> accounts.
>

If we can add GitHub as a login/creation option for b.p.o accounts then
that solves that. But I'm willing to bet a majority of people will already
have a GitHub account and we have always required the b.p.o account so #1
is going to be the common case.


>
> > Third, how do we want to do the repo conversions? We need to choose the
> > tool(s) and command(s) that we want to use. There was mention of wanting
> a
> > mapping from hg commit ID to git commit ID. If we have that we could
> have a
> > static bugs.python.org/commit/<ID> page that had the mapping embedded in
> > some JavaScript and if <ID> matched then we just forward them to the
> > corresponding GitHub commit page, otherwise just blindly forward to
> GitHub
> > and assume the ID is git-only, giving us a stable URL for commit web
> views.
> >
>
> As I mentioned on python-committers, we already have
> https://hg.python.org/lookup/ .
> This is currently used to map SVN->HG (e.g.
> https://hg.python.org/lookup/r12345 ), and should be extended to
> handle cs ids too.
> The b.p.o linkifier can just convert all revision numbers and cs ids
> to a https://hg.python.org/lookup/ link and let the lookup page figure
> out where to redirect the user.
>
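
Whichever page ends up hosting it, the redirect logic itself is small. A minimal sketch in Python, assuming the hg -> git mapping is available as a dict and that the repo ends up at github.com/python/cpython (both are assumptions, as are the function and constant names):

```python
# Assumed final location of the repo; not decided anywhere in this thread.
GITHUB_COMMIT_URL = "https://github.com/python/cpython/commit/{}"


def commit_url(commit_id, hg_to_git):
    """Return the GitHub URL for an hg or git commit ID.

    hg_to_git maps full hg changeset IDs to git commit IDs. A unique
    prefix of an hg ID is accepted; anything that does not match exactly
    one hg ID is assumed to be git-only and forwarded unchanged.
    """
    commit_id = commit_id.strip().lower()
    matches = [git_id for hg_id, git_id in hg_to_git.items()
               if commit_id and hg_id.startswith(commit_id)]
    git_id = matches[0] if len(matches) == 1 else commit_id
    return GITHUB_COMMIT_URL.format(git_id)
```

SVN revision numbers (the r12345 form) are left out here since the existing lookup page already handles them.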
> > Fourth, for the ancillary repos of devinabox, peps, benchmarks, and
> > devguide, do we care if we use the GitHub merge button for PRs or do we
> want
> > to enforce a linear history with all repos? We just need to decide if we
> > care about linear histories and then we can move forward since any bot we
> create
> > won't block us from using GitHub.
> >
> > Those four things are enough to move devinabox over. It probably is
> enough
> > for the benchmarks suite, but I have an email to speed@ asking if people
> > want to use this opportunity to re-evaluate the benchmark suite and make
> any
> > changes that will affect repo size (e.g., use pip to pull in the
> libraries
> > and frameworks used by a benchmark rather than vendoring their code,
> making
> > the repo much smaller).
> >
> > Website-related stuff
> > ================
> > This also almost gets us the peps repo, but we do need to figure out how
> to
> > change the website to build from the git checkout rather than an hg one.
> > Same goes for the devguide. It would be great if we can set up web hooks
> to
> > immediately trigger rebuilds of those portions of the sites instead of
> > having to wait until a cronjob triggers.
> >
>
> I think we should make hg.python.org read-only but keep it around and
> in sync with the GitHub repo (either via cronjobs or hooks).  This
> will allow people to contribute using HG in the same way that the
> current GitHub clone allows people to contribute using git.  It will
> also avoid breaking all the tools that currently use hg.python.org
> (and buys us more time to port them if/when needed).
>

That's easy to say, but someone also has to maintain hg.python.org then and
we are doing this move partially to try and cut down on the amount of
custom infrastructure that we maintain. If people are that worried about
others being so averse to using GitHub that they won't even do an
anonymous clone from their servers then we can get a Bitbucket or GitLab
clone set up, but I would rather try and cut out our repo hosting services
if possible (who knows, maybe we can even finally retire svn.python.org
thanks to shallow clones or something).


>
> > CPython requirements
> > =================
> > There are six things to work out before we move over cpython. First, do
> we
> > want to split out Python 2 branches into their own repo? There might be a
> > clone size benefit which obviously is nice for people on slow Internet
> > connections. It also clearly separates out Python 2 from 3 and lets those
> > who prefer to focus on one compared to the other do that more easily. It
> > does potentially make any single fix that spans 2 and 3 require a bit
> more
> > work since it won't be an intra-clone change. We could also contemplate
> > sub-repos for things like the Doc/ or Tools/ directories (although I
> doubt
> > it's worth it).
> >
>
> I think we should keep 2/3 together.  We could split the stdlib from
> the rest, but that's a separate issue.
>

This seems to be the general consensus, so we will plan to keep cpython as
a single repo.


>
> > Second, do we want all fixes to go into master and then cherry-pick into
> > other branches, or do we want to follow our current practice of going
> into
> > the active feature branch and then merge into master? I personally prefer
> > the former and I think most everyone else does as well, but I thought it
> > should be at least thought about.
> >
>
> Master first and cherry-picking for older branches sounds good to me,
> but I don't know if switching model will have any implications,
> especially while going through the history or using tools like bisect.
>

This seems to be the general consensus, so we will plan to cherry pick
commits into older branches.


>
> > Third, how to handle Misc/NEWS? We can add a NEWS field to
> bugs.python.org
> > and then generate the NEWS file by what issues are tied to what version
> and
> > when they were closed. The other approach is to create a file per NEWS
> entry
> > in a version-specific directory (Larry created code for hg already for
> this
> > to give people an idea: http://bugs.python.org/issue18967). Then when
> we cut
> > a release we run a tool that slurps up all of the relevant files -- which
> > includes files in the directory for the next feature release which
> represent
> > fixes which were cherry picked -- and generates the NEWS file for the
> final
> > release. The per-file approach is bot-friendly and also CLI-friendly, but
> > potentially requires more tooling and I don't know if people feel news
> > entries should be tied to the issue or in the repo (although that assumes
> > people find tweaking Roundup easy :).
> >
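
For the file-per-entry approach, the release-time tool could be roughly this simple. This is only a sketch and not Larry's code from the issue above: the directory layout (one sub-directory per version, one .txt file per entry), the heading text, and the name build_news are all assumptions.

```python
from pathlib import Path


def build_news(root, versions):
    """Combine one-file-per-entry NEWS items into a single text blob.

    root is the directory holding one sub-directory per version;
    versions lists the sub-directories to include, in output order
    (e.g. the release being cut plus the next feature release, to pick
    up cherry-picked fixes).
    """
    lines = []
    for version in versions:
        version_dir = Path(root) / version
        if not version_dir.is_dir():
            continue
        entries = sorted(version_dir.glob("*.txt"))
        if not entries:
            continue
        lines.append(f"What's New in Python {version}")
        lines.append("")
        for entry in entries:
            # Collapse whitespace so each entry becomes a single bullet.
            text = " ".join(entry.read_text(encoding="utf-8").split())
            lines.append(f"- {text}")
        lines.append("")
    return "\n".join(lines)
```

Because each entry is its own file, two PRs adding NEWS entries never touch the same lines, which is the main thing making this approach bot-friendly.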
> > Fourth, we need to decide exactly what commands we expect core devs to
> run
> > initially for committing code. Since we agreed to a linear history we
> need
> > to document exactly what we expect people to do for a PR to get it into
> > their git repo. This will go into the devguide -- probably will want to
> > start a github branch at some point -- and will act as the commands the
> bot
> > will want to work off of.
> >
>
> I would like to see a complete list of steps from starting to work on
> an issue to having it in the repo, at least to understand the new
> workflow.  This doesn't have to include all the specific commands, but
> at least the basic steps (e.g. after I made a patch do I commit it and
> send a pull request to the main repo, or do I push it to my GitHub
> clone and push a button to send the PR?  Do I need to create a branch
> before I start working on an issue?).
>

There will be a step-by-step guide in the devguide to answer all of this
before we make any switch.


>
> > Fifth, what to do about Misc/ACKS? Since we are using PRs, even if we
> > flatten them, I believe the PR creators will get credit in the commit as
> the
> > author while the core dev committing will be flagged as the person doing
> the
> > merge (someone correct me if I'm wrong because if I am this whole point
> is
> > silly). With the commits containing credit directly, we can either
> > automatically generate Misc/ACKS like the NEWS file or simply drop it for
> > future contributors and just leave the file for past contributors since
> git
> > will have kept track for us.
> >
>
> We could keep updating for regular patches with no related PR and add
> a note about all the other GIT contributors (possibly with a git
> command that lists all authors).
> Later on we might decide to have a script that adds all
> the GIT contributors automatically.
>

This seems to be the general consensus, so we will keep Misc/ACKS around
and have a tool that updates it based on git PR commits at release-time.
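
As a rough idea of what that tool might do, here is a sketch in Python. It assumes the author names come from something like the output of `git log --format=%aN` for the release range and that Misc/ACKS holds one name per line; the function name is made up, and real ACKS ordering (roughly alphabetical by last name) is not handled.

```python
def new_acks_names(acks_text, git_log_output):
    """Return author names from git that are not yet listed in ACKS.

    acks_text is the current contents of the ACKS file (one name per
    line is assumed); git_log_output is one author name per line.
    Comparison is case-insensitive and duplicates are dropped.
    """
    known = {line.strip().casefold()
             for line in acks_text.splitlines() if line.strip()}
    seen = set()
    new_names = []
    for line in git_log_output.splitlines():
        name = line.strip()
        key = name.casefold()
        if name and key not in known and key not in seen:
            seen.add(key)
            new_names.append(name)
    return sorted(new_names, key=str.casefold)
```

Whoever cuts the release would still want to eyeball the result, since the same person can show up under more than one spelling.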


>
> > Sixth, we will need to update our Buildbot fleet.
> >
>
> If we keep hg.p.o around and updated, we might not have to do this now
> (even though now is better than never).
>
> > This gets us to the bare minimum needed to function.
> >
> > Parity with hg.python.org
> > ----------------------------------
> > For parity, there are some Roundup integrations that will be necessary,
> like
> > auto-generating links, posting commits to #python-dev on IRC, etc. I'm
> not
> > sure if people want to block until that is all in place or not. I do
> think
> > we should make sure there is some web hook that can take an issue # from
> the
> > title of a PR and automatically posts to the corresponding issue on
> > bugs.python.org that the PR exists. If people disagree then feel free
> to say
> > so.
> >
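
The title-parsing half of such a web hook is easy to sketch. The title convention below ("Issue #NNNN: ...") is an assumption, since we have not agreed on one, and the function names are made up; the URL form matches the existing bugs.python.org issue links.

```python
import re

# Assumed convention: the PR title mentions "issue 1234" or "Issue #1234".
ISSUE_RE = re.compile(r"\bissue\s*#?\s*(\d+)", re.IGNORECASE)


def issue_number(pr_title):
    """Return the first issue number found in a PR title, or None."""
    match = ISSUE_RE.search(pr_title)
    return int(match.group(1)) if match else None


def issue_url(number):
    """Return the bugs.python.org URL for an issue number."""
    return f"http://bugs.python.org/issue{number}"
```

The hook would then post the PR's URL as a comment on the issue at issue_url(n), which is the part that needs Roundup work.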
>
> FWIW I started adding notes to
> https://wiki.python.org/moin/TrackerDevelopmentPlanning to track
> everything that needs to be done on the Roundup side.
> If you prefer I can later move this to the new PEP, but for now I'm
> using it to keep track of all the things that come up in the various
> threads.
>

Nope, the wiki is fine for that sort of thing.

-Brett