core-workflow
January 2016
Here is the latest version of the PEP. Since no one seems to be bringing up
issues of missing steps or incorrect priorities, I think it's time to start
work! The initial key todos relate to getting the ancillary PEPs moved
over. I've taken on responsibility for writing the CLA bot (the
pre-existing solutions don't seem to be maintained or are locked down to
specific CLA signing solutions). The remaining items are:
- Create a 'python-dev' team
- Define commands to move a Mercurial repository to Git
- Adding GitHub username support to bugs.python.org
- How to update peps webpages from the future Git repo
- How to update the devguide webpages from the future Git repo
If anyone wants to step forward and help, then please do! I just ask you
keep all of us up-to-date on what's going on. And if people want to work on
some other task that's related to the cpython repo then that's fine as
well, but do realize it might be quite a while before your work gets used
(if at all, as things could potentially change).
----------
PEP: 512
Title: Migrating from hg.python.org to GitHub
Version: $Revision$
Last-Modified: $Date$
Author: Brett Cannon <brett(a)python.org>
Discussions-To: core-workflow(a)python.org
Status: Active
Type: Process
Content-Type: text/x-rst
Created: 17-Jan-2015
Post-History: 17-Jan-2016, 19-Jan-2016, 23-Jan-2016
Abstract
========
This PEP outlines the steps required to migrate Python's development
process from Mercurial [#hg]_ as hosted at
hg.python.org [#h.p.o]_ to Git [#git]_ on GitHub [#GitHub]_. Meeting
the minimum goals of this PEP should allow for the development
process of Python to be as productive as it currently is, and meeting
its extended goals should improve it.
Rationale
=========
In 2014, it became obvious that Python's custom development
process was becoming a hindrance. As an example, for an external
contributor to submit a fix for a bug that eventually was committed,
the basic steps were:
1. Open an issue for the bug at bugs.python.org [#b.p.o]_.
2. Check out the CPython source code from hg.python.org [#h.p.o]_.
3. Make the fix.
4. Upload a patch.
5. Have a core developer review the patch using our fork of the
Rietveld code review tool [#rietveld]_.
6. Download the patch to make sure it still applies cleanly.
7. Run the test suite manually.
8. Commit the change manually.
9. If the change was for a bugfix release, merge into the
in-development branch.
10. Run the test suite manually again.
11. Commit the merge.
12. Push the changes.
This is a very heavy, manual process for core developers. Even in the
simple case, you could only possibly skip the code review step, as you
would still need to build the documentation. This led to patches
languishing on the issue tracker due to core developers not being
able to work through the backlog fast enough to keep up with
submissions. In turn, that led to a side-effect issue of discouraging
outside contribution due to frustration from lack of attention, which
is a dangerous problem for an open source project as it runs counter to
having a viable future for the project. While accepting patches
uploaded to bugs.python.org [#b.p.o]_ is potentially simple for an
external contributor, it is as slow and burdensome as it gets for
a core developer to work with.
Hence the decision was made in late 2014 that a move to a new
development process was needed. A request for PEPs
proposing new workflows was made, in the end leading to two:
PEP 481 and PEP 507 proposing GitHub [#github]_ and
GitLab [#gitlab]_, respectively.
The year 2015 was spent off-and-on working on those proposals and
trying to tease out details of what made them different from each
other on the core-workflow mailing list [#core-workflow]_.
PyCon US 2015 also showed that the community was a bit frustrated
with our process due to both cognitive overhead for new contributors
and how long it was taking for core developers to
look at a patch (see the end of Guido van Rossum's
keynote at PyCon US 2015 [#guido-keynote]_ as an example of the
frustration).
On January 1, 2016, the decision was made by Brett Cannon to move the
development process to GitHub. The key reasons for choosing GitHub
were [#reasons]_:
* Maintaining custom infrastructure has been a burden on volunteers
(e.g., a custom fork of Rietveld [#rietveld]_
that is not being maintained is currently being used).
* The custom workflow is very time-consuming for core developers
(not enough automated tooling built to help support it).
* The custom workflow is a hindrance to external contributors
(acts as a barrier of entry due to time required to ramp up on
development process).
* There is no feature differentiating GitLab from GitHub beyond
GitLab being open source.
* Familiarity with GitHub is far higher amongst core developers and
external contributors than with GitLab.
* Our BDFL prefers GitHub (who would be the first person to tell
you that his opinion shouldn't matter, but the person making the
decision felt it was important that the BDFL feel comfortable with
the workflow of his own programming language to encourage his
continued participation).
There's even already an unofficial image to use to represent the
migration to GitHub [#pythocat]_.
The overarching goal of this migration is to improve the development
process to the extent that a core developer can go from external
contribution submission through all the steps leading to committing
said contribution all from within a browser on a tablet with WiFi
using *some* development process (this does not inherently mean
GitHub's default workflow). All of this will be done in such a way
that if an external contributor chooses not to use GitHub then they
will continue to have that option.
Repositories to Migrate
=======================
While hg.python.org [#h.p.o]_ hosts many repositories, there are only
five key repositories that should move:
1. devinabox [#devinabox-repo]_
2. benchmarks [#benchmarks-repo]_
3. peps [#peps-repo]_
4. devguide [#devguide-repo]_
5. cpython [#cpython-repo]_
The devinabox and benchmarks repositories are code-only.
The peps and devguide repositories involve the generation of webpages.
And the cpython repository has special requirements for integration
with bugs.python.org [#b.p.o]_.
Migration Plan
==============
The migration plan is separated into sections based on what is
required to migrate the repositories listed in the
`Repositories to Migrate`_ section. Completion of requirements
outlined in each section should unblock the migration of the related
repositories. The sections are expected to be completed in order, but
not necessarily the requirements within a section.
Requirements for Code-Only Repositories
---------------------------------------
Completion of the requirements in this section will allow the
devinabox and benchmarks repositories to move to
GitHub. While devinabox has a sufficiently descriptive name, the
benchmarks repository does not; therefore, it will be named
"python-benchmark-suite".
Create a 'python-dev' team
''''''''''''''''''''''''''
To manage permissions, a 'python-dev' team will be created as part of
the python organization [#github-python-org]_. Any repository that is
moved will have the 'python-dev' team added to it with write
permissions [#github-org-perms]_. Anyone who previously had rights to
manage SSH keys on hg.python.org will become a team maintainer for the
'python-dev' team.
Define commands to move a Mercurial repository to Git
'''''''''''''''''''''''''''''''''''''''''''''''''''''
Since moving to GitHub also entails moving to Git [#git]_, we must
decide what tools and commands we will run to translate a Mercurial
repository to Git. The exact tools and steps to use are an
open issue; see `Tools and commands to move from Mercurial to Git`_.
CLA enforcement
'''''''''''''''
A key part of any open source project is making sure that its source
code can be properly licensed. This requires making sure all people
making contributions have signed a contributor license agreement
(CLA) [#cla]_. Up until now, CLA signing of
contributed code has been enforced by core developers checking
whether someone had an ``*`` by their username on
bugs.python.org [#b.p.o]_. With this migration, the plan is to start
off with automated checking and enforcement of contributors signing
the CLA.
Adding GitHub username support to bugs.python.org
+++++++++++++++++++++++++++++++++++++++++++++++++
To keep tracking of CLA signing under the direct control of the PSF,
tracking who has signed the PSF CLA will be continued by marking that
fact as part of someone's bugs.python.org user profile. What this
means is that an association will be needed between a person's
bugs.python.org [#b.p.o]_ account and their GitHub account, which
will be done through a new field in a user's profile.
This does implicitly require that contributors will need both a
GitHub [#github]_ and bugs.python.org account in order to sign the
CLA and contribute through GitHub.
A bot to enforce CLA signing
++++++++++++++++++++++++++++
With an association between someone's GitHub account and their
bugs.python.org [#b.p.o]_ account, which has the data as to whether
someone has signed the CLA, a bot can monitor pull requests on
GitHub and denote whether the contributor has signed the CLA.
If the user has signed the CLA, the bot will add a positive label to
the pull request to denote that it has no CLA issues (e.g., a green
label stating, "CLA: ✓"). If the contributor has not signed a CLA,
a negative label will be added (e.g., a red label stating, "CLA: ✗")
and the pull request will be blocked using GitHub's status API. If a
contributor lacks a bugs.python.org account, that will lead to
another label (e.g., "CLA: ✗ (no account)"). Using a label for both
positive and negative cases provides a fallback notification if the
bot happens to fail, preventing potential false-positives or
false-negatives. It also allows for an easy way to trigger the bot
again by simply removing a CLA-related label.
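The labelling rules above can be sketched as a small decision function. This is only an illustration: the ``bpo_account`` record and its ``cla_signed`` key are hypothetical stand-ins for whatever bugs.python.org ends up exposing, and only the label texts come from the examples in this section.

```python
from typing import Optional

# Label texts are taken from the examples in the surrounding text.
CLA_OK = "CLA: ✓"
CLA_MISSING = "CLA: ✗"
CLA_NO_ACCOUNT = "CLA: ✗ (no account)"


def cla_label(bpo_account: Optional[dict]) -> str:
    """Return the label for a pull request author.

    ``bpo_account`` is a hypothetical record looked up by GitHub username;
    ``None`` means no bugs.python.org account is associated with the user.
    """
    if bpo_account is None:
        return CLA_NO_ACCOUNT
    if bpo_account.get("cla_signed"):
        return CLA_OK
    return CLA_MISSING


def blocks_pull_request(label: str) -> bool:
    """Only the positive label lets the pull request proceed."""
    return label != CLA_OK
```

Because every outcome yields a label, a pull request with no CLA label at all can be read as "the bot has not run yet", which is the fallback behaviour described above.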
If no pre-existing, maintained bot exists that fits our needs, one
will be written from scratch. It will be hosted on Heroku [#heroku]_
and written to target Python 3.5 to act as a showcase for
asynchronous programming. The bot's actual name is an open issue:
`Naming the bots`_.
Requirements for Web-Related Repositories
-----------------------------------------
Due to their use for generating webpages, the
devguide [#devguide-repo]_ and peps [#peps-repo]_ repositories need
their respective processes updated to pull from their new Git
repositories.
The devguide repository might also need to be named
``python-devguide`` to make sure the repository is not ambiguous
when viewed in isolation from the
python organization [#github-python-org]_.
Requirements for the cpython Repository
---------------------------------------
Obviously the most active and important repository currently hosted
at hg.python.org [#h.p.o]_ is the cpython
repository [#cpython-repo]_. Because of its importance and
high-frequency use, it requires more tooling before being moved to GitHub
compared to the other repositories mentioned in this PEP.
Document steps to commit a pull request
'''''''''''''''''''''''''''''''''''''''
During the process of choosing a new development workflow, it was
decided that a linear history is desired. People preferred having a
single commit representing a single change instead of having a set of
unrelated commits lead to a merge commit that represented a single
change. This means that the convenient "Merge" button in GitHub pull
requests is undesirable, as it creates a merge commit along with all
of the contributor's individual commits (this does not affect the
other repositories where the desire for a linear history doesn't
exist).
Luckily, Git [#git]_ does not require GitHub's workflow and so one can
be chosen which gives us a linear history by using Git's CLI. The
expectation is that all pull requests will be rebased and
fast-forwarded before being pushed to the master repository. This should
give proper attribution to the pull request author in the Git
history. This does have the consequence of losing some GitHub
features such as automatic closing of pull requests, link generation,
etc.
A second set of recommended commands will also be written for
committing a contribution from a patch file uploaded to
bugs.python.org [#b.p.o]_. This will obviously help keep the linear
history, but it will need to be made to have attribution to the patch
author.
The exact sequence of commands that will be given as guidelines to
core developers is an open issue:
`Git CLI commands for committing a pull request to cpython`_.
Handling Misc/NEWS
''''''''''''''''''
Traditionally the ``Misc/NEWS`` file [#news-file]_ has been problematic
for changes which spanned Python releases. Oftentimes there will be
merge conflicts when committing a change between e.g., 3.5 and 3.6
only in the ``Misc/NEWS`` file. It's so common, in fact, that the
example instructions in the devguide explicitly mention how to
resolve conflicts in the ``Misc/NEWS`` file
[#devguide-merge-across-branches]_. As part of our tool
modernization, working with the ``Misc/NEWS`` file will be
simplified.
There are currently three competing approaches to solving the
``Misc/NEWS`` problem which are discussed in an open issue:
`How to handle the Misc/NEWS file`_.
Handling Misc/ACKS
''''''''''''''''''
Traditionally the ``Misc/ACKS`` file [#acks-file]_ has been managed
by hand. But thanks to Git supporting an ``author`` value as well as
a ``committer`` value per commit, authorship of a commit can be part
of the history of the code itself.
As such, manual management of ``Misc/ACKS`` will become optional. A
script will be written that will collect all author and committer
names and merge them into ``Misc/ACKS`` with all of the names listed
prior to the move to Git. Running this script will become part of the
release process.
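A minimal sketch of such a script follows. The function names are hypothetical, and sorting case-insensitively by full name is an assumption made for illustration; it is not necessarily the ordering ``Misc/ACKS`` should use. The ``%an`` and ``%cn`` placeholders are Git's format codes for author name and committer name.

```python
import subprocess


def git_contributor_names(repo_path: str) -> set:
    """Collect every author and committer name recorded in the Git history."""
    # %an is the author name, %cn the committer name, %n a newline.
    output = subprocess.run(
        ["git", "-C", repo_path, "log", "--format=%an%n%cn"],
        check=True, capture_output=True, text=True,
    ).stdout
    return {line.strip() for line in output.splitlines() if line.strip()}


def merge_acks(existing_names, git_names):
    """Merge pre-migration names with names found in Git, without duplicates."""
    merged = set(existing_names) | set(git_names)
    # Assumption: a simple case-insensitive sort by full name.
    return sorted(merged, key=str.casefold)
```

The names listed prior to the move would be passed as ``existing_names``, so nobody credited before the migration is dropped.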
Linking pull requests to issues
'''''''''''''''''''''''''''''''
Historically, external contributions were attached to an issue on
bugs.python.org [#b.p.o]_ thanks to the fact that all external
contributions were uploaded as a file. For changes committed by a
core developer who committed a change directly, the specifying of an
issue number in the commit message of the format ``Issue #`` at the
start of the message led to a comment being posted to the issue
linking to the commit.
Linking a pull request to an issue
++++++++++++++++++++++++++++++++++
An association between a pull request and an issue is needed to track
when a fix has been proposed. The association needs to be many-to-one
as it can take multiple pull requests to solve a single issue
(technically it should be a many-to-many association for when a
single fix solves multiple issues, but this is fairly rare and issues
can be merged into one using the ``Superseder`` field on the issue
tracker).
Association between a pull request and an issue will be done based on
detecting the regular expression ``[Ii]ssue #(?P<bpo_id>\d+)``. If
this is specified in either the title or in the body of a message on
a pull request then a connection will be made on
bugs.python.org [#b.p.o]_. A label will also be added to the pull
request to signify that the connection was made successfully. This
could lead to incorrect associations if the wrong issue number is
given or another issue is merely referenced, but these are rare occasions.
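As an illustration, the regular expression given above can be applied with Python's ``re`` module as follows. The helper name ``find_issue_ids`` is hypothetical; only the pattern itself comes from this section.

```python
import re

# The pattern is the one given in the text above.
ISSUE_RE = re.compile(r"[Ii]ssue #(?P<bpo_id>\d+)")


def find_issue_ids(title: str, body: str) -> list:
    """Return the issue numbers mentioned in a pull request, in first-seen order."""
    seen = []
    for text in (title, body):
        for match in ISSUE_RE.finditer(text):
            issue_id = int(match.group("bpo_id"))
            if issue_id not in seen:
                seen.append(issue_id)
    return seen
```

A pull request that mentions more than one issue would produce several IDs here, which is exactly the case where an incorrect association could be made.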
Notify the issue if the pull request is committed
+++++++++++++++++++++++++++++++++++++++++++++++++
Once a pull request is closed (merged or not), the issue should be
updated to reflect this fact.
Update linking service for mapping commit IDs to URLs
'''''''''''''''''''''''''''''''''''''''''''''''''''''
Currently you can use https://hg.python.org/lookup/ with a revision
ID from either the Subversion or Mercurial copies of the
cpython repo [#cpython-repo]_ to get redirected to the URL for that
revision in the Mercurial repository. The URL rewriter will need to
be updated to redirect to the Git repository and to support the new
revision IDs created for the Git repository.
Create https://git.python.org
'''''''''''''''''''''''''''''
Just as hg.python.org [#h.p.o]_ currently points to the Mercurial
repository for Python, git.python.org should do the equivalent for
the Git repository.
Backup of pull request data
'''''''''''''''''''''''''''
Since GitHub [#github]_ is going to be used for code hosting and code
review, those two things need to be backed up. In the case of code
hosting, the backup is implicit as all non-shallow Git [#git]_ clones
contain the full history of the repository, hence there will be many
backups of the repository.
The code review history does not have the same implicit backup
mechanism as the repository itself. That means a daily backup of code
review history should be done so that it is not lost in case of any
issues with GitHub. It also helps guarantee that a migration from
GitHub to some other code review system is feasible were GitHub to
disappear overnight.
Deprecate sys._mercurial
''''''''''''''''''''''''
Once Python is no longer kept in Mercurial, the ``sys._mercurial``
attribute will need to be changed to return ``('CPython', '', '')``.
An equivalent ``sys._git`` attribute will be added which fulfills the
same use-cases.
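Code that reads this build information could look up whichever attribute is present, along the lines of the sketch below. The helper name ``vcs_info`` is hypothetical, and ``getattr`` is used because neither attribute is guaranteed to exist on a given interpreter.

```python
import sys


def vcs_info():
    """Return (attribute name, value) for the first VCS attribute found, else None."""
    # Prefer the proposed sys._git and fall back to sys._mercurial.
    for name in ("_git", "_mercurial"):
        info = getattr(sys, name, None)
        if info is not None:
            return name, info
    return None
```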
Update the devguide
'''''''''''''''''''
The devguide will need to be updated with details of the new
workflow. Most likely work will take place in a separate branch
until the migration actually occurs.
Update PEP 101
''''''''''''''
The release process will need to be updated as necessary.
Optional, Planned Features
--------------------------
Once the cpython repository [#cpython-repo]_ is migrated, all
repositories will have been moved to GitHub [#github]_ and the
development process should be on equal footing as before. But a key
reason for this migration is to improve the development process,
making it better than it has ever been. This section outlines some
plans on how to improve things.
It should be mentioned that overall feature planning for
bugs.python.org [#b.p.o]_ -- which includes plans independent of this
migration -- is tracked on its own wiki page [#tracker-plans]_.
Bot to handle pull request merging
''''''''''''''''''''''''''''''''''
As stated in the section entitled
"`Document steps to commit a pull request`_", the desire is to
maintain a linear history for cpython. Unfortunately,
GitHub's [#github]_ web-based workflow does not support a linear
history. Because of this, a bot should be written to substitute for
GitHub's in-browser commit abilities.
To start, the bot should accept commands to commit a pull request
against a list of branches. This allows for committing a pull request
that fixes a bug in multiple versions of Python.
More advanced features such as a commit queue can come later. This
would linearly apply accepted pull requests and verify that the
commits did not interfere with each other by running the test suite
and backing out commits if the test run failed. To help facilitate
the speed of testing, all patches committed since the last test run
can be applied and run in a single test run as the optimistic
assumption is that the patches will work in tandem. Some mechanism to
re-run the tests in case of test flakiness will be needed, whether it
is from removing a "test failed" label, web interface for core
developers to trigger another testing event, etc.
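The optimistic batching idea can be sketched as follows. This is one possible strategy, not a settled design: ``tests_pass`` is a hypothetical callback standing in for "apply these pull requests and run the test suite", and the one-at-a-time fallback is an assumption about how failing changes would be identified and backed out.

```python
from typing import Callable, List, Sequence, Tuple


def process_queue(
    pending: Sequence[str],
    tests_pass: Callable[[List[str]], bool],
) -> Tuple[List[str], List[str]]:
    """Return (committed, backed_out) for the pending pull requests.

    ``tests_pass`` is a hypothetical callback that applies the given pull
    requests on top of the branch, runs the test suite, and reports success.
    """
    batch = list(pending)
    # Optimistic case: everything since the last run is tested together.
    if tests_pass(batch):
        return batch, []
    # Fallback: apply linearly, keeping only changes that leave the suite green.
    committed: List[str] = []
    backed_out: List[str] = []
    for pr in batch:
        if tests_pass(committed + [pr]):
            committed.append(pr)
        else:
            backed_out.append(pr)
    return committed, backed_out
```

In the common case only one test run is needed for the whole batch. The extra runs are paid only when something fails, and a flaky failure would still need the separate re-run mechanism mentioned above.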
Inspiration or basis of the bot could be taken from pre-existing bots
such as Homu [#homu]_ or Zuul [#zuul]_.
The name given to this bot in order to give it commands is an open
issue: `Naming the bots`_.
Continuous integration per pull request
'''''''''''''''''''''''''''''''''''''''
To help speed up pull request approvals, continuous integration
testing should be used. This helps mitigate the need for a core
developer to download a patch simply to run the test suite against
the patch.
Which free CI service to use is an open issue:
`Choosing a CI service`_.
Test coverage report
''''''''''''''''''''
Getting an up-to-date test coverage report for Python's standard
library would be extremely beneficial, as such a report can take
quite a while to produce.
There are a couple of pre-existing services that provide free test
coverage for open source projects. Which option is best is an open
issue: `Choosing a test coverage service`_.
Notifying issues of pull request comments
'''''''''''''''''''''''''''''''''''''''''
The current development process does not include notifying an issue
on bugs.python.org [#b.p.o]_ when a review comment is left on
Rietveld [#rietveld]_. It would be nice to fix this so that people
can subscribe only to comments at bugs.python.org and not
GitHub [#github]_ and yet still know when something occurs on GitHub
in terms of review comments on relevant pull requests. Current
thinking is to post a comment to bugs.python.org to the relevant
issue when at least one review comment has been made over a certain
period of time (e.g., 15 or 30 minutes). This keeps the email volume
down for those that receive both GitHub and bugs.python.org email
notifications while still making sure that those only following
bugs.python.org know when there might be a review comment to address.
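One way to read the batching rule above is that the first review comment opens a window, and a single notification goes out when that window closes. The sketch below implements that reading. The function name and the exact window semantics are assumptions; the 15-minute default comes from the example above.

```python
from datetime import datetime, timedelta
from typing import Iterable, List


def notification_times(
    comment_times: Iterable[datetime],
    window: timedelta = timedelta(minutes=15),
) -> List[datetime]:
    """Return when a summary comment would be posted to the issue.

    The first review comment opens a window; one notification is sent when
    the window closes, covering every comment made within it.
    """
    notifications: List[datetime] = []
    window_end = None
    for stamp in sorted(comment_times):
        if window_end is None or stamp >= window_end:
            window_end = stamp + window
            notifications.append(window_end)
    return notifications
```

Under this rule a burst of review comments within one window produces a single post on bugs.python.org, which is what keeps the email volume down.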
Allow bugs.python.org to use GitHub as a login provider
'''''''''''''''''''''''''''''''''''''''''''''''''''''''
As of right now, bugs.python.org [#b.p.o]_ allows people to log in
using Google, Launchpad, or OpenID credentials. It would be good to
expand this to GitHub credentials.
Web hooks for re-generating web content
'''''''''''''''''''''''''''''''''''''''
The content at https://docs.python.org/,
https://docs.python.org/devguide, and
https://www.python.org/dev/peps/ is all derived from files kept in
one of the repositories to be moved as part of this migration. As
such, it would be nice to set up appropriate webhooks to trigger
rebuilding the appropriate web content when the files they are based
on change instead of having to wait for, e.g., a cronjob to trigger.
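A webhook receiver could decide what to rebuild from the push event it is sent, roughly as sketched below. The repository names in ``BUILD_TARGETS`` and the single-branch rule are assumptions for illustration (the final repository names are not settled in this PEP); the ``ref`` and ``repository.full_name`` fields are part of GitHub's push event payload.

```python
from typing import Optional

# Hypothetical mapping from a GitHub repository to the site it feeds.
BUILD_TARGETS = {
    "python/cpython": "https://docs.python.org/",
    "python/devguide": "https://docs.python.org/devguide",
    "python/peps": "https://www.python.org/dev/peps/",
}


def site_to_rebuild(payload: dict, branch: str = "master") -> Optional[str]:
    """Return the site to rebuild for a push event, or None if nothing applies."""
    # Only pushes to the tracked branch trigger a rebuild.
    if payload.get("ref") != "refs/heads/" + branch:
        return None
    repo = payload.get("repository", {}).get("full_name")
    return BUILD_TARGETS.get(repo)
```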
Link web content back to files that it is generated from
''''''''''''''''''''''''''''''''''''''''''''''''''''''''
It would be helpful for people who find issues with any of the
documentation that is generated from a file to have a link on each
page which points back to the file on GitHub [#github]_ that stores
the content of the page. That would allow for quick pull requests to
fix simple things such as spelling mistakes.
Splitting out parts of the documentation into their own repositories
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
While certain parts of the documentation at https://docs.python.org
change with the code, other parts are fairly static and are not
tightly bound to the CPython code itself. The following sections of
the documentation fit this category of slow-changing,
loosely-coupled:
* `Tutorial <https://docs.python.org/3/tutorial/index.html>`__
* `Python Setup and Usage <https://docs.python.org/3/using/index.html>`__
* `HOWTOs <https://docs.python.org/3/howto/index.html>`__
* `Installing Python Modules <https://docs.python.org/3/installing/index.html>`__
* `Distributing Python Modules <https://docs.python.org/3/distributing/index.html>`__
* `Extending and Embedding <https://docs.python.org/3/extending/index.html>`__
* `FAQs <https://docs.python.org/3/faq/index.html>`__
These parts of the documentation could be broken out into their own
repositories to simplify their maintenance and to expand who has
commit rights to them.
It has also been suggested to split out the
`What's New <https://docs.python.org/3/whatsnew/index.html>`__
documents. That would require deciding whether a workflow could be
developed where it would be difficult to forget to update
What's New (potentially through a label added to PRs, like
"What's New needed").
Backup of Git repositories
''''''''''''''''''''''''''
While not necessary, it would be good to have official backups of the
various Git repositories for disaster protection. It will be up to
the PSF infrastructure committee to decide if this is worthwhile or
unnecessary.
Status
======
Requirements for migrating the devinabox [#devinabox-repo]_ and
benchmarks [#benchmarks-repo]_ repositories:
* Not started
- `Create a 'python-dev' team`_
- `Define commands to move a Mercurial repository to Git`_
- `Adding GitHub username support to bugs.python.org`_
* In progress
- `A bot to enforce CLA signing`_:
https://github.com/brettcannon/knights-who-say-ni (Brett Cannon)
* Completed
- None
Repositories whose build steps need updating:
* Not started
- peps [#peps-repo]_
- devguide [#devguide-repo]_
* In progress
- None
* Completed
- None
Requirements to move over the cpython repo [#cpython-repo]_:
* Not started
- `Document steps to commit a pull request`_
- `Handling Misc/NEWS`_
- `Handling Misc/ACKS`_
- `Linking a pull request to an issue`_
- `Notify the issue if the pull request is committed`_
- `Update linking service for mapping commit IDs to URLs`_
- `Create https://git.python.org`_
- `Backup of pull request data`_
- `Deprecate sys._mercurial`_
- `Update the devguide`_
- `Update PEP 101`_
* In progress
- None
* Completed
- None
Optional features:
* Not started
- `Bot to handle pull request merging`_
- `Continuous integration per pull request`_
- `Test coverage report`_
- `Notifying issues of pull request comments`_
- `Allow bugs.python.org to use GitHub as a login provider`_
- `Web hooks for re-generating web content`_
- `Link web content back to files that it is generated from`_
- `Splitting out parts of the documentation into their own repositories`_
- `Backup of Git repositories`_
* In progress
- None
* Completed
- None
Open Issues
===========
For this PEP, open issues are ones where a decision needs to be made
on how to approach or solve a problem. Open issues do not entail
coordination issues such as who is going to write a certain bit of
code.
The fate of hg.python.org
-------------------------
With the code repositories moving over to Git [#git]_, there is no
technical need to keep hg.python.org [#h.p.o]_ running. Having said
that, some in the community would like to have it stay functioning as
a Mercurial [#hg]_ mirror of the Git repositories. Others have said
that they still want a mirror, but one using Git.
As maintaining hg.python.org is not necessary, it will be up to the
PSF infrastructure committee to decide if they want to spend the
time and resources to keep it running. They may also choose whether
they want to host a Git mirror on PSF infrastructure.
Depending on the decision reached, other ancillary repositories will
either be forced to migrate or they can choose to simply stay on
hg.python.org.
Tools and commands to move from Mercurial to Git
------------------------------------------------
A decision needs to be made on exactly what tooling and what commands
involving those tools will be used to convert a Mercurial repository
to Git. Currently a suggestion has been made to use
https://github.com/frej/fast-export. Another suggestion is to use
https://github.com/felipec/git-remote-hg. Finally,
http://hg-git.github.io/ has been suggested.
Git CLI commands for committing a pull request to cpython
---------------------------------------------------------
Because Git [#git]_ may be a new version control system for core
developers, the commands people are expected to run will need to be
written down. These commands also need to keep a linear history while
giving proper attribution to the pull request author.
Another set of commands will also be necessary for when working with
a patch file uploaded to bugs.python.org [#b.p.o]_. Here the linear
history will be kept implicitly, but it will need to make sure to
keep/add attribution.
How to handle the Misc/NEWS file
--------------------------------
There are three competing approaches to handling
``Misc/NEWS`` [#news-file]_. One is to add a news entry for issues on
bugs.python.org [#b.p.o]_. This would mean an issue that is marked
as "resolved" could not be closed until a news entry is added in the
"news" field in the issue tracker. The benefit of tying the news
entry to the issue is it makes sure that all changes worthy of a news
entry have an accompanying issue. It also makes classifying a news
entry automatic thanks to the Component field of the issue. The
Versions field of the issue also ties the news entry to which Python
releases were affected. A script would be written to query
bugs.python.org for relevant news entries for a release and to produce
the output needed to be checked into the code repository. This
approach is agnostic to whether a commit was done by CLI or bot.
A competing approach is to use an individual file per news entry,
containing the text for the entry. In this scenario each feature
release would have its own directory for news entries and a separate
file would be created in that directory that was either named after
the issue it closed or a timestamp value (which prevents collisions).
Merges across branches would have no issue as the news entry file
would still be uniquely named and in the directory of the latest
version that contained the fix. A script would collect all news entry
files no matter what directory they reside in and create an
appropriate news file (the release directory can be ignored as the
mere fact that the file exists is enough to represent that the entry
belongs to the release). Classification can either be done by keyword
in the news entry file itself or by using subdirectories representing
each news entry classification in each release directory (or
classification of news entries could be dropped since critical
information is captured by the "What's New" documents which are
organized). The benefit of this approach is that it keeps the changes
with the code that was actually changed. It also ties the message to
being part of the commit which introduced the change. For a commit
made through the CLI, a script will be provided to help generate the
file. In a bot-driven scenario, the merge bot will have a way to
specify a specific news entry and create the file as part of its
flattened commit (while most likely also supporting using the first
line of the commit message if no specific news entry was specified).
Code for this approach has been written previously for the Mercurial
workflow at http://bugs.python.org/issue18967. There are also tools
from the community like https://pypi.python.org/pypi/towncrier and
https://github.com/twisted/newsbuilder .
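The collection step of the file-per-entry approach could look roughly like the sketch below. The directory layout, the ``.rst`` suffix, and the output format are all assumptions made for illustration; none of them has been decided. Following the description above, the release directory is ignored and the parent directory name is used as the classification.

```python
from pathlib import Path


def collect_news(news_root: Path) -> str:
    """Concatenate every news entry file under ``news_root`` into one text.

    Assumed (hypothetical) layout: <news_root>/<release>/<classification>/<entry>,
    e.g. news/3.6/Library/issue-12345.rst.
    """
    by_section = {}
    for entry in sorted(news_root.rglob("*.rst")):
        section = entry.parent.name
        # Collapse internal line breaks so each entry becomes one bullet.
        text = " ".join(entry.read_text(encoding="utf-8").split())
        by_section.setdefault(section, []).append(text)
    lines = []
    for section in sorted(by_section):
        lines.append(section)
        lines.append("-" * len(section))
        lines.extend("- " + text for text in by_section[section])
        lines.append("")
    return "\n".join(lines)
```

Since each entry lives in its own uniquely named file, merging a fix across branches never touches a shared file, which is the property that avoids the ``Misc/NEWS`` conflicts.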
Yet a third option is a merge script to handle the conflicts. This
approach allows for keeping the NEWS file as a single file. It does
run the risk, though, of failure and thus blocking a commit until it
can be manually resolved.
Naming the bots
---------------
As naming things can lead to bikeshedding of epic proportions, Brett
Cannon will choose the final name of the various bots (the name of
the project for the bots themselves can be anything, this is purely
for the name used in giving commands to the bot or the account name).
The names will come from Monty Python, which is only fitting since
Python is named after the comedy troupe. They will most likely come
from 'Monty Python and the Holy Grail' [#holy-grail]_ (which happens
to be how Brett was introduced to Monty Python). Current ideas on the
name include:
"Black Knight" sketch [#black-knight-sketch]_:
* black-knight
* none-shall-pass
* just-a-flesh-wound
"Bridge of Death" sketch [#bridge-of-death-sketch]_:
* bridge-keeper
* man-from-scene-24
* five-questions
* what-is-your-quest
* blue-no-green
* air-speed-velocity
* your-favourite-colour
(and that specific spelling; Monty Python is British, after all)
"Killer rabbit" sketch [#killer-rabbit-sketch]_:
* killer-rabbit
* holy-hand-grenade
* 5-is-right-out
"French Taunter" sketch [#french-taunter-sketch]_:
* elderberries
* kanigget
"Constitutional Peasants" sketch [#constitutional-peasants-sketch]_:
* dennis
* from-the-masses
"Knights Who Say 'Ni'" sketch [#ni-sketch]_:
* shrubbery
* ni
* knights-who-say-ni
From "Monty Python and the Holy Grail" in general:
* brave-sir-robin
Choosing a CI service
---------------------
There are various CI services that provide free support for open
source projects hosted on GitHub [#github]_. Two such examples are
Travis [#travis]_ and Codeship [#codeship]_. Whatever solution is
chosen will need to not time out in the time it takes to execute
Python's test suite. It should optimally provide access to multiple C
compilers for more thorough testing. Network access is also
beneficial.
The current CI service for Python is Pypatcher [#pypatcher]_. A
request can be made in IRC to try a patch from
bugs.python.org [#b.p.o]_. The results can be viewed at
https://ci.centos.org/job/cPython-build-patch/ .
Choosing a test coverage service
--------------------------------
Basic test coverage of Python's standard library can be
generated simply by using coverage.py [#coverage]_. Getting
thorough test coverage is actually quite tricky, with the details
outlined in the devinabox's README [#devinabox-repo]_. It would be
best if a service could be found that would allow for thorough test
coverage, but it might not be feasible.
Free test coverage services include Coveralls [#coveralls]_ and
Codecov [#codecov]_.
Rejected Ideas
==============
Separate Python 2 and Python 3 repositories
-------------------------------------------
It was discussed whether separate repositories for Python 2 and
Python 3 were desired. The thinking was that this would shrink the
overall repository size which benefits people with slow Internet
connections or small bandwidth caps.
In the end it was decided that it was easier logistically to simply
keep all of CPython's history in a single repository.
Commit multi-release changes in bugfix branch first
---------------------------------------------------
As the current development process has changes committed in the
oldest branch first and then merged up to the default branch, the
question came up as to whether this workflow should be perpetuated.
In the end it was decided that committing in the newest branch and
then cherry-picking changes into older branches would work best as
most people will instinctively work off the newest branch and it is a
more common workflow when using Git [#git]_.
Cherry-picking is also more bot-friendly for an in-browser workflow.
In the merge-up scenario, if you were to request a bot to do a merge
and it failed, then you would have to make sure to immediately solve
the merge conflicts if you still allowed the main commit, else you
would need to postpone the entire commit until all merges could be
handled. With a cherry-picking workflow, the main commit could
proceed while postponing the merge-failing cherry-picks. This allows
for possibly distributing the work of managing conflicting merges.
Deriving ``Misc/NEWS`` from the commit logs
-------------------------------------------
As part of the discussion surrounding `Handling Misc/NEWS`_, the
suggestion has come up of deriving the file from the commit logs
itself. In this scenario, the first line of a commit message would be
taken to represent the news entry for the change. Some heuristic to
tie in whether a change warranted a news entry would be used, e.g.,
whether an issue number is listed.
This idea has been rejected due to some core developers preferring to
write a news entry separate from the commit message. The argument is
the first line of a commit message compared to that of a news entry
have different requirements in terms of brevity, what should be said,
etc.
References
==========
.. [#h.p.o] https://hg.python.org
.. [#GitHub] GitHub (https://github.com)
.. [#hg] Mercurial (https://www.mercurial-scm.org/)
.. [#git] Git (https://git-scm.com/)
.. [#b.p.o] https://bugs.python.org
.. [#rietveld] Rietveld (https://github.com/rietveld-codereview/rietveld)
.. [#gitlab] GitLab (https://about.gitlab.com/)
.. [#core-workflow] core-workflow mailing list (
https://mail.python.org/mailman/listinfo/core-workflow)
.. [#guido-keynote] Guido van Rossum's keynote at PyCon US (
https://www.youtube.com/watch?v=G-uKNd5TSBw)
.. [#reasons] Email to core-workflow outlining reasons why GitHub was
selected
(https://mail.python.org/pipermail/core-workflow/2016-January/000345.html
)
.. [#benchmarks-repo] Mercurial repository for the Unified Benchmark Suite
(https://hg.python.org/benchmarks/)
.. [#devinabox-repo] Mercurial repository for devinabox (
https://hg.python.org/devinabox/)
.. [#peps-repo] Mercurial repository of the Python Enhancement Proposals (
https://hg.python.org/peps/)
.. [#devguide-repo] Mercurial repository for the Python Developer's Guide (
https://hg.python.org/devguide/)
.. [#cpython-repo] Mercurial repository for CPython (
https://hg.python.org/cpython/)
.. [#github-python-org] Python organization on GitHub (
https://github.com/python)
.. [#github-org-perms] GitHub repository permission levels
(
https://help.github.com/enterprise/2.4/user/articles/repository-permission-…
)
.. [#cla] Python Software Foundation Contributor Agreement (
https://www.python.org/psf/contrib/contrib-form/)
.. [#news-file] ``Misc/NEWS`` (
https://hg.python.org/cpython/file/default/Misc/NEWS)
.. [#acks-file] ``Misc/ACKS`` (
https://hg.python.org/cpython/file/default/Misc/ACKS)
.. [#devguide-merge-across-branches] Devguide instructions on how to merge
across branches
(
https://docs.python.org/devguide/committing.html#merging-between-different-…
)
.. [#pythocat] Pythocat (https://octodex.github.com/pythocat/)
.. [#tracker-plans] Wiki page for bugs.python.org feature development
(https://wiki.python.org/moin/TrackerDevelopmentPlanning)
.. [#black-knight-sketch] The "Black Knight" sketch from "Monty Python and
the Holy Grail"
(https://www.youtube.com/watch?v=dhRUe-gz690)
.. [#bridge-of-death-sketch] The "Bridge of Death" sketch from "Monty
Python and the Holy Grail"
(https://www.youtube.com/watch?v=cV0tCphFMr8)
.. [#holy-grail] "Monty Python and the Holy Grail" sketches
(https://www.youtube.com/playlist?list=PL-Qryc-SVnnu1MvN3r94Y9atpaRuIoGmp
)
.. [#killer-rabbit-sketch] "Killer rabbit" sketch from "Monty Python and
the Holy Grail"
(
https://www.youtube.com/watch?v=Nvs5pqf-DMA&list=PL-Qryc-SVnnu1MvN3r94Y9atp…
)
.. [#french-taunter-sketch] "French Taunter" from "Monty Python and the
Holy Grail"
(
https://www.youtube.com/watch?v=A8yjNbcKkNY&list=PL-Qryc-SVnnu1MvN3r94Y9atp…
)
.. [#constitutional-peasants-sketch] "Constitutional Peasants" from "Monty
Python and the Holy Grail"
(
https://www.youtube.com/watch?v=JvKIWjnEPNY&list=PL-Qryc-SVnnu1MvN3r94Y9atp…
)
.. [#ni-sketch] "Knights Who Say Ni" from "Monty Python and the Holy Grail"
(
https://www.youtube.com/watch?v=zIV4poUZAQo&list=PL-Qryc-SVnnu1MvN3r94Y9atp…
)
.. [#homu] Homu (http://homu.io/)
.. [#zuul] Zuul (http://docs.openstack.org/infra/zuul/)
.. [#travis] Travis (https://travis-ci.org/)
.. [#codeship] Codeship (https://codeship.com/)
.. [#coverage] coverage.py (https://pypi.python.org/pypi/coverage)
.. [#coveralls] Coveralls (https://coveralls.io/)
.. [#codecov] Codecov (https://codecov.io/)
.. [#pypatcher] Pypatcher (https://github.com/kushaldas/pypatcher)
.. [#heroku] Heroku (https://www.heroku.com/)
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:
I have tried to take into account all of the corrections and suggestions
people have made. If you want to see exactly what has changed since the
last post, look at
https://github.com/brettcannon/github-transition-pep/blob/master/pep-0512.r…
and
its commit history.
In the last round of discussions, the most divisive thing was the switch to
a cherry-picking workflow over the current merge-up one. I tried to flesh
out the explanation for the planned change, but if people still want to
discuss it then someone who wants to see it kept to a merge-up workflow
needs to start a separate thread on that topic.
Otherwise my hope is we can agree that the PEP covers what needs to be done
soon so that we can start prioritizing work and then finding people to work
on various things.
And if you do reply to this email, please trim out what you don't comment
on as the length of this PEP is such that after one reply the emails start
getting held for moderation due to email size.
----------
PEP: 512
Title: Migrating from hg.python.org to GitHub
Version: $Revision$
Last-Modified: $Date$
Author: Brett Cannon <brett(a)python.org>
Status: Active
Type: Process
Content-Type: text/x-rst
Created:
Post-History: 17-Jan-2015, 19-Jan-2015
Abstract
========
This PEP outlines the steps required to migrate Python's development
process from Mercurial [#hg]_ as hosted at
hg.python.org [#h.p.o]_ to Git [#git]_ on GitHub [#GitHub]_. Meeting
the minimum goals of this PEP should allow for the development
process of Python to be as productive as it currently is, and meeting
its extended goals should improve it.
Rationale
=========
In 2014, it became obvious that Python's custom development
process was becoming a hindrance. As an example, for an external
contributor to submit a fix for a bug that eventually was committed,
the basic steps were:
1. Open an issue for the bug at bugs.python.org [#b.p.o]_.
2. Check out the CPython source code from hg.python.org [#h.p.o]_.
3. Make the fix.
4. Upload a patch.
5. Have a core developer review the patch using our fork of the
Rietveld code review tool [#rietveld]_.
6. Download the patch to make sure it still applies cleanly.
7. Run the test suite manually.
8. Commit the change manually.
9. If the change was for a bugfix release, merge into the
in-development branch.
10. Run the test suite manually again.
11. Commit the merge.
12. Push the changes.
This is a very heavy, manual process for core developers. Even in the
simple case, you could only possibly skip the code review step, as you
would still need to build the documentation. This led to patches
languishing on the issue tracker due to core developers not being
able to work through the backlog fast enough to keep up with
submissions. In turn, that led to a side-effect issue of discouraging
outside contribution due to frustration from lack of attention, which
is a dangerous problem for an open source project as it runs counter to
having a viable future for the project. While simply accepting patches
uploaded to bugs.python.org [#b.p.o]_ is potentially simple for an
external contributor, it is as slow and burdensome as it gets for
a core developer to work with.
Hence the decision was made in late 2014 that a move to a new
development process was needed. A request for PEPs
proposing new workflows was made, in the end leading to two:
PEP 481 and PEP 507 proposing GitHub [#github]_ and
GitLab [#gitlab]_, respectively.
The year 2015 was spent off-and-on working on those proposals and
trying to tease out details of what made them different from each
other on the core-workflow mailing list [#core-workflow]_.
PyCon US 2015 also showed that the community was a bit frustrated
with our process due to both cognitive overhead for new contributors
and how long it was taking for core developers to
look at a patch (see the end of Guido van Rossum's
keynote at PyCon US 2015 [#guido-keynote]_ as an example of the
frustration).
On January 1, 2016, the decision was made by Brett Cannon to move the
development process to GitHub. The key reasons for choosing GitHub
were [#reasons]_:
* Maintaining custom infrastructure has been a burden on volunteers
(e.g., a custom fork of Rietveld [#rietveld]_
that is not being maintained is currently being used).
* The custom workflow is very time-consuming for core developers
(not enough automated tooling built to help support it).
* The custom workflow is a hindrance to external contributors
(acts as a barrier of entry due to time required to ramp up on
development process).
* There is no feature differentiating GitLab from GitHub beyond
GitLab being open source.
* Familiarity with GitHub is far higher amongst core developers and
external contributors than with GitLab.
* Our BDFL prefers GitHub (who would be the first person to tell
you that his opinion shouldn't matter, but the person making the
decision felt it was important that the BDFL feel comfortable with
the workflow of his own programming language to encourage his
continued participation).
There's even already an unofficial image to use to represent the
migration to GitHub [#pythocat]_.
The overarching goal of this migration is to improve the development
process to the extent that a core developer can go from external
contribution submission through all the steps leading to committing
said contribution all from within a browser on a tablet with WiFi
using *some* development process (this does not inherently mean
GitHub's default workflow). All of this will be done in such a way
that if an external contributor chooses not to use GitHub then they
will continue to have that option.
Repositories to Migrate
=======================
While hg.python.org [#h.p.o]_ hosts many repositories, there are only
five key repositories that should move:
1. devinabox [#devinabox-repo]_
2. benchmarks [#benchmarks-repo]_
3. peps [#peps-repo]_
4. devguide [#devguide-repo]_
5. cpython [#cpython-repo]_
The devinabox and benchmarks repositories are code-only.
The peps and devguide repositories involve the generation of webpages.
And the cpython repository has special requirements for integration
with bugs.python.org [#b.p.o]_.
Migration Plan
==============
The migration plan is separated into sections based on what is
required to migrate the repositories listed in the
`Repositories to Migrate`_ section. Completion of requirements
outlined in each section should unblock the migration of the related
repositories. The sections are expected to be completed in order, but
not necessarily the requirements within a section.
Requirements for Code-Only Repositories
---------------------------------------
Completion of the requirements in this section will allow the
devinabox and benchmarks repositories to move to
GitHub. While devinabox has a sufficiently descriptive name, the
benchmarks repository does not; therefore, it will be named
"python-benchmark-suite".
Create a 'python-dev' team
''''''''''''''''''''''''''
To manage permissions, a 'python-dev' team will be created as part of
the python organization [#github-python-org]_. Any repository that is
moved will have the 'python-dev' team added to it with write
permissions [#github-org-perms]_. Anyone who previously had rights to
manage SSH keys on hg.python.org will become a team maintainer for the
'python-dev' team.
Define commands to move a Mercurial repository to Git
'''''''''''''''''''''''''''''''''''''''''''''''''''''
Since moving to GitHub also entails moving to Git [#git]_, we must
decide what tools and commands we will run to translate a Mercurial
repository to Git. The exact tools and steps to use are an
open issue; see `Tools and commands to move from Mercurial to Git`_.
CLA enforcement
'''''''''''''''
A key part of any open source project is making sure that its source
code can be properly licensed. This requires making sure all people
making contributions have signed a contributor license agreement
(CLA) [#cla]_. Up until now, CLA signing of
contributed code has been enforced by core developers checking
whether someone had an ``*`` by their username on
bugs.python.org [#b.p.o]_. With this migration, the plan is to start
off with automated checking and enforcement of contributors signing
the CLA.
Adding GitHub username support to bugs.python.org
+++++++++++++++++++++++++++++++++++++++++++++++++
To keep tracking of CLA signing under the direct control of the PSF,
tracking who has signed the PSF CLA will be continued by marking that
fact as part of someone's bugs.python.org user profile. What this
means is that an association will be needed between a person's
bugs.python.org [#b.p.o]_ account and their GitHub account, which
will be done through a new field in a user's profile.
This does implicitly require that contributors will need both a
GitHub [#github]_ and bugs.python.org account in order to sign the
CLA and contribute through GitHub.
A bot to enforce CLA signing
++++++++++++++++++++++++++++
With an association between someone's GitHub account and their
bugs.python.org [#b.p.o]_ account, which has the data as to whether
someone has signed the CLA, a bot can monitor pull requests on
GitHub and denote whether the contributor has signed the CLA.
If the user has signed the CLA, the bot will add a positive label to
the pull request to denote that it has no CLA issues (e.g., a green
label stating, "CLA: ✓"). If the contributor has not signed a CLA,
a negative label will be added and the pull request will be blocked
using GitHub's status API (e.g., a red label stating, "CLA: ✗"). If a
contributor lacks a bugs.python.org account, that will lead to
another label (e.g., "CLA: ✗ (no account)"). Using a label for both
positive and negative cases provides a fallback notification if the
bot happens to fail, preventing potential false-positives or
false-negatives. It also allows for an easy way to trigger the bot
again by simply removing a CLA-related label.
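The labelling rules above can be sketched as a small decision
function. This is only an illustration: the ``bpo_accounts`` mapping
(GitHub username to "has signed the CLA") stands in for whatever lookup
bugs.python.org ends up providing, and the function names are
hypothetical.

```python
CLA_SIGNED = "CLA: ✓"
CLA_NOT_SIGNED = "CLA: ✗"
CLA_NO_ACCOUNT = "CLA: ✗ (no account)"


def cla_label(github_username, bpo_accounts):
    """Pick the label for a pull request author.

    ``bpo_accounts`` is an assumed mapping of GitHub username to a
    bool saying whether that person's bugs.python.org profile is
    marked as having signed the CLA.
    """
    if github_username not in bpo_accounts:
        return CLA_NO_ACCOUNT
    if bpo_accounts[github_username]:
        return CLA_SIGNED
    return CLA_NOT_SIGNED


def needs_check(labels):
    """A pull request with no CLA-related label should be (re)checked.

    This is what lets someone trigger the bot again simply by
    removing the existing CLA label.
    """
    return not any(label.startswith("CLA:") for label in labels)
```

Since a label is applied in every case, a pull request with no CLA
label at all signals that the bot has not run (or has failed), rather
than being mistaken for a pass.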
Requirements for Web-Related Repositories
-----------------------------------------
Due to their use for generating webpages, the
devguide [#devguide-repo]_ and peps [#peps-repo]_ repositories need
their respective processes updated to pull from their new Git
repositories.
The devguide repository might also need to be named
``python-devguide`` to make sure the repository is not ambiguous
when viewed in isolation from the
python organization [#github-python-org]_.
Requirements for the cpython Repository
---------------------------------------
Obviously the most active and important repository currently hosted
at hg.python.org [#h.p.o]_ is the cpython
repository [#cpython-repo]_. Because of its importance and high-
frequency use, it requires more tooling before being moved to GitHub
compared to the other repositories mentioned in this PEP.
Document steps to commit a pull request
'''''''''''''''''''''''''''''''''''''''
During the process of choosing a new development workflow, it was
decided that a linear history is desired. People preferred having a
single commit representing a single change instead of having a set of
unrelated commits lead to a merge commit that represented a single
change. This means that the convenient "Merge" button in GitHub pull
requests is undesirable, as it creates a merge commit along with all
of the contributor's individual commits (this does not affect the
other repositories where the desire for a linear history doesn't
exist).
Luckily, Git [#git]_ does not require GitHub's workflow and so one can
be chosen which gives us a linear history by using Git's CLI. The
expectation is that all pull requests will be fast-forwarded and
rebased before being pushed to the master repository. This should
give proper attribution to the pull request author in the Git
history. This does have the consequence of losing some GitHub
features such as automatic closing of pull requests, link generation,
etc.
A second set of recommended commands will also be written for
committing a contribution from a patch file uploaded to
bugs.python.org [#b.p.o]_. This will obviously help keep the linear
history, but it will need to be made to have attribution to the patch
author.
The exact sequence of commands that will be given as guidelines to
core developers is an open issue:
`Git CLI commands for committing a pull request to cpython`_.
Handling Misc/NEWS
''''''''''''''''''
Traditionally the ``Misc/NEWS`` file [#news-file]_ has been problematic
for changes which spanned Python releases. Often times there will be
merge conflicts when committing a change between e.g., 3.5 and 3.6
only in the ``Misc/NEWS`` file. It's so common, in fact, that the
example instructions in the devguide explicitly mention how to
resolve conflicts in the ``Misc/NEWS`` file
[#devguide-merge-across-branches]_. As part of our tool
modernization, working with the ``Misc/NEWS`` file will be
simplified.
There are currently several competing approaches to solving the
``Misc/NEWS`` problem which are discussed in an open issue:
`How to handle the Misc/NEWS file`_.
Handling Misc/ACKS
''''''''''''''''''
Traditionally the ``Misc/ACKS`` file [#acks-file]_ has been managed
by hand. But thanks to Git supporting an ``author`` value as well as
a ``committer`` value per commit, authorship of a commit can be part
of the history of the code itself.
As such, manual management of ``Misc/ACKS`` will become optional. A
script will be written that will collect all author and committer
names and merge them into ``Misc/ACKS`` with all of the names listed
prior to the move to Git. Running this script will become part of the
release process.
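A minimal sketch of the merging step of such a script follows. It
assumes the author and committer names have already been extracted
from the Git history (for instance with something like
``git log --format='%an%n%cn'``) and that a case-insensitive comparison
of full names is good enough to spot duplicates; both are assumptions
for illustration, as is the name ``merge_acks``.

```python
def merge_acks(existing, commit_names):
    """Merge names found in the commit history into the existing list.

    ``existing`` holds the names already in Misc/ACKS prior to the move
    to Git; ``commit_names`` holds author and committer names taken
    from the Git history. Duplicates are detected case-insensitively
    (an assumption), and the result is sorted by full name, which is a
    simplification of how the real file is ordered.
    """
    seen = {name.casefold() for name in existing}
    merged = list(existing)
    for name in commit_names:
        name = name.strip()
        if name and name.casefold() not in seen:
            seen.add(name.casefold())
            merged.append(name)
    return sorted(merged, key=str.casefold)
```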
Linking pull requests to issues
'''''''''''''''''''''''''''''''
Historically, external contributions were attached to an issue on
bugs.python.org [#b.p.o]_ thanks to the fact that all external
contributions were uploaded as a file. For changes committed by a
core developer who committed a change directly, the specifying of an
issue number in the commit message of the format ``Issue #`` at the
start of the message led to a comment being posted to the issue
linking to the commit.
Linking a pull request to an issue
++++++++++++++++++++++++++++++++++
An association between a pull request and an issue is needed to track
when a fix has been proposed. The association needs to be many-to-one
as it can take multiple pull requests to solve a single issue
(technically it should be a many-to-many association for when a
single fix solves multiple issues, but this is fairly rare and issues
can be merged into one using the ``Superseder`` field on the issue
tracker).
Association between a pull request and an issue will be done based on
detecting the regular expression ``[Ii]ssue #(?P<bpo_id>\d+)``. If
this is specified in either the title or in the body of a message on
a pull request then a connection will be made on
bugs.python.org [#b.p.o]_. A label will also be added to the pull
request to signify that the connection was made successfully. This
could lead to incorrect associations if the wrong issue is referenced
or another issue is merely mentioned, but these are rare occasions.
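To make the matching concrete, here is a short sketch that applies the
regular expression above to a pull request's title and body. The
function name ``linked_issues`` and the choice to return each issue
number once, in order of first appearance, are assumptions for the
example.

```python
import re

# The pattern proposed above; ``bpo_id`` captures the issue number.
ISSUE_RE = re.compile(r"[Ii]ssue #(?P<bpo_id>\d+)")


def linked_issues(title, body):
    """Return the bugs.python.org issue numbers referenced by a PR."""
    found = []
    for text in (title, body):
        for match in ISSUE_RE.finditer(text):
            issue = int(match.group("bpo_id"))
            if issue not in found:
                found.append(issue)
    return found
```

Note that a passing mention such as "see also issue #123" is matched
as well, which is exactly the kind of incorrect association the
paragraph above accepts as rare.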
Notify the issue if the pull request is committed
+++++++++++++++++++++++++++++++++++++++++++++++++
Once a pull request is closed (merged or not), the issue should be
updated to reflect this fact.
Update linking service for mapping commit IDs to URLs
'''''''''''''''''''''''''''''''''''''''''''''''''''''
Currently you can use https://hg.python.org/lookup/ with a revision
ID from either the Subversion or Mercurial copies of the
cpython repo [#cpython-repo]_ to get redirected to the URL for that
revision in the Mercurial repository. The URL rewriter will need to
be updated to redirect to the Git repository and to support the new
revision IDs created for the Git repository.
Create https://git.python.org
'''''''''''''''''''''''''''''
Just as hg.python.org [#h.p.o]_ currently points to the Mercurial
repository for Python, git.python.org should do the equivalent for
the Git repository.
Backup of pull request data
'''''''''''''''''''''''''''
Since GitHub [#github]_ is going to be used for code hosting and code
review, those two things need to be backed up. In the case of code
hosting, the backup is implicit as all non-shallow Git [#git]_ clones
contain the full history of the repository, hence there will be many
backups of the repository.
The code review history does not have the same implicit backup
mechanism as the repository itself. That means a daily backup of code
review history should be done so that it is not lost in case of any
issues with GitHub. It also helps guarantee that a migration from
GitHub to some other code review system is feasible were GitHub to
disappear overnight.
Change sys._mercurial
'''''''''''''''''''''
Once Python is no longer kept in Mercurial, the ``sys._mercurial``
attribute will need to be removed. An equivalent ``sys._git``
attribute will be needed to take its place.
Update the devguide
'''''''''''''''''''
The devguide will need to be updated with details of the new
workflow. Most likely work will take place in a separate branch
until the migration actually occurs.
Update PEP 101
''''''''''''''
The release process will need to be updated as necessary.
Optional, Planned Features
--------------------------
Once the cpython repository [#cpython-repo]_ is migrated, all
repositories will have been moved to GitHub [#github]_ and the
development process should be on equal footing as before. But a key
reason for this migration is to improve the development process,
making it better than it has ever been. This section outlines some
plans on how to improve things.
It should be mentioned that overall feature planning for
bugs.python.org [#b.p.o]_ -- which includes plans independent of this
migration -- is tracked on its own wiki page [#tracker-plans]_.
Bot to handle pull request merging
''''''''''''''''''''''''''''''''''
As stated in the section entitled
"`Document steps to commit a pull request`_", the desire is to
maintain a linear history for cpython. Unfortunately,
GitHub's [#github]_ web-based workflow does not support a linear
history. Because of this, a bot should be written to substitute for
GitHub's in-browser commit abilities.
To start, the bot should accept commands to commit a pull request
against a list of branches. This allows for committing a pull request
that fixes a bug in multiple versions of Python.
More advanced features such as a commit queue can come later. This
would linearly apply accepted pull requests and verify that the
commits did not interfere with each other by running the test suite
and backing out commits if the test run failed. To help facilitate
the speed of testing, all patches committed since the last test run
can be applied and run in a single test run as the optimistic
assumption is that the patches will work in tandem.
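The optimistic batching idea can be sketched as follows. The fallback
used here when the combined run fails (re-testing the patches one at a
time on top of those already accepted) is only one possible
back-out strategy and is an assumption of this example, as are the
names ``process_queue`` and ``passes``.

```python
def process_queue(patches, passes):
    """Apply queued patches, testing them together first.

    ``passes`` is a callable (hypothetical stand-in for a full test
    run) that takes the list of applied patches and returns True if
    the test suite succeeds with exactly those patches applied.
    Returns a tuple of (accepted, backed_out) patches.
    """
    # Optimistic case: everything since the last run works in tandem,
    # so a single test run covers the whole batch.
    if passes(list(patches)):
        return list(patches), []

    # Assumed fallback: re-apply one patch at a time and back out any
    # patch that makes the test run fail.
    accepted, backed_out = [], []
    for patch in patches:
        if passes(accepted + [patch]):
            accepted.append(patch)
        else:
            backed_out.append(patch)
    return accepted, backed_out
```

In the common case this costs one test run per batch; only a failing
batch pays for the additional per-patch runs.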
Inspiration or basis of the bot could be taken from pre-existing bots
such as Homu [#homu]_ or Zuul [#zuul]_.
The name given to this bot in order to give it commands is an open
issue: `Naming the commit bot`_.
Continuous integration per pull request
'''''''''''''''''''''''''''''''''''''''
To help speed up pull request approvals, continuous integration
testing should be used. This helps mitigate the need for a core
developer to download a patch simply to run the test suite against
the patch.
Which free CI service to use is an open issue:
`Choosing a CI service`_.
Test coverage report
''''''''''''''''''''
Getting an up-to-date test coverage report for Python's standard
library would be extremely beneficial as generating such a report can
take quite a while to produce.
There are a couple pre-existing services that provide free test
coverage for open source projects. Which option is best is an open
issue: `Choosing a test coverage service`_.
Notifying issues of pull request comments
'''''''''''''''''''''''''''''''''''''''''
The current development process does not include notifying an issue
on bugs.python.org [#b.p.o]_ when a review comment is left on
Rietveld [#rietveld]_. It would be nice to fix this so that people
can subscribe only to comments at bugs.python.org and not
GitHub [#github]_ and yet still know when something occurs on GitHub
in terms of review comments on relevant pull requests. Current
thinking is to post a comment to bugs.python.org to the relevant
issue when at least one review comment has been made over a certain
period of time (e.g., 15 or 30 minutes). This keeps the email volume
down for those that receive both GitHub and bugs.python.org email
notifications while still making sure that those only following
bugs.python.org know when there might be a review comment to address.
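One way to read the batching idea is sketched below: the first review
comment opens a window, a single comment is posted to bugs.python.org
when the window closes, and any further review comments inside that
window are covered by it. This interpretation, the function name
``notification_times``, and the use of plain seconds for timestamps
are all assumptions for illustration.

```python
def notification_times(comment_times, window=15 * 60):
    """Return when bugs.python.org would be notified.

    ``comment_times`` are the times (in seconds) at which review
    comments were left on GitHub; ``window`` is the batching period.
    Each returned value is the end of a window that contained at
    least one review comment.
    """
    notifications = []
    window_end = None
    for moment in sorted(comment_times):
        # A comment after the current window has closed opens a new one.
        if window_end is None or moment > window_end:
            window_end = moment + window
            notifications.append(window_end)
    return notifications
```

With a 15-minute window, a burst of review comments produces one
bugs.python.org comment instead of one per review comment.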
Allow bugs.python.org to use GitHub as a login provider
'''''''''''''''''''''''''''''''''''''''''''''''''''''''
As of right now, bugs.python.org [#b.p.o]_ allows people to log in
using Google, Launchpad, or OpenID credentials. It would be good to
expand this to GitHub credentials.
Web hooks for re-generating web content
'''''''''''''''''''''''''''''''''''''''
The content at https://docs.python.org/,
https://docs.python.org/devguide, and
https://www.python.org/dev/peps/ are all derived from files kept in
one of the repositories to be moved as part of this migration. As
such, it would be nice to set up appropriate webhooks to trigger
rebuilding the appropriate web content when the files they are based
on change instead of having to wait for, e.g., a cronjob to trigger.
Link web content back to files that it is generated from
''''''''''''''''''''''''''''''''''''''''''''''''''''''''
It would be helpful for people who find issues with any of the
documentation that is generated from a file to have a link on each
page which points back to the file on GitHub [#github]_ that stores
the content of the page. That would allow for quick pull requests to
fix simple things such as spelling mistakes.
Splitting out parts of the documentation into their own repositories
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
While certain parts of the documentation at https://docs.python.org
change with the code, other parts are fairly static and are not
tightly bound to the CPython code itself. The following sections of
the documentation fit this category of slow-changing,
loosely-coupled:
* `Tutorial <https://docs.python.org/3/tutorial/index.html>`__
* `Python Setup and Usage <https://docs.python.org/3/using/index.html>`__
* `HOWTOs <https://docs.python.org/3/howto/index.html>`__
* `Installing Python Modules <
https://docs.python.org/3/installing/index.html>`__
* `Distributing Python Modules <
https://docs.python.org/3/distributing/index.html>`__
* `Extending and Embedding <https://docs.python.org/3/extending/index.html
>`__
* `FAQs <https://docs.python.org/3/faq/index.html>`__
These parts of the documentation could be broken out into their own
repositories to simplify their maintenance and to expand who has
commit rights to them.
It has also been suggested to split out the
`What's New <https://docs.python.org/3/whatsnew/index.html>`__
documents. That would require deciding whether a workflow could be
developed where it would be difficult to forget to update
What's New (potentially through a label added to PRs, like
"What's New needed").
Backup of Git repositories
''''''''''''''''''''''''''
While not necessary, it would be good to have official backups of the
various Git repositories for disaster protection. It will be up to
the PSF infrastructure committee to decide if this is worthwhile or
unnecessary.
Status
======
Requirements for migrating the devinabox [#devinabox-repo]_ and
benchmarks [#benchmarks-repo]_ repositories:
* Not started
- `Create a 'python-dev' team`_
- `Define commands to move a Mercurial repository to Git`_
- `Adding GitHub username support to bugs.python.org`_
- `A bot to enforce CLA signing`_
* In progress
- None
* Completed
- None
Repositories whose build steps need updating:
* Not started
- peps [#peps-repo]_
- devguide [#devguide-repo]_
* In progress
- None
* Completed
- None
Requirements to move over the cpython repo [#cpython-repo]_:
* Not started
- `Document steps to commit a pull request`_
- `Handling Misc/NEWS`_
- `Handling Misc/ACKS`_
- `Linking a pull request to an issue`_
- `Notify the issue if the pull request is committed`_
- `Update linking service for mapping commit IDs to URLs`_
- `Create https://git.python.org`_
- `Backup of pull request data`_
- `Change sys._mercurial`_
- `Update the devguide`_
- `Update PEP 101`_
* In progress
- None
* Completed
- None
Optional features:
* Not started
- `Bot to handle pull request merging`_
- `Continuous integration per pull request`_
- `Test coverage report`_
- `Notifying issues of pull request comments`_
- `Allow bugs.python.org to use GitHub as a login provider`_
- `Web hooks for re-generating web content`_
- `Link web content back to files that it is generated from`_
- `Splitting out parts of the documentation into their own repositories`_
- `Backup of Git repositories`_
* In progress
- None
* Completed
- None
Open Issues
===========
For this PEP, open issues are ones where a decision needs to be made
on how to approach or solve a problem. Open issues do not entail
coordination issues such as who is going to write a certain bit of
code.
The fate of hg.python.org
-------------------------
With the code repositories moving over to Git [#git]_, there is no
technical need to keep hg.python.org [#h.p.o]_ running. Having said
that, some in the community would like to have it stay functioning as
a Mercurial [#hg]_ mirror of the Git repositories. Others have said
that they still want a mirror, but one using Git.
As maintaining hg.python.org is not necessary, it will be up to the
PSF infrastructure committee to decide if they want to spend the
time and resources to keep it running. They may also choose whether
they want to host a Git mirror on PSF infrastructure.
Depending on the decision reached, other ancillary repositories will
either be forced to migrate or they can choose to simply stay on
hg.python.org.
Tools and commands to move from Mercurial to Git
------------------------------------------------
A decision needs to be made on exactly what tooling and what commands
involving those tools will be used to convert a Mercurial repository
to Git. Currently a suggestion has been made to use
https://github.com/frej/fast-export. Another suggestion is to use
https://github.com/felipec/git-remote-hg. Finally,
http://hg-git.github.io/ has been suggested.
Git CLI commands for committing a pull request to cpython
---------------------------------------------------------
Because Git [#git]_ may be a new version control system for core
developers, the commands people are expected to run will need to be
written down. These commands also need to keep a linear history while
giving proper attribution to the pull request author.
Another set of commands will also be necessary when working with
a patch file uploaded to bugs.python.org [#b.p.o]_. Here the linear
history will be kept implicitly, but the commands will need to make
sure to keep/add attribution.
How to handle the Misc/NEWS file
--------------------------------
There are three competing approaches to handling
``Misc/NEWS`` [#news-file]_. One is to add a news entry for issues on
bugs.python.org [#b.p.o]_. This would mean an issue that is marked
as "resolved" could not be closed until a news entry is added in the
"news" field in the issue tracker. The benefit of tying the news
entry to the issue is it makes sure that all changes worthy of a news
entry have an accompanying issue. It also makes classifying a news
entry automatic thanks to the Component field of the issue. The
Versions field of the issue also ties the news entry to which Python
releases were affected. A script would be written to query
bugs.python.org for relevant news entries for a release and to produce
the output needed to be checked into the code repository. This
approach is agnostic to whether a commit was done by CLI or bot.
A competing approach is to use an individual file per news entry,
containing the text for the entry. In this scenario each feature
release would have its own directory for news entries and a separate
file would be created in that directory that was either named after
the issue it closed or a timestamp value (which prevents collisions).
Merges across branches would have no issue as the news entry file
would still be uniquely named and in the directory of the latest
version that contained the fix. A script would collect all news entry
files no matter what directory they reside in and create an
appropriate news file (the release directory can be ignored as the
mere fact that the file exists is enough to represent that the entry
belongs to the release). Classification can either be done by keyword
in the news entry file itself or by using subdirectories representing
each news entry classification in each release directory (or
classification of news entries could be dropped since critical
information is captured by the "What's New" documents which are
organized). The benefit of this approach is that it keeps the changes
with the code that was actually changed. It also ties the message to
being part of the commit which introduced the change. For a commit
made through the CLI, a script will be provided to help generate the
file. In a bot-driven scenario, the merge bot will have a way to
specify a specific news entry and create the file as part of its
flattened commit (while most likely also supporting using the first
line of the commit message if no specific news entry was specified).
Code for this approach has been written previously for the Mercurial
workflow at http://bugs.python.org/issue18967. There are also tools
from the community like https://pypi.python.org/pypi/towncrier and
https://github.com/twisted/newsbuilder .
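As a minimal sketch of the collection script described for the
file-per-entry approach, the following assumes a hypothetical layout of
``<root>/<release>/<classification>/<entry>.rst``. The function names,
the ``.rst`` suffix, and the directory layout are illustrative
assumptions, not something decided by this PEP.

```python
from pathlib import Path


def collect_news_entries(root):
    """Gather every news entry file under ``root``, grouped by classification.

    Assumed (hypothetical) layout: root/<release>/<classification>/<entry>.rst
    The release directory is ignored, since the mere existence of the
    file is enough to show the entry belongs to the release being built.
    """
    entries = {}
    for path in sorted(Path(root).rglob("*.rst")):
        classification = path.parent.name
        text = path.read_text(encoding="utf-8").strip()
        entries.setdefault(classification, []).append(text)
    return entries


def render_news(entries):
    """Render the grouped entries as a single news document."""
    lines = []
    for classification in sorted(entries):
        lines.append(classification)
        lines.append("-" * len(classification))
        lines.extend(f"- {text}" for text in entries[classification])
        lines.append("")
    return "\n".join(lines)
```

Because every entry lives in its own uniquely named file, merging
across branches never touches a shared file; only this generation step
produces the combined output.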
Yet a third option is a merge script to handle the conflicts. This
approach allows for keeping the NEWS file as a single file. It does
run the risk, though, of failure and thus blocking a commit until it
can be manually resolved.
Naming the commit bot
---------------------
As naming things can lead to bikeshedding of epic proportions, Brett
Cannon will choose the final name of the commit bot (the name of the
project for the bot itself can be anything, this is purely for the
name used in giving commands to the bot). The name will come from
Monty Python, which is only fitting since Python is named after the
comedy troupe. It will most likely come from
'Monty Python and the Holy Grail' [#holy-grail]_ (which happens to be
how Brett was introduced to Monty Python). Current ideas on the name
include:
"Black Knight" sketch [#black-knight-sketch]_:
* black-knight
* none-shall-pass
* just-a-flesh-wound
"Bridge of Death" sketch [#bridge-of-death-sketch]_:
* bridge-keeper
* man-from-scene-24
* five-questions
* what-is-your-quest
* blue-no-green
* air-speed-velocity
* your-favourite-colour
(and that specific spelling; Monty Python is British, after all)
"Killer rabbit" sketch [#killer-rabbit-sketch]_:
* killer-rabbit
* holy-hand-grenade
* 5-is-right-out
"Witch Village" sketch [#witch-village-sketch]_:
* made-of-wood
* burn-her
"French Taunter" sketch [#french-taunter-sketch]_:
* elderberries
* kanigget
"Constitutional Peasants" sketch [#constitutional-peasants-sketch]_:
* dennis
* from-the-masses
"Knights Who Say Ni" sketch [#ni-sketch]_:
* shrubbery
* ni
From "Monty Python and the Holy Grail" in general:
* brave-sir-robin
Choosing a CI service
---------------------
There are various CI services that provide free support for open
source projects hosted on GitHub [#github]_. Two such examples are
Travis [#travis]_ and Codeship [#codeship]_. Whatever solution is
chosen will need to not time out in the time it takes to execute
Python's test suite. It should optimally provide access to multiple C
compilers for more thorough testing. Network access is also
beneficial.
The current CI service for Python is Pypatcher [#pypatcher]_. A
request can be made in IRC to try a patch from
bugs.python.org [#b.p.o]_. The results can be viewed at
https://ci.centos.org/job/cPython-build-patch/ .
Choosing a test coverage service
--------------------------------
Basic test coverage of Python's standard library can be
measured simply by using coverage.py [#coverage]_. Getting
thorough test coverage is actually quite tricky, with the details
outlined in the devinabox's README [#devinabox-repo]_. It would be
best if a service could be found that would allow for thorough test
coverage, but it might not be feasible.
Free test coverage services include Coveralls [#coveralls]_ and
Codecov [#codecov]_.
Rejected Ideas
==============
Separate Python 2 and Python 3 repositories
-------------------------------------------
It was discussed whether separate repositories for Python 2 and
Python 3 were desired. The thinking was that this would shrink the
overall repository size which benefits people with slow Internet
connections or small bandwidth caps.
In the end it was decided that it was easier logistically to simply
keep all of CPython's history in a single repository.
Commit multi-release changes in bugfix branch first
---------------------------------------------------
As the current development process has changes committed in the
oldest branch first and then merged up to the default branch, the
question came up as to whether this workflow should be perpetuated.
In the end it was decided that committing in the newest branch and
then cherry-picking changes into older branches would work best as
most people will instinctively work off the newest branch and it is a
more common workflow when using Git [#git]_.
Cherry-picking is also more bot-friendly for an in-browser workflow.
In the merge-up scenario, if you were to request a bot to do a merge
and it failed, then you would have to make sure to immediately solve
the merge conflicts if you still allowed the main commit, else you
would need to postpone the entire commit until all merges could be
handled. With a cherry-picking workflow, the main commit could
proceed while postponing the merge-failing cherry-picks. This allows
for possibly distributing the work of managing conflicting merges.
Deriving ``Misc/NEWS`` from the commit logs
-------------------------------------------
As part of the discussion surrounding `Handling Misc/NEWS`_, the
suggestion has come up of deriving the file from the commit logs
themselves. In this scenario, the first line of a commit message would be
taken to represent the news entry for the change. Some heuristic to
tie in whether a change warranted a news entry would be used, e.g.,
whether an issue number is listed.
This idea has been rejected due to some core developers preferring to
write a news entry separate from the commit message. The argument is
that the first line of a commit message and a news entry
have different requirements in terms of brevity, what should be said,
etc.
References
==========
.. [#h.p.o] https://hg.python.org
.. [#GitHub] GitHub (https://github.com)
.. [#hg] Mercurial (https://www.mercurial-scm.org/)
.. [#git] Git (https://git-scm.com/)
.. [#b.p.o] https://bugs.python.org
.. [#rietveld] Rietveld (https://github.com/rietveld-codereview/rietveld)
.. [#gitlab] GitLab (https://about.gitlab.com/)
.. [#core-workflow] core-workflow mailing list (
https://mail.python.org/mailman/listinfo/core-workflow)
.. [#guido-keynote] Guido van Rossum's keynote at PyCon US (
https://www.youtube.com/watch?v=G-uKNd5TSBw)
.. [#reasons] Email to core-workflow outlining reasons why GitHub was
selected
(https://mail.python.org/pipermail/core-workflow/2016-January/000345.html
)
.. [#benchmarks-repo] Mercurial repository for the Unified Benchmark Suite
(https://hg.python.org/benchmarks/)
.. [#devinabox-repo] Mercurial repository for devinabox (
https://hg.python.org/devinabox/)
.. [#peps-repo] Mercurial repository of the Python Enhancement Proposals (
https://hg.python.org/peps/)
.. [#devguide-repo] Mercurial repository for the Python Developer's Guide (
https://hg.python.org/devguide/)
.. [#cpython-repo] Mercurial repository for CPython (
https://hg.python.org/cpython/)
.. [#github-python-org] Python organization on GitHub (
https://github.com/python)
.. [#github-org-perms] GitHub repository permission levels
(
https://help.github.com/enterprise/2.4/user/articles/repository-permission-…
)
.. [#cla] Python Software Foundation Contributor Agreement (
https://www.python.org/psf/contrib/contrib-form/)
.. [#news-file] ``Misc/NEWS`` (
https://hg.python.org/cpython/file/default/Misc/NEWS)
.. [#acks-file] ``Misc/ACKS`` (
https://hg.python.org/cpython/file/default/Misc/ACKS)
.. [#devguide-merge-across-branches] Devguide instructions on how to merge
across branches
(
https://docs.python.org/devguide/committing.html#merging-between-different-…
)
.. [#pythocat] Pythocat (https://octodex.github.com/pythocat/)
.. [#tracker-plans] Wiki page for bugs.python.org feature development
(https://wiki.python.org/moin/TrackerDevelopmentPlanning)
.. [#black-knight-sketch] The "Black Knight" sketch from "Monty Python and
the Holy Grail"
(https://www.youtube.com/watch?v=dhRUe-gz690)
.. [#bridge-of-death-sketch] The "Bridge of Death" sketch from "Monty
Python and the Holy Grail"
(https://www.youtube.com/watch?v=cV0tCphFMr8)
.. [#holy-grail] "Monty Python and the Holy Grail" sketches
(https://www.youtube.com/playlist?list=PL-Qryc-SVnnu1MvN3r94Y9atpaRuIoGmp
)
.. [#killer-rabbit-sketch] "Killer rabbit" sketch from "Monty Python and
the Holy Grail"
(
https://www.youtube.com/watch?v=Nvs5pqf-DMA&list=PL-Qryc-SVnnu1MvN3r94Y9atp…
)
.. [#witch-village-sketch] "Witch Village" from "Monty Python and the Holy
Grail"
(
https://www.youtube.com/watch?v=k3jt5ibfRzw&list=PL-Qryc-SVnnu1MvN3r94Y9atp…
)
.. [#french-taunter-sketch] "French Taunter" from "Monty Python and the
Holy Grail"
(
https://www.youtube.com/watch?v=A8yjNbcKkNY&list=PL-Qryc-SVnnu1MvN3r94Y9atp…
)
.. [#constitutional-peasants-sketch] "Constitutional Peasants" from "Monty
Python and the Holy Grail"
(
https://www.youtube.com/watch?v=JvKIWjnEPNY&list=PL-Qryc-SVnnu1MvN3r94Y9atp…
)
.. [#ni-sketch] "Knights Who Say Ni" from "Monty Python and the Holy Grail"
(
https://www.youtube.com/watch?v=zIV4poUZAQo&list=PL-Qryc-SVnnu1MvN3r94Y9atp…
)
.. [#homu] Homu (http://homu.io/)
.. [#zuul] Zuul (http://docs.openstack.org/infra/zuul/)
.. [#travis] Travis (https://travis-ci.org/)
.. [#codeship] Codeship (https://codeship.com/)
.. [#coverage] coverage.py (https://pypi.python.org/pypi/coverage)
.. [#coveralls] Coveralls (https://coveralls.io/)
.. [#codecov] Codecov (https://codecov.io/)
.. [#pypatcher] Pypatcher (https://github.com/kushaldas/pypatcher)
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:
Quick background: A little while ago I added TravisCI integration to the
Jython GitHub mirror. Most recently one of the Jython contributors asked
whether we can integrate the xunit XML output of Jython's regrtest with
TravisCI. The answer was no, which led me to researching other options.
*Why not TravisCI?*
TravisCI does not support build artifacts. As simple as that, there's no
builtin support for them. After researching for a while, I found they
promised to add it but have no public timeline. This is quite limiting in my view as
it would be nice to have more than just a binary pass/fail and manually dig
through regrtest output.
One of the positives for TravisCI is Mac OS X support.
https://travis-ci.org/jythontools/jython
*Codeship*
I have not actually used it, but from reading documentation it feels more
like a deployment service and does not have builtin support for xunit.
*Shippable*
Configuration format is super similar to TravisCI, which made it easy to
try. Has support for both code coverage and xunit test results. Simple UI.
One big negative for me with it is that it is slow, and by slow I mean over
1h versus 25min for TravisCI. Presumably the paid version is faster.
https://app.shippable.com/builds/569eebaed3a5e70d00b309c5
*CircleCI* - My preferred
Very fast, free tier gives 4 parallel builds, supports xunit,
supports Mac OS X (have not tried), and supports debugging via SSH (which I
think is a cool feature for those cases when it works on your machine and
not on the build server and there are no clues as to why). Supports
exporting coverage to several coverage services.
On the negatives, different configuration format to TravisCI, UI could be
simpler/easier.
https://circleci.com/gh/darjus/jython/3#tests
Cheers,
Darjus
Here is the first complete draft of the PEP laying out what needs to happen
to migrate to GitHub (along with some extras that would be nice). The PEP
can also be viewed at
https://github.com/brettcannon/github-transition-pep/blob/master/pep-0512.r…
or
at https://www.python.org/dev/peps/ once the PEP index is updated.
My hope is that I captured all of the requirements for the various
repositories. If we can agree that is the case then we can start work on
the various requirements and start the migration work!
----------
PEP: 512
Title: Migrating from hg.python.org to GitHub
Version: $Revision$
Last-Modified: $Date$
Author: Brett Cannon <brett(a)python.org>
Status: Active
Type: Process
Content-Type: text/x-rst
Created:
Post-History: 17-Jan-2015
Abstract
========
This PEP outlines the steps required to migrate Python's development
process from Mercurial [#hg]_ as hosted at
hg.python.org [#h.p.o]_ to Git [#git]_ on GitHub [#GitHub]_. Meeting
the minimum goals of this PEP should allow for the development
process of Python to be as productive as it currently is, and meeting
its extended goals should improve it.
Rationale
=========
In 2014, it became obvious that Python's custom development
process was becoming a hindrance. As an example, for an external
contributor to submit a fix for a bug that eventually was committed,
the basic steps were:
1. Open an issue for the bug at bugs.python.org [#b.p.o]_.
2. Check out the CPython source code from hg.python.org [#h.p.o]_.
3. Make the fix.
4. Upload a patch.
5. Have a core developer review the patch using our fork of the
Rietveld code review tool [#rietveld]_.
6. Download the patch to make sure it still applies cleanly.
7. Run the test suite manually.
8. Commit the change manually.
9. If the change was for a bugfix release, merge into the
in-development branch.
10. Run the test suite manually again.
11. Commit the merge.
12. Push the changes.
This is a very heavy, manual process for core developers. Even in the
simple case, you could only possibly skip the code review step, as you
would still need to build the documentation. This led to patches
languishing on the issue tracker due to core developers not being
able to work through the backlog fast enough to keep up with
submissions. In turn, that led to a side-effect issue of discouraging
outside contribution due to frustration from lack of attention, which
is a dangerous problem for an open source project as it runs counter to
having a viable future for the project. While simply accepting patches
uploaded to bugs.python.org [#b.p.o]_ is potentially simple for an
external contributor, it is as slow and burdensome as it gets for
a core developer to work with.
Hence the decision was made in late 2014 that a move to a new
development process was needed. A request for PEPs
proposing new workflows was made, in the end leading to two:
PEP 481 and PEP 507 proposing GitHub [#github]_ and
GitLab [#gitlab]_, respectively.
The year 2015 was spent off-and-on working on those proposals and
trying to tease out details of what made them different from each
other on the core-workflow mailing list [#core-workflow]_.
PyCon US 2015 also showed that the community was a bit frustrated
with our process due to both cognitive overhead for new contributors
and how long it was taking for core developers to
look at a patch (see the end of Guido van Rossum's
keynote at PyCon US 2015 [#guido-keynote]_ as an example of the
frustration).
On January 1, 2016, the decision was made by Brett Cannon to move the
development process to GitHub. The key reasons for choosing GitHub
were [#reasons]_:
* Maintaining custom infrastructure has been a burden on volunteers
(e.g., a custom fork of Rietveld [#rietveld]_
that is not being maintained is currently being used).
* The custom workflow is very time-consuming for core developers
(not enough automated tooling built to help support it).
* The custom workflow is a hindrance to external contributors
(acts as a barrier of entry due to time required to ramp up on
development process).
* There is no feature differentiating GitLab from GitHub beyond
GitLab being open source.
* Familiarity with GitHub is far higher amongst core developers and
external contributors than with GitLab.
* Our BDFL prefers GitHub (who would be the first person to tell
you that his opinion shouldn't matter, but the person making the
decision felt it was important that the BDFL feel comfortable with
the workflow of his own programming language to encourage his
continued participation).
There's even already an unofficial image to use to represent the
migration to GitHub [#pythocat]_.
The overarching goal of this migration is to improve the development
process to the extent that a core developer can go from external
contribution submission through all the steps leading to committing
said contribution all from within a browser on a tablet with WiFi.
All of this will be done in such a way that if an external
contributor chooses not to use GitHub then they will continue to have
that option.
Repositories to Migrate
=======================
While hg.python.org [#h.p.o]_ hosts many repositories, there are only
six key repositories that must move:
1. devinabox [#devinabox-repo]_
2. benchmarks [#benchmarks-repo]_
3. tracker [#tracker-repo]_
4. peps [#peps-repo]_
5. devguide [#devguide-repo]_
6. cpython [#cpython-repo]_
The devinabox, benchmarks, and tracker repositories are code-only.
The peps and devguide repositories involve the generation of webpages.
And the cpython repository has special requirements for integration
with bugs.python.org [#b.p.o]_.
Migration Plan
==============
The migration plan is separated into sections based on what is
required to migrate the repositories listed in the
`Repositories to Migrate`_ section. Completion of requirements
outlined in each section should unblock the migration of the related
repositories. The sections are expected to be completed in order, but
not necessarily the requirements within a section.
Requirements for Code-Only Repositories
---------------------------------------
Completion of the requirements in this section will allow the
devinabox, benchmarks, and tracker repositories to move to
GitHub. While devinabox has a sufficiently descriptive name, the
benchmarks repository does not; therefore, it will be named
"python-benchmark-suite". The tracker repo also has a misleading name
and will be renamed "bugs.python.org".
Create a 'python-dev' team
''''''''''''''''''''''''''
To manage permissions, a 'python-dev' team will be created as part of
the python organization [#github-python-org]_. Any repository that is
moved will have the 'python-dev' team added to it with write
permissions [#github-org-perms]_. Anyone who previously had rights to
manage SSH keys on hg.python.org will become a team maintainer for the
'python-dev' team.
Define commands to move a Mercurial repository to Git
'''''''''''''''''''''''''''''''''''''''''''''''''''''
Since moving to GitHub also entails moving to Git [#git]_, we must
decide what tools and commands we will run to translate a Mercurial
repository to Git. The exact tools and steps to use are an
open issue; see `Tools and commands to move from Mercurial to Git`_.
CLA enforcement
'''''''''''''''
A key part of any open source project is making sure that its source
code can be properly licensed. This requires making sure all people
making contributions have signed a contributor license agreement
(CLA) [#cla]_. Up until now, CLA signing of
contributed code has been enforced by core developers checking
whether someone had an ``*`` by their username on
bugs.python.org [#b.p.o]_. With this migration, the plan is to start
off with automated checking and enforcement of contributors signing
the CLA.
Adding GitHub username support to bugs.python.org
+++++++++++++++++++++++++++++++++++++++++++++++++
To keep tracking of CLA signing under the direct control of the PSF,
tracking who has signed the PSF CLA will be continued by marking that
fact as part of someone's bugs.python.org user profile. What this
means is that an association will be needed between a person's
bugs.python.org [#b.p.o]_ account and their GitHub account, which
will be done through a new field in a user's profile.
A bot to enforce CLA signing
++++++++++++++++++++++++++++
With an association between someone's GitHub account and their
bugs.python.org [#b.p.o]_ account, which has the data as to whether
someone has signed the CLA, a bot can monitor pull requests on
GitHub and denote whether the contributor has signed the CLA.
If the user has signed the CLA, the bot will add a positive label to
the pull request to denote that it has no CLA issues (e.g., a green
label stating, "CLA: ✓"). If the contributor has not signed a CLA,
a negative label will be added to the pull request and the pull
request will be blocked using GitHub's status API (e.g., a red label
stating, "CLA: ✗"). If a
contributor lacks a bugs.python.org account, that will lead to
another label (e.g., "CLA: ✗ (no account)"). Using a label for both
positive and negative cases provides a fallback notification if the
bot happens to fail, preventing potential false-positives or
false-negatives. It also allows for an easy way to trigger the bot
again by simply removing a CLA-related label.
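The labelling rules above can be sketched roughly as follows. This is
only an illustration: the ``accounts`` mapping (GitHub username to
whether the linked bugs.python.org account has a signed CLA) and all
function and class names are hypothetical, and no GitHub or
bugs.python.org API is actually called.

```python
from enum import Enum


class CLAStatus(Enum):
    SIGNED = "signed"
    NOT_SIGNED = "not signed"
    NO_ACCOUNT = "no account"


# Label text follows the examples given in the text above.
CLA_LABELS = {
    CLAStatus.SIGNED: "CLA: ✓",
    CLAStatus.NOT_SIGNED: "CLA: ✗",
    CLAStatus.NO_ACCOUNT: "CLA: ✗ (no account)",
}


def cla_status(github_username, accounts):
    """Look up the CLA status for a GitHub username.

    ``accounts`` is a hypothetical mapping of GitHub username to a bool
    saying whether the associated bugs.python.org account signed the CLA.
    """
    if github_username not in accounts:
        return CLAStatus.NO_ACCOUNT
    if accounts[github_username]:
        return CLAStatus.SIGNED
    return CLAStatus.NOT_SIGNED


def needs_check(pr_labels):
    """A pull request without any CLA-related label should be (re)checked."""
    return not any(label.startswith("CLA:") for label in pr_labels)
```

Treating "no CLA label present" as the trigger is what makes removing a
label a simple way to ask the bot to run again.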
Requirements for Web-Related Repositories
-----------------------------------------
Due to their use for generating webpages, the
devguide [#devguide-repo]_ and peps [#peps-repo]_ repositories need
their respective processes updated to pull from their new Git
repositories.
The devguide repository might also need to be named
``python-devguide`` to make sure the repository is not ambiguous
when viewed in isolation from the
python organization [#github-python-org]_.
Requirements for the cpython Repository
---------------------------------------
Obviously the most active and important repository currently hosted
at hg.python.org [#h.p.o]_ is the cpython
repository [#cpython-repo]_. Because of its importance and
high-frequency use, it requires more tooling before being moved to GitHub
compared to the other repositories mentioned in this PEP.
Document steps to commit a pull request
'''''''''''''''''''''''''''''''''''''''
During the process of choosing a new development workflow, it was
decided that a linear history is desired. This means that the
convenient "Merge" button in GitHub pull requests is undesirable, as
it creates a merge commit along with all of the contributor's
individual commits (this does not affect the other repositories where
the desire for a linear history doesn't exist).
Luckily, Git [#git]_ does not require GitHub's workflow and so one can
be chosen which gives us a linear history by using Git's CLI. The
expectation is that all pull requests will be fast-forwarded and
rebased before being pushed to the master repository. This should
give proper attribution to the pull request author in the Git
history.
A second set of recommended commands will also be written for
committing a contribution from a patch file uploaded to
bugs.python.org [#b.p.o]_. This will obviously help keep the linear
history, but the commands will need to make sure the commit gives
attribution to the patch author.
The exact sequence of commands that will be given as guidelines to
core developers is an open issue:
`Git CLI commands for committing a pull request to cpython`_.
Handling Misc/NEWS
''''''''''''''''''
Traditionally the ``Misc/NEWS`` file [#news-file]_ has been problematic
for changes which spanned Python releases. Oftentimes there will be
merge conflicts when committing a change between, e.g., 3.5 and 3.6
only in the ``Misc/NEWS`` file. It's so common, in fact, that the
example instructions in the devguide explicitly mention how to
resolve conflicts in the ``Misc/NEWS`` file
[#devguide-merge-across-branches]_. As part of our tool
modernization, working with the ``Misc/NEWS`` file will be
simplified.
There are currently two competing approaches to solving the
``Misc/NEWS`` problem which are discussed in an open issue:
`How to handle the Misc/NEWS file`_.
Handling Misc/ACKS
''''''''''''''''''
Traditionally the ``Misc/ACKS`` file [#acks-file]_ has been managed
by hand. But thanks to Git supporting an ``author`` value as well as
a ``committer`` value per commit, authorship of a commit can be part
of the history of the code itself.
As such, manual management of ``Misc/ACKS`` will become optional. A
script will be written that will collect all author and committer
names and merge them into ``Misc/ACKS`` with all of the names listed
prior to the move to Git. Running this script will become part of the
release process.
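A minimal sketch of the merging step of such a script is shown below.
It assumes the names have already been extracted as plain strings (for
instance from ``git log --format='%an%n%cn'``, which prints author and
committer names). The function name and the case-insensitive sort are
illustrative assumptions; the real ``Misc/ACKS`` ordering rules may
differ.

```python
def merge_acks(existing_names, commit_names):
    """Merge author/committer names into the names already in Misc/ACKS.

    Names listed prior to the move to Git are kept, new names from the
    Git history are added, and duplicates are dropped.
    """
    seen = set()
    merged = []
    for name in list(existing_names) + list(commit_names):
        name = name.strip()
        if name and name not in seen:
            seen.add(name)
            merged.append(name)
    # Sorting case-insensitively is an assumption made for illustration.
    return sorted(merged, key=str.casefold)
```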
Linking pull requests to issues
'''''''''''''''''''''''''''''''
Historically, external contributions were attached to an issue on
bugs.python.org [#b.p.o]_ thanks to the fact that all external
contributions were uploaded as a file. For changes that a core
developer committed directly, specifying an
issue number in the commit message in the format ``Issue #`` at the
start of the message led to a comment being posted to the issue
linking to the commit.
Linking a pull request to an issue
++++++++++++++++++++++++++++++++++
An association between a pull request and an issue is needed to track
when a fix has been proposed. The association needs to be many-to-one
as it can take multiple pull requests to solve a single issue
(technically it should be a many-to-many association for when a
single fix solves multiple issues, but this is fairly rare and issues
can be merged into one using the ``Superseder`` field on the issue
tracker).
Association between a pull request and an issue will be done based on
detecting the regular expression ``[Ii]ssue #(?P<bpo_id>\d+)``. If
this is specified in either the title or in the body of a message on
a pull request then a connection will be made on
bugs.python.org [#b.p.o]_. A label will also be added to the pull
request to signify that the connection was made successfully. This
could lead to incorrect associations if the wrong issue is specified
or another issue is merely mentioned, but these are rare occasions.
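To illustrate how the regular expression given above behaves, here is a
small sketch using Python's ``re`` module. The helper name
``find_issue_ids`` is hypothetical; only the pattern itself comes from
this PEP.

```python
import re

# The pattern as specified in the text above.
ISSUE_RE = re.compile(r"[Ii]ssue #(?P<bpo_id>\d+)")


def find_issue_ids(*texts):
    """Return issue numbers referenced in the given texts.

    Numbers are returned in order of first appearance, without duplicates.
    """
    found = []
    for text in texts:
        for match in ISSUE_RE.finditer(text):
            bpo_id = int(match.group("bpo_id"))
            if bpo_id not in found:
                found.append(bpo_id)
    return found
```

Passing the pull request title and each message body to such a helper
would yield every candidate issue, which also shows how a passing
mention of an unrelated issue could create an incorrect association.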
Notify the issue if the pull request is committed
+++++++++++++++++++++++++++++++++++++++++++++++++
Once a pull request is closed (merged or not), the issue should be
updated to reflect this fact.
Update linking service for mapping commit IDs to URLs
'''''''''''''''''''''''''''''''''''''''''''''''''''''
Currently you can use https://hg.python.org/lookup/ with a revision
ID from either the Subversion or Mercurial copies of the
cpython repo [#cpython-repo]_ to get redirected to the URL for that
revision in the Mercurial repository. The URL rewriter will need to
be updated to redirect to the Git repository and to support the new
revision IDs created for the Git repository.
Create https://git.python.org
'''''''''''''''''''''''''''''
Just as hg.python.org [#h.p.o]_ currently points to the Mercurial
repository for Python, git.python.org should do the equivalent for
the Git repository.
Backup of pull request data
'''''''''''''''''''''''''''
Since GitHub [#github]_ is going to be used for code hosting and code
review, those two things need to be backed up. In the case of code
hosting, the backup is implicit as all non-shallow Git [#git]_ clones
contain the full history of the repository, hence there will be many
backups of the repository.
The code review history does not have the same implicit backup
mechanism as the repository itself. That means a daily backup of code
review history should be done so that it is not lost in case of any
issues with GitHub. It also helps guarantee that a migration from
GitHub to some other code review system is feasible were GitHub to
disappear overnight.
Change sys._mercurial
'''''''''''''''''''''
Once Python is no longer kept in Mercurial, the ``sys._mercurial``
attribute will need to be removed. An equivalent ``sys._git``
attribute will be needed to take its place.
Optional, Planned Features
--------------------------
Once the cpython repository [#cpython-repo]_ is migrated, all
repositories will have been moved to GitHub [#github]_ and the
development process should be on equal footing as before. But a key
reason for this migration is to improve the development process,
making it better than it has ever been. This section outlines some
plans on how to improve things.
It should be mentioned that overall feature planning for
bugs.python.org [#b.p.o]_ -- which includes plans independent of this
migration -- is tracked on its own wiki page [#tracker-plans]_.
Bot to handle pull request merging
''''''''''''''''''''''''''''''''''
As stated in the section entitled
"`Document steps to commit a pull request`_", the desire is to
maintain a linear history for cpython. Unfortunately,
GitHub's [#github]_ web-based workflow does not support a linear
history. Because of this, a bot should be written to substitute for
GitHub's in-browser commit abilities.
To start, the bot should accept commands to commit a pull request
against a list of branches. This allows for committing a pull request
that fixes a bug in multiple versions of Python.
More advanced features such as a commit queue can come later. This
would linearly apply accepted pull requests and verify that the
commits did not interfere with each other by running the test suite
and backing out commits if the test run failed. To help facilitate
the speed of testing, all patches committed since the last test run
can be applied and run in a single test run as the optimistic
assumption is that the patches will work in tandem.
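The optimistic batching idea can be sketched as follows. This is only an illustration under stated assumptions: ``run_tests`` is a hypothetical callable standing in for a full test-suite run, and the one-at-a-time fallback used to decide which commits to back out is a choice of this sketch rather than something the PEP prescribes.

```python
def process_commit_queue(pending, run_tests):
    """Return (accepted, backed_out) for a list of pending patches.

    pending: patches accepted since the last test run, in commit order.
    run_tests: callable taking a list of applied patches and returning
    True if the test suite passes with exactly those patches applied.
    """
    # Optimistic path: assume the patches work in tandem and test once.
    if run_tests(list(pending)):
        return list(pending), []

    # Fallback (an assumption of this sketch): re-apply the patches one
    # at a time, backing out each one that makes the test run fail.
    accepted, backed_out = [], []
    for patch in pending:
        if run_tests(accepted + [patch]):
            accepted.append(patch)
        else:
            backed_out.append(patch)
    return accepted, backed_out
```

In the common case only a single test run is needed for the whole batch; the slower per-patch runs happen only when the batch fails.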
Inspiration or basis of the bot could be taken from pre-existing bots
such as Homu [#homu]_ or Zuul [#zuul]_.
The name given to this bot in order to give it commands is an open
issue: `Naming the commit bot`_.
Continuous integration per pull request
'''''''''''''''''''''''''''''''''''''''
To help speed up pull request approvals, continuous integration
testing should be used. This helps mitigate the need for a core
developer to download a patch simply to run the test suite against
the patch.
Which free CI service to use is an open issue:
`Choosing a CI service`_.
Test coverage report
''''''''''''''''''''
Getting an up-to-date test coverage report for Python's standard
library would be extremely beneficial as generating such a report can
take quite a while to produce.
There are a couple of pre-existing services that provide free test
coverage for open source projects. Which option is best is an open
issue: `Choosing a test coverage service`_.
Notifying issues of pull request comments
'''''''''''''''''''''''''''''''''''''''''
The current development process does not include notifying an issue
on bugs.python.org [#b.p.o]_ when a review comment is left on
Rietveld [#rietveld]_. It would be nice to fix this so that people
can subscribe only to comments at bugs.python.org and not
GitHub [#github]_ and yet still know when something occurs on GitHub
in terms of review comments on relevant pull requests. Current
thinking is to post a comment to bugs.python.org to the relevant
issue when at least one review comment has been made over a certain
period of time (e.g., 15 or 30 minutes). This keeps the email volume
down for those that receive both GitHub and bugs.python.org email
notifications while still making sure that those only following
bugs.python.org know when there might be a review comment to address.
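One way to read the "at most one notice per time window" idea is sketched below. The function name ``should_notify``, its parameters, and the 30-minute default are assumptions for illustration; the text above leaves the exact period open.

```python
from datetime import datetime, timedelta


def should_notify(comment_times, last_notified, now,
                  window=timedelta(minutes=30)):
    """Decide whether to post one summary comment to the issue.

    comment_times: datetimes of review comments on the pull request.
    last_notified: datetime of the previous notification, or None.
    now: the current time (passed in so the function is easy to test).
    """
    # Post at most one notification per window to keep email volume down.
    if last_notified is not None and now - last_notified < window:
        return False
    # Notify only if a review comment arrived since the last notification.
    return any(last_notified is None or t > last_notified
               for t in comment_times)
```

A periodic job could call this for each open pull request and post a single comment to the linked issue whenever it returns ``True``.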
Allow bugs.python.org to use GitHub as a login provider
'''''''''''''''''''''''''''''''''''''''''''''''''''''''
As of right now, bugs.python.org [#b.p.o]_ allows people to log in
using Google, Launchpad, or OpenID credentials. It would be good to
expand this to GitHub credentials.
Web hooks for re-generating web content
'''''''''''''''''''''''''''''''''''''''
The content at https://docs.python.org/,
https://docs.python.org/devguide, and
https://www.python.org/dev/peps/ is all derived from files kept in
one of the repositories to be moved as part of this migration. As
such, it would be nice to set up appropriate webhooks to trigger
rebuilding the appropriate web content when the files they are based
on change instead of having to wait for, e.g., a cronjob to trigger.
Link web content back to files that it is generated from
''''''''''''''''''''''''''''''''''''''''''''''''''''''''
It would be helpful for people who find issues with any of the
documentation that is generated from a file to have a link on each
page which points back to the file on GitHub [#github]_ that stores
the content of the page. That would allow for quick pull requests to
fix simple things such as spelling mistakes.
Splitting out parts of the documentation into their own repositories
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
While certain parts of the documentation at https://docs.python.org
change with the code, other parts are fairly static and are not
tightly bound to the CPython code itself. The following sections of
the documentation fit this category of slow-changing,
loosely-coupled:
* `Tutorial <https://docs.python.org/3/tutorial/index.html>`__
* `Python Setup and Usage <https://docs.python.org/3/using/index.html>`__
* `HOWTOs <https://docs.python.org/3/howto/index.html>`__
* `Installing Python Modules <https://docs.python.org/3/installing/index.html>`__
* `Distributing Python Modules <https://docs.python.org/3/distributing/index.html>`__
* `Extending and Embedding <https://docs.python.org/3/extending/index.html>`__
* `FAQs <https://docs.python.org/3/faq/index.html>`__
These parts of the documentation could be broken out into their own
repositories, which would simplify their maintenance and allow commit
rights to be granted to a wider group of people.
Status
======
Requirements for migrating the devinabox [#devinabox-repo]_,
benchmarks [#benchmarks-repo]_, and tracker [#tracker-repo]_
repositories:
* Not started
- `Create a 'python-dev' team`_
- `Define commands to move a Mercurial repository to Git`_
- `Adding GitHub username support to bugs.python.org`_
- `A bot to enforce CLA signing`_
* In progress
- None
* Completed
- None
Repositories whose build steps need updating:
* Not started
- peps [#peps-repo]_
- devguide [#devguide-repo]_
* In progress
- None
* Completed
- None
Requirements to move over the cpython repo [#cpython-repo]_:
* Not started
- `Document steps to commit a pull request`_
- `Handling Misc/NEWS`_
- `Handling Misc/ACKS`_
- `Linking a pull request to an issue`_
- `Notify the issue if the pull request is committed`_
- `Update linking service for mapping commit IDs to URLs`_
- `Create https://git.python.org`_
- `Backup of pull request data`_
- `Change sys._mercurial`_
* In progress
- None
* Completed
- None
Optional features:
* Not started
- `Bot to handle pull request merging`_
- `Continuous integration per pull request`_
- `Test coverage report`_
- `Notifying issues of pull request comments`_
- `Allow bugs.python.org to use GitHub as a login provider`_
- `Web hooks for re-generating web content`_
- `Link web content back to files that it is generated from`_
- `Splitting out parts of the documentation into their own repositories`_
* In progress
- None
* Completed
- None
Open Issues
===========
For this PEP, open issues are ones where a decision needs to be made
on how to approach or solve a problem. Open issues do not entail
coordination issues such as who is going to write a certain bit of
code.
The fate of hg.python.org
-------------------------
With the code repositories moving over to Git [#git]_, there is no
technical need to keep hg.python.org [#h.p.o]_ running. Having said
that, some in the community would like to have it stay functioning as
a Mercurial [#hg]_ mirror of the Git repositories. Others have said
that they still want a mirror, but one using Git.
As maintaining hg.python.org is not necessary, it will be up to the
PSF infrastructure committee to decide if they want to spend the
time and resources to keep it running. They may also choose whether
they want to host a Git mirror on PSF infrastructure.
Tools and commands to move from Mercurial to Git
------------------------------------------------
A decision needs to be made on exactly what tooling and what commands
involving those tools will be used to convert a Mercurial repository
to Git. Currently a suggestion has been made to use
https://github.com/frej/fast-export. Another suggestion is to use
https://github.com/felipec/git-remote-hg. Finally,
http://hg-git.github.io/ has been suggested.
Git CLI commands for committing a pull request to cpython
---------------------------------------------------------
Because Git [#git]_ may be a new version control system for core
developers, the commands people are expected to run will need to be
written down. These commands also need to keep a linear history while
giving proper attribution to the pull request author.
Another set of commands will also be necessary when working with
a patch file uploaded to bugs.python.org [#b.p.o]_. Here the linear
history will be kept implicitly, but the commands will need to make
sure to keep/add attribution.
How to handle the Misc/NEWS file
--------------------------------
There are two competing approaches to handling
``Misc/NEWS`` [#news-file]_. One is to add a news entry for issues on
bugs.python.org [#b.p.o]_. This would mean an issue that is marked
as "resolved" could not be closed until a news entry is added in the
"news" field in the issue tracker. The benefit of tying the news
entry to the issue is it makes sure that all changes worthy of a news
entry have an accompanying issue. It also makes classifying a news
entry automatic thanks to the Component field of the issue. The
Versions field of the issue also ties the news entry to which Python
releases were affected. A script would be written to query
bugs.python.org for relevant news entries for a release and to produce
the output needed to be checked into the code repository. This
approach is agnostic to whether a commit was done by CLI or bot.
The competing approach is to use an individual file per news entry,
containing the text for the entry. In this scenario each feature
release would have its own directory for news entries and a separate
file would be created in that directory that was either named after
the issue it closed or a timestamp value (which prevents collisions).
Merges across branches would have no issue as the news entry file
would still be uniquely named and in the directory of the latest
version that contained the fix. A script would collect all news entry
files no matter what directory they reside in and create an
appropriate news file (the release directory can be ignored as the
mere fact that the file exists is enough to represent that the entry
belongs to the release). Classification can either be done by keyword
in the news entry file itself or by using subdirectories representing
each news entry classification in each release directory (or
classification of news entries could be dropped since critical
information is captured by the "What's New" documents which are
organized). The benefit of this approach is that it keeps the changes
with the code that was actually changed. It also ties the message to
being part of the commit which introduced the change. For a commit
made through the CLI, a script will be provided to help generate the
file. In a bot-driven scenario, the merge bot will have a way to
specify a specific news entry and create the file as part of its
flattened commit (while most likely also supporting using the first
line of the commit message if no specific news entry was specified).
Code for this approach has been written previously for the Mercurial
workflow at http://bugs.python.org/issue18967. There is also a tool
from the community at https://pypi.python.org/pypi/towncrier.
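To make the file-per-entry approach more concrete, here is a minimal sketch of the collection script. The directory layout ``<news root>/<release>/<classification>/<entry file>`` and the function names are assumptions of this sketch; the text above leaves classification open (keyword, subdirectory, or none), and this example simply picks the subdirectory variant.

```python
from pathlib import Path


def collect_news_entries(news_root):
    """Gather every news entry file under news_root, grouped by classification.

    Assumed (hypothetical) layout:
        <news_root>/<release>/<classification>/<entry file>
    The release directory is ignored, as described above.
    """
    entries = {}
    for path in sorted(Path(news_root).glob("*/*/*")):
        if path.is_file():
            category = path.parent.name
            text = path.read_text(encoding="utf-8").strip()
            entries.setdefault(category, []).append(text)
    return entries


def render_news(entries):
    """Render the collected entries as a simple sectioned news file."""
    lines = []
    for category in sorted(entries):
        lines.append(category)
        lines.append("-" * len(category))
        lines.extend(f"- {text}" for text in entries[category])
        lines.append("")
    return "\n".join(lines)
```

Because each entry lives in its own uniquely named file, cherry-picking a fix between branches never conflicts on the news file itself; the combined file is only produced when a release is cut.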
Naming the commit bot
---------------------
As naming things can lead to bikeshedding of epic proportions, Brett
Cannon will choose the final name of the commit bot (the name of the
project for the bot itself can be anything, this is purely for the
name used in giving commands to the bot). The name will come from
Monty Python, which is only fitting since Python is named after the
comedy troupe. It will most likely come from
'Monty Python and the Holy Grail' [#holy-grail]_ (which happens to be
how Brett was introduced to Monty Python). Current ideas on the name
include:
"Black Knight" sketch [#black-knight-sketch]_:
* black-knight
* none-shall-pass
* just-a-flesh-wound
"Bridge of Death" sketch [#bridge-of-death-sketch]_:
* bridge-keeper
* man-from-scene-24
* five-questions
* what-is-your-quest
* blue-no-green
* air-speed-velocity
* your-favourite-colour
(and that specific spelling; Monty Python is British, after all)
"Killer rabbit" sketch [#killer-rabbit-sketch]_:
* killer-rabbit
* holy-hand-grenade
* 5-is-right-out
"Witch Village" sketch [#witch-village-sketch]_:
* made-of-wood
* burn-her
"French Taunter" sketch [#french-taunter-sketch]_:
* elderberries
* kanigget
"Constitutional Peasants" sketch [#constitutional-peasants-sketch]_:
* dennis
* from-the-masses
* watery-tart
"Knights Who Say Ni" sketch [#ni-sketch]_:
* shrubbery
* ni
From "Monty Python and the Holy Grail" in general:
* brave-sir-robin
Choosing a CI service
---------------------
There are various CI services that provide free support for open
source projects hosted on GitHub [#github]_. Two such examples are
Travis [#travis]_ and Codeship [#codeship]_. Whatever solution is
chosen will need to not time out in the time it takes to execute
Python's test suite. It should optimally provide access to multiple C
compilers for more thorough testing. Network access is also
beneficial.
The current CI service for Python is Pypatcher [#pypatcher]_. A
request can be made in IRC to try a patch from
bugs.python.org [#b.p.o]_. The results can be viewed at
https://ci.centos.org/job/cPython-build-patch/ .
Choosing a test coverage service
--------------------------------
Basic test coverage of Python's standard library can be
obtained simply by using coverage.py [#coverage]_. Getting
thorough test coverage is actually quite tricky, with the details
outlined in the devinabox's README [#devinabox-repo]_. It would be
best if a service could be found that would allow for thorough test
coverage, but it might not be feasible.
Free test coverage services include Coveralls [#coveralls]_ and
Codecov [#codecov]_.
Rejected Ideas
==============
Separate Python 2 and Python 3 repositories
-------------------------------------------
It was discussed whether separate repositories for Python 2 and
Python 3 were desired. The thinking was that this would shrink the
overall repository size which benefits people with slow Internet
connections or small bandwidth caps.
In the end it was decided that it was easier logistically to simply
keep all of CPython's history in a single repository.
Commit multi-release changes in bugfix branch first
---------------------------------------------------
As the current development process has changes committed in the
oldest branch first and then merged up to the default branch, the
question came up as to whether this workflow should be perpetuated.
In the end it was decided that committing in the newest branch and
then cherrypicking changes into older branches would work best as
most people will instinctively work off the newest branch and it is a
more common workflow when using Git [#git]_.
Deriving ``Misc/NEWS`` from the commit logs
-------------------------------------------
As part of the discussion surrounding `Handling Misc/NEWS`_, the
suggestion has come up of deriving the file from the commit logs
themselves. In this scenario, the first line of a commit message would be
taken to represent the news entry for the change. Some heuristic to
tie in whether a change warranted a news entry would be used, e.g.,
whether an issue number is listed.
This idea has been rejected due to some core developers preferring to
write a news entry separate from the commit message. The argument is
that the first line of a commit message and a news entry have
different requirements in terms of brevity, what should be said,
etc.
References
==========
.. [#h.p.o] https://hg.python.org
.. [#GitHub] GitHub (https://github.com)
.. [#hg] Mercurial (https://www.mercurial-scm.org/)
.. [#git] Git (https://git-scm.com/)
.. [#b.p.o] https://bugs.python.org
.. [#rietveld] Rietveld (https://github.com/rietveld-codereview/rietveld)
.. [#gitlab] GitLab (https://about.gitlab.com/)
.. [#core-workflow] core-workflow mailing list (
https://mail.python.org/mailman/listinfo/core-workflow)
.. [#guido-keynote] Guido van Rossum's keynote at PyCon US (
https://www.youtube.com/watch?v=G-uKNd5TSBw)
.. [#reasons] Email to core-workflow outlining reasons why GitHub was
selected
(https://mail.python.org/pipermail/core-workflow/2016-January/000345.html
)
.. [#benchmarks-repo] Mercurial repository for the Unified Benchmark Suite
(https://hg.python.org/benchmarks/)
.. [#devinabox-repo] Mercurial repository for devinabox (
https://hg.python.org/devinabox/)
.. [#peps-repo] Mercurial repository of the Python Enhancement Proposals (
https://hg.python.org/peps/)
.. [#tracker-repo] bugs.python.org code repository (
https://hg.python.org/tracker/python-dev/)
.. [#devguide-repo] Mercurial repository for the Python Developer's Guide (
https://hg.python.org/devguide/)
.. [#cpython-repo] Mercurial repository for CPython (
https://hg.python.org/cpython/)
.. [#github-python-org] Python organization on GitHub (
https://github.com/python)
.. [#github-org-perms] GitHub repository permission levels
(
https://help.github.com/enterprise/2.4/user/articles/repository-permission-…
)
.. [#cla] Python Software Foundation Contributor Agreement (
https://www.python.org/psf/contrib/contrib-form/)
.. [#news-file] ``Misc/NEWS`` (
https://hg.python.org/cpython/file/default/Misc/NEWS)
.. [#acks-file] ``Misc/ACKS`` (
https://hg.python.org/cpython/file/default/Misc/ACKS)
.. [#devguide-merge-across-branches] Devguide instructions on how to merge
across branches
(
https://docs.python.org/devguide/committing.html#merging-between-different-…
)
.. [#pythocat] Pythocat (https://octodex.github.com/pythocat/)
.. [#tracker-plans] Wiki page for bugs.python.org feature development
(https://wiki.python.org/moin/TrackerDevelopmentPlanning)
.. [#black-knight-sketch] The "Black Knight" sketch from "Monty Python and
the Holy Grail"
(https://www.youtube.com/watch?v=dhRUe-gz690)
.. [#bridge-of-death-sketch] The "Bridge of Death" sketch from "Monty
Python and the Holy Grail"
(https://www.youtube.com/watch?v=cV0tCphFMr8)
.. [#holy-grail] "Monty Python and the Holy Grail" sketches
(https://www.youtube.com/playlist?list=PL-Qryc-SVnnu1MvN3r94Y9atpaRuIoGmp
)
.. [#killer-rabbit-sketch] "Killer rabbit" sketch from "Monty Python and
the Holy Grail"
(
https://www.youtube.com/watch?v=Nvs5pqf-DMA&list=PL-Qryc-SVnnu1MvN3r94Y9atp…
)
.. [#witch-village-sketch] "Witch Village" from "Monty Python and the Holy
Grail"
(
https://www.youtube.com/watch?v=k3jt5ibfRzw&list=PL-Qryc-SVnnu1MvN3r94Y9atp…
)
.. [#french-taunter-sketch] "French Taunter" from "Monty Python and the
Holy Grail"
(
https://www.youtube.com/watch?v=A8yjNbcKkNY&list=PL-Qryc-SVnnu1MvN3r94Y9atp…
)
.. [#constitutional-peasants-sketch] "Constitutional Peasants" from "Monty
Python and the Holy Grail"
(
https://www.youtube.com/watch?v=JvKIWjnEPNY&list=PL-Qryc-SVnnu1MvN3r94Y9atp…
)
.. [#ni-sketch] "Knights Who Say Ni" from "Monty Python and the Holy Grail"
(
https://www.youtube.com/watch?v=zIV4poUZAQo&list=PL-Qryc-SVnnu1MvN3r94Y9atp…
)
.. [#homu] Homu (http://homu.io/)
.. [#zuul] Zuul (http://docs.openstack.org/infra/zuul/)
.. [#travis] Travis (https://travis-ci.org/)
.. [#codeship] Codeship (https://codeship.com/)
.. [#coverage] coverage.py (https://pypi.python.org/pypi/coverage)
.. [#coveralls] Coveralls (https://coveralls.io/)
.. [#codecov] Codecov (https://codecov.io/)
.. [#pypatcher] Pypatcher (https://github.com/kushaldas/pypatcher)
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:
I'm developing it at
https://github.com/brettcannon/github-transition-pep/blob/master/pep-NNNN.r… .
I'm not posting it here as I'm still actively writing it. The only reason
I'm mentioning it now is because the migration plan has been very roughly
outlined, so if it looks like I'm missing something, please let me know.
So consider this the starting discussion of the PEP that will be the
hg.python.org -> GitHub transition PEP that I will be in charge of. Once we
have general agreement on the steps necessary I will start the actual PEP
and check it in, but I figure there's no point in having a skeleton PEP if we
can't agree on the skeleton. :) While I list steps influencing all the
repos, I want to focus on the ones stopping any repo from moving over for
now, expanding what we worry about to the cpython repo as we knock blockers
down until we move everything over and start adding GitHub perks.
The way I see it, we have 5 repos to move: devinabox, benchmarks, peps,
devguide, and cpython. I also think that's essentially the order we should
migrate them over. Some things will need to be universally handled before
we transition a single repo, while other things are only a blocker for some
of the repos.
Universal blockers
==================
There are four blockers that must be resolved before we even consider
moving a repo over. They can be solved in parallel, but they all need to
have a selected solution before we can move even the devinabox repo.
First, we need to decide how we are going to handle adding all the core
devs to GitHub. Are we simply going to add all of them to the python
organization, or do we want something like a specific python-dev team
that gets added to all of the relevant repos? Basically I'm not sure how
picky we want to be about the people who already have organization access
on GitHub about them implicitly getting access to the cpython repo at the
end of the day (and the same goes for any of the other repos in the python
organization). For tracking names, I figure we will create a file in the
devguide where people can check in their GitHub usernames and I can
manually add people as people add themselves to the file.
Second, CLA enforcement. As of right now people go to
https://www.python.org/psf/contrib/contrib-form/, fill in the form, and
then Ewa gets an email where she manually flips a flag in Roundup. If we
want to use a web hook to verify someone has signed a CLA then we need to
decide where the ground truth for CLAs is. Do we want to keep using
Roundup to manage CLA agreements and thus add a GitHub field in
bugs.python.org for people's profile and a web hook or bot that will signal
if someone has the flag flipped on bugs.python.org? Or is there some
prepackaged service that we can use that will keep track of this which
would cause us to not use Roundup (which might be easier, but depending on
the service require everyone to re-sign)? There's also the issue of
supporting people who want to submit code by uploading a patch to
bugs.python.org but not use GitHub. Either way I don't want to have to ask
everyone who submits a PR what their bugs.python.org username is and then
go check that manually.
Third, how do we want to do the repo conversions? We need to choose the
tool(s) and command(s) that we want to use. There was mention of wanting a
mapping from hg commit ID to git commit ID. If we have that we could have a
static bugs.python.org/commit/<ID> page that had the mapping embedded in
some JavaScript and if <ID> matched then we just forward them to the
corresponding GitHub commit page, otherwise just blindly forward to GitHub
and assume the ID is git-only, giving us a stable URL for commit web views.
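The lookup logic is simple enough to sketch. The idea above was to embed the mapping in JavaScript on a static page; the version below expresses the same logic in Python purely for illustration. The repository URL and the name ``commit_url`` are assumptions of this sketch, since the final location of the cpython repo on GitHub has not been settled here.

```python
# Hypothetical target; the eventual GitHub location of cpython is assumed.
GITHUB_COMMIT_URL = "https://github.com/python/cpython/commit/{}"


def commit_url(rev_id, hg_to_git):
    """Return the GitHub URL for a Mercurial or Git revision ID.

    hg_to_git: mapping from Mercurial commit IDs to Git commit IDs,
    produced during the repository conversion.
    """
    # Known hg IDs are translated; anything else is assumed to be git-only.
    git_id = hg_to_git.get(rev_id, rev_id)
    return GITHUB_COMMIT_URL.format(git_id)
```

An ID that is in the mapping is forwarded to its converted commit, and anything else is passed straight through, so a mistyped ID just ends up on GitHub's own "not found" page.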
Fourth, for the ancillary repos of devinabox, peps, benchmarks, and
devguide, do we care if we use the GitHub merge button for PRs or do we
want to enforce a linear history with all repos? We just need to decide if
we care about linear histories and then we can move forward since any bot we
create won't block us from using GitHub.
Those four things are enough to move devinabox over. It probably is enough
for the benchmarks suite, but I have an email to speed@ asking if people
want to use this opportunity to re-evaluate the benchmark suite and make
any changes that will affect repo size (e.g., use pip to pull in the
libraries and frameworks used by a benchmark rather than vendoring their
code, making the repo much smaller).
Website-related stuff
=====================
This also almost gets us the peps repo, but we do need to figure out how to
change the website to build from the git checkout rather than an hg one.
Same goes for the devguide. It would be great if we can set up web hooks to
immediately trigger rebuilds of those portions of the sites instead of
having to wait until a cronjob triggers.
CPython requirements
====================
There are six things to work out before we move over cpython. First, do we
want to split out Python 2 branches into their own repo? There might be a
clone size benefit which obviously is nice for people on slow Internet
connections. It also clearly separates out Python 2 from 3 and lets those
who prefer to focus on one compared to the other do that more easily. It
does potentially make any single fix that spans 2 and 3 require a bit more
work since it won't be an intra-clone change. We could also contemplate
sub-repos for things like the Doc/ or Tools/ directories (although I doubt
it's worth it).
Second, do we want all fixes to go into master and then cherry-pick into
other branches, or do we want to follow our current practice of going into
the active feature branch and then merging into master? I personally prefer
the former and I think most everyone else does as well, but I thought it
should be at least thought about.
Third, how to handle Misc/NEWS? We can add a NEWS field to bugs.python.org
and then generate the NEWS file by what issues are tied to what version and
when they were closed. The other approach is to create a file per NEWS
entry in a version-specific directory (Larry created code for hg already
for this to give people an idea: http://bugs.python.org/issue18967). Then
when we cut a release we run a tool that slurps up all of the relevant files
-- which includes files in the directory for the next feature release which
represent fixes which were cherry picked -- and generates the NEWS file for
the final release. The per-file approach is bot-friendly and also
CLI-friendly, but potentially requires more tooling and I don't know if
people feel news entries should be tied to the issue or in the repo
(although that assumes people find tweaking Roundup easy :).
Fourth, we need to decide exactly what commands we expect core devs to run
initially for committing code. Since we agreed to a linear history we need
to document exactly what we expect people to do for a PR to get it into
their git repo. This will go into the devguide -- probably will want to
start a github branch at some point -- and will act as the commands the bot
will want to work off of.
Fifth, what to do about Misc/ACKS? Since we are using PRs, even if we
flatten them, I believe the PR creators will get credit in the commit as
the author while the core dev committing will be flagged as the person
doing the merge (someone correct me if I'm wrong because if I am this whole
point is silly). With the commits containing credit directly, we can either
automatically generate Misc/ACKS like the NEWS file or simply drop it for
future contributors and just leave the file for past contributors since git
will have kept track for us.
Sixth, we will need to update our Buildbot fleet.
This gets us to the bare minimum needed to function.
Parity with hg.python.org
-------------------------
For parity, there are some Roundup integrations that will be necessary,
like auto-generating links, posting commits to #python-dev on IRC, etc. I'm
not sure if people want to block until that is all in place or not. I do
think we should make sure there is some web hook that can take an issue #
from the title of a PR and automatically post to the corresponding issue
on bugs.python.org that the PR exists. If people disagree then feel free to
say so.
Adding perks
============
Now we get to some added stuff that we never had on our own infrastructure.
:)
We should wire up CI for all PRs. I don't know if we want to go with
Travis, Codeship, or what CI provider, but we should definitely hook it up
and fully utilize the resource. This could even include running doctest
over the docs, making sure the reST markup is accurate, etc.
Do we need to set up a web hook to trigger website rebuilds? We should at
least have a mirror on Read the Docs that is triggered by web hook so that
we have a backup of the documentation (if possible; not sure how custom our
Sphinx setup is compared to what they require to work).
We should try to get test coverage wired up as well per CI. I don't know if
coveralls.io or some other provider is best, but we should see what is
available and find out if we can use them to either get basic coverage or
thorough coverage (read https://hg.python.org/devinabox/file/tip/README#l124 to
see what thorough coverage entails, but it does require a checkout of
coverage.py).
We should build a bot. It must use a Monty Python reference to trigger
(black-knight, none-shall-pass, man-from-scene-24, five-questions,
what-is-your-quest, what-is-your-favourite-colour, etc.; obviously I'm
leaning towards the Black Knight or Bridge of Death scenes from the Holy
Grail for inspiration since they deal with blocking you from doing
something). It should handle specifying the commit message, what branches
to commit to/cherry pick into, and a NEWS entry (if necessary). I don't
know if it needs to do anything else as a requirement. It should probably
implement a commit queue like Zuul or Homu (and both of those can be
considered as the basis of the bot). Gating commits on passing a test
run would probably also be good.
I'm sure we will want to use some labels and milestones to track what PRs
are for what versions, if they are blocked on something, etc.
---
Long email! :) I think that is my current brain dumped in email form. As I
said at the beginning, I think we should focus on what is blocking the
easiest repos first and then just keep knocking down blockers as we try to
move over more repos.
Standard library separation from core (was Re: My initial thoughts on the steps/blockers of the transition)
by Nick Coghlan 05 Jan '16
On 5 January 2016 at 12:50, Nicholas Chammas <nicholas.chammas(a)gmail.com> wrote:
> Something else to consider. We’ve long talked about splitting out the stdlib
> to make it easier for the alternative implementations to import. If some or
> all of them also switch to git, we could do that pretty easily with git
> submodules.
>
> Not to derail here, but wasn’t there a discussion (perhaps on python-ideas)
> about slowly moving to a model where we distribute a barebones Python
> “core”, allowing the standard modules to be updated and released on a more
> frequent cycle? Would this be one small step towards such a model?
That discussion has been going on for years :)
The most extensive elaboration is in the related PEPs:
PEP 407 considered the idea of distinguishing normal releases and LTS
releases: https://www.python.org/dev/peps/pep-0407/
PEP 413 considered decoupling standard library versions from language
versions: https://www.python.org/dev/peps/pep-0413/
The ripple effect of either proposal on the wider community would have
been huge though, hence why 407 is Deferred and 413 Withdrawn.
Instead, the main step which has been taken (driven in no small part
by the Python 3 transition) is the creation of PyPI counterparts for
modules that see substantial updates that are backwards compatible
with earlier versions (importlib2, for example, lets you use the
Python 3 import system in Python 2). Shipping pip by default with the
interpreter runtime is also pushing people more towards the notion
that "if you're limiting yourself to the standard library, you're
experiencing only a fraction of what the Python ecosystem has to offer
you".
We don't currently do a great job of making those libraries
*discoverable* by end users, but they're available if you know to look
for them (there's an incomplete list at
https://wiki.python.org/moin/Python2orPython3#Supporting_Python_2_and_Pytho…
)
pip's inclusion was also the first instance of CPython shipping a
*bundled* library that isn't maintained through the CPython
development process - each new maintenance release of CPython ships
the latest upstream version of pip, rather than being locked to the
version of pip that shipped with the corresponding x.y.0 release.
Cheers,
Nick.
--
Nick Coghlan | ncoghlan(a)gmail.com | Brisbane, Australia
03 Jan '16
(Note: near term, this probably isn't a useful idea, as the current
GitHub & GitLab proposals aren't proposing the introduction of merge
gating, merely advisory testing of PRs, so folks will remain free to
merge patches locally and push them up to the master repository.
However, I think pre-merge testing is a good idea long term, with
approving our own PRs taking the place of pushing directly to the
development and maintenance branches)
I recently came across an old article [1] by Rust's Graydon Hoare
regarding their approach to pre-merge testing of commits using GitHub
PRs and Buildbot. Following up on that brought me to a more up to date
explanation of their current bot setup [2], and the http://homu.io/
service for running it against a project without hosting your own
instance (in our case, if we did run a service like Homu, we'd likely
still want to host our own instance, as Mozilla do for Rust and
Servo).
Similar to OpenStack's Zuul, Homu serialises commits, ensuring that
the test suite is passing *before* code lands on the target branch.
The relevant differences are that Homu is designed to use GitHub PRs
rather than Gerrit reviews as the event source, and Travis CI and
Buildbot rather than Jenkins as the CI backends for actually running
the integration tests. (Homu is also currently missing Zuul's
"speculative test pipeline" feature, which allows it to test multiple
commits in parallel on the assumption that most of them will pass due
to prior testing in isolation, but has a separate feature that allows
PRs to be tagged for inclusion in "rollup" commits that land several
changes at once after testing them as a group)
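The serialisation idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how Homu or Zuul are actually implemented: the branch is modelled as a list of change names, and `process_queue`, `run_tests`, and the change names are all hypothetical.

```python
from collections import deque


def process_queue(head, queue, run_tests):
    """Land queued changes one at a time, testing each against the current head.

    head: list of change names already on the target branch.
    queue: iterable of approved change names, in approval order.
    run_tests: callable taking the candidate branch state, True on pass.
    Returns (new_head, rejected).
    """
    head = list(head)
    rejected = []
    pending = deque(queue)
    while pending:
        change = pending.popleft()
        candidate = head + [change]  # the merged result, not the change alone
        if run_tests(candidate):
            head = candidate  # tests passed on the merged state, so it lands
        else:
            rejected.append(change)  # the target branch is left untouched
    return head, rejected


def run_tests(state):
    # Toy test suite: these two changes each pass alone but break together.
    return not ({"rename-api", "use-old-api"} <= set(state))


new_head, rejected = process_queue(
    ["base"], ["rename-api", "use-old-api", "fix-docs"], run_tests
)
print(new_head, rejected)
```

Because each candidate is tested on top of whatever has already landed, the second change is rejected even though it would pass against the original `base` on its own, which is the situation advisory PR testing alone cannot catch.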
The current merge queue for Rust itself can be seen at
http://buildbot.rust-lang.org/homu/queue/rust
Regards,
Nick.
[1] http://graydon2.dreamwidth.org/1597.html
[2] http://huonw.github.io/blog/2015/03/rust-infrastructure-can-be-your-infrast…
--
Nick Coghlan | ncoghlan(a)gmail.com | Brisbane, Australia
3
7
I don't think this will be a shock to anyone who has followed the
discussion on this list. The decision is essentially based on:
1. No major distinguishing features between GitHub and GitLab
2. Familiarity amongst core devs -- and external contributors -- with
GitHub
3. Guido prefers GitHub
Neither platform had some mind-blowing feature(s) that really made it
stand out from the other such that it would greatly simplify our lives if
we chose one platform over another. I obviously was really hoping there was
going to be something I missed, but nothing ever came up (and no, being
open source is not enough of a feature; as I said when I started this
process, being open source would help break a tie or overcome a minor lead
by one tool, but it would not be a deciding factor).
But what GitHub does have over GitLab is familiarity. While there were
people who publicly said they would prefer not to go with GitHub but would
begrudgingly use it if we chose to go that route, I had multiple core devs
email me privately saying they hoped I would choose GitHub. I think most of
that stemmed from having used GitHub for other open source projects and/or
work, making even dormant core devs say they would be able to become active
again if we switched to GitHub thanks to eliminating the barrier of having
to keep up with our custom workflow for code reviews and using hg for
commits. And while I said it wasn't a goal to make things easier for
external contributors, I also can't ignore the fact that the vast majority
of people out there who might want to help out are already familiar with
GitHub.
And at least for me, the fact Guido prefers GitHub means something. While
Guido himself would say I shouldn't really worry about his preferences
since he is only an occasional contributor at this point, I believe that
it's important that our BDFL actually like contributing to his own
programming language rather than potentially alienating him because he
finds the process burdensome.
So that's why I have chosen GitHub over GitLab. Please realize that this is
choosing GitHub to provide repository hosting and code review; we are not
moving our issue tracker, nor are we moving our wiki. And the long-term
plan is to set up a bot that will handle our commit workflow, which will
help isolate us from whatever repository hosting platform we are on and make
moving easier in the future (in the short term people will use the
command line, and that's totally platform-agnostic).
Thanks to everyone who contributed to this decision, especially Donald,
Barry, and Nick for making the proposals we had to work from.
We can start the discussion of how we want to handle the transition next
week, but I'm going to try and step away from this whole workflow topic
until Monday so I can spend the last couple of days of my vacation not
thinking about this stuff. :)
13
27
I don't know if they are going to ask to come under the python organization
or go do their own thing, but I'm going to make sure they know what's going
on over here.
1
0