[Python-checkins] r72915 - peps/trunk/pep-0385.txt

dirkjan.ochtman python-checkins at python.org
Mon May 25 16:53:48 CEST 2009


Author: dirkjan.ochtman
Date: Mon May 25 16:53:48 2009
New Revision: 72915

Log:
PEP 385: Migrating to Mercurial (initial version).

Added:
   peps/trunk/pep-0385.txt

Added: peps/trunk/pep-0385.txt
==============================================================================
--- (empty file)
+++ peps/trunk/pep-0385.txt	Mon May 25 16:53:48 2009
@@ -0,0 +1,167 @@
+PEP: 385
+Title: Migrating from svn to Mercurial
+Version: $Revision: 72563 $
+Last-Modified: $Date: 2009-05-11 14:50:03 +0200 (Mon, 11 May 2009) $
+Author: Dirkjan Ochtman <dirkjan at ochtman.nl>
+Status: Active
+Type: Process
+Content-Type: text/x-rst
+Created: 25-May-2009
+
+.. warning::
+   This PEP is in the draft stages.
+
+
+Motivation
+==========
+
+Now that the decision to switch to the Mercurial DVCS has been made, the
+actual migration still has to be performed. For an important piece of
+infrastructure like the version control system of a large, distributed
+project like Python, this is a significant effort. This PEP attempts to
+describe the steps that must be taken, as a basis for further discussion.
+It is analogous to `PEP 347`_, which discussed the migration to SVN.
+
+To make the most of hg, I (Dirkjan) would like to make a high-fidelity
+conversion, such that (a) as much of the svn metadata as possible is
+retained, and (b) all metadata is converted to formats that are common in
+Mercurial. This way, tools written for Mercurial can be optimally used. In
+order to do this, I want to use the `hgsubversion`_ software to do an initial
+conversion. This hg extension is focused on providing high-quality conversion
+from Subversion to Mercurial for use in two-way correspondence, meaning it
+doesn't throw away as much available metadata as other solutions.
+
+Such a conversion also seems like a good time to reconsider the contents of
+the repository and determine if some things are still valuable. In this spirit,
+the following sections also propose discarding some of the older metadata.
+
+.. _PEP 347: http://www.python.org/dev/peps/pep-0347/
+.. _hgsubversion: http://bitbucket.org/durin42/hgsubversion/
+
+
+Transition plan
+===============
+
+Branch strategy
+---------------
+
+Mercurial has two basic ways of using branches: cloned branches, where each
+branch is kept in a separate directory, and named branches, where each revision
+carries metadata recording which branch it belongs to. The former makes it
+easier to distinguish branches, at the expense of requiring more disk space on
+the client. The latter makes it a little easier to switch between branches, but
+often has somewhat unintuitive results for people (though this has been
+getting better in recent versions of Mercurial).
+
+For Python, I think it would work well to have cloned branches and keep most
+things separate. This is predicated on the assumption that most people work on
+just one (or maybe two) branches at a time. Branches can be exposed separately,
+though I would advocate merging old (and tagged!) branches into mainline so
+that people can easily revert to older releases. At what age of a release this
+should be done can be debated (a natural point might be when the branch gets
+unsupported, e.g. 2.4 at the release of 2.6).
+
+Converting branches
+-------------------
+
+There are quite a lot of branches in SVN's branches directory. I propose to
+clean this up a bit by employing the following strategy:
+
+* Keep all release (maintenance) branches
+* Discard branches that haven't been touched in 18 months, unless someone
+  indicates there's still interest in such a branch
+* Keep branches that have been touched in the last 18 months, unless someone
+  indicates the branch can be deprecated
+
+Converting tags
+---------------
+
+The SVN tags directory contains a lot of old stuff. Some of these are not, in
+fact, full tags, but contain only a smaller subset of the repository. I think
+we should keep all release tags, and consider other tags for inclusion based
+on requests from the developer community. I'd like to consider unifying the
+release tag naming scheme to make some things more consistent, if people feel
+that won't create too many problems.
+
+Author map
+----------
+
+In order to provide user names in the form that is common in hg ('First Last
+<user at example.org>'), we need an author map to map cvs and svn user
+names to real names and their email addresses. I have a complete version of such
+a map in my `migration tools repository`_. Some of the email addresses in it
+are bound to be out of date, so it would be nice to have as many people as
+possible review it. The current version also still seems to contain some
+encoding problems.
+
+.. _migration tools repository: http://hg.xavamedia.nl/cpython/pymigr/
+
+Generating .hgignore
+--------------------
+
+The .hgignore file can be used in Mercurial repositories to help ignore files
+that are not eligible for version control. It does this by employing several
+possible forms of pattern matching. The current Python repository already
+includes a rudimentary .hgignore file to help with using the hg mirrors.
+
+It might be useful to have the .hgignore be generated automatically from
+svn:ignore properties. This would make sure all historic revisions also have
+useful ignore information (though one could argue ignoring isn't really
+relevant to just checking out an old revision).
+
+Revlog reordering
+-----------------
+
+As an optional optimization technique, we should consider trying a reordering
+pass on the revlogs (internal Mercurial files) resulting from the conversion.
+In some cases this results in dramatic decreases in on-disk repository size.
+
+Other repositories
+------------------
+
+Richard Tew has indicated that he'd like the Stackless repository to also be
+converted. What other projects in the svn.python.org repository should be
+converted? Do we want to convert the peps repository? distutils? others?
+
+
+Infrastructure
+==============
+
+hg-ssh
+------
+
+Developers should access the repositories through ssh, similar to the current
+setup. Public keys can be used to grant people access to a shared hg@ account.
+An hgwebdir instance should also be set up for easy browsing and read-only
+access. Some facility for sandboxes/incubator repositories could be discussed.
+
+Hooks
+-----
+
+A number of hooks are currently in use. The hg equivalents for these should be
+developed and deployed. The following hooks are being used:
+
+* check whitespace: a hook to reject commits in case the whitespace doesn't
+  match the rules for the Python codebase. Should be straightforward to
+  re-implement from the current version. Open issue: do we check only the tip
+  after each push, or do we check every commit in a changegroup?
+
+* commit mails: we can leverage the notify extension for this
+
+* buildbots: both the regular and the community build masters must be notified.
+  Fortunately buildbot includes support for hg. I've also implemented this for
+  Mercurial itself, so I don't expect problems here.
+
+* check contributors: in the current setup, all changesets bear the username of
+  committers, who must have signed the contributor agreement. In a DVCS, the
+  committers are not necessarily the same people who push, so knowing who
+  pushed doesn't tell us whether the committer is a contributor. If we keep a
+  list of registered contributors, a hook could check committers against it.
+
+hgwebdir
+--------
+
+A more or less stock hgwebdir installation should be set up. We might want to
+come up with a style to match the Python website. It may also be useful to
+build a quick extension to augment the URL rev parser so that it can also take
+r[0-9]+ args and come up with the matching hg revision.
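
The r[0-9]+ lookup suggested in the hgwebdir section above could be sketched
roughly as follows. This is only an illustration and not part of the PEP: the
function name ``resolve_rev`` and the ``svn_to_hg`` mapping are hypothetical,
and the sketch assumes that a table from svn revision numbers to hg changeset
ids was recorded during the conversion. It does not use any hgweb API.

```python
import re

# Matches Subversion-style revision arguments such as "r72915".
SVN_REV = re.compile(r"r([0-9]+)")


def resolve_rev(arg, svn_to_hg):
    """Translate an 'rNNN' argument into an hg changeset id.

    svn_to_hg is a hypothetical dict {svn revision number: hg changeset id},
    assumed to have been recorded during the conversion. Any argument that
    is not of the form r[0-9]+ is returned unchanged, so that the normal hg
    revision lookup can handle it.
    """
    match = SVN_REV.fullmatch(arg)
    if match is None:
        return arg
    svn_rev = int(match.group(1))
    try:
        return svn_to_hg[svn_rev]
    except KeyError:
        raise LookupError(
            "no hg changeset recorded for svn revision r%d" % svn_rev
        ) from None


# Usage with a made-up changeset id:
example_map = {72915: "0123456789ab"}
print(resolve_rev("r72915", example_map))  # prints 0123456789ab
print(resolve_rev("tip", example_map))     # prints tip
```

Note that a tag or branch name that itself matches r[0-9]+ would be shadowed
by this lookup; the sketch makes no attempt to resolve that ambiguity.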

