[Python-Dev] Primer on distributed revision control?
Stephen J. Turnbull
stephen at xemacs.org
Sat Mar 22 03:55:57 CET 2008
skip at pobox.com writes:
> With all these distributed revision control systems now available (bzr, hg,
> darcs, svk, many more), I find I need an introduction to the concepts and
> advantages of repository distribution.
> Can someone point me to some useful content (web pages or books)
> which will help me wrap my brain around the ideas? Maybe a
> compare/contrast of the major players?
Others have mentioned a bunch of resources. They're all OK, and
should reassure you that dVCS is not all that different from what
you're used to. Here I'll post some comments that as far as I know
are not in any of the existing resources.
One caveat while reading them: be very careful with any command that
involves receiving changes from a remote repo and updating your
workspace. Terms like "get", "fetch", "pull", "update",
"clone", and even "checkout" are commonly used to describe these
operations, but the actual semantics of the commands named by those
terms differ wildly. For example, in git, "fetch" receives the
metadata and new content, while "pull" does a fetch, then attempts to
update the workspace and performs 3-way merges. In Mercurial, "pull"
and "fetch" have the opposite semantics.
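To make the distinction concrete, here is a toy model in Python. It is
only a sketch: the Repo class and its methods are hypothetical and are
not the API of git, Mercurial, or any other tool, and no real merging is
modeled. The method names follow the git usage described above; per the
paragraph above, Mercurial attaches the names "pull" and "fetch" the
other way round.

```python
# Toy model only: Repo and its methods are illustrative, not a real VCS API.

class Repo:
    def __init__(self, history=None):
        self.history = list(history or [])   # recorded revisions, oldest first
        self.workspace = self.history[-1] if self.history else None

    def fetch(self, remote):
        """Receive revisions we lack; leave the workspace untouched."""
        for rev in remote.history:
            if rev not in self.history:
                self.history.append(rev)

    def update(self):
        """Bring the workspace up to the newest recorded revision."""
        if self.history:
            self.workspace = self.history[-1]

    def pull(self, remote):
        """git-style pull: fetch, then update the workspace."""
        self.fetch(remote)
        self.update()


upstream = Repo(["r1", "r2", "r3"])
mine = Repo(["r1"])

mine.fetch(upstream)
print(mine.history, mine.workspace)   # history grew, workspace did not move

mine.pull(upstream)
print(mine.workspace)                 # workspace now follows the new history
```

The point of the sketch is only that "receive" and "update the
workspace" are two separable steps, whatever a given tool calls them.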
Also, note that while CVS and Subversion automatically do a merge when
updating, the dVCSes all make this distinction between fetching new
content and merging it into your workspace. As already described,
some of them normally combine the operations, others tend to ask the
user to do them "by hand", but all provide both ways to do it. (That
should send a shiver down your Pythonic spine!)
Another one is that in git and Mercurial, "clone" replicates a repo,
while "checkout" switches between branches in a single workspace. In
bzr, "clone" is an alias for "checkout", and normally creates a new
> It's not obvious how I push changes back upstream.
It is very important to remember that in centralized systems like CVS
or Subversion, committing *is* pushing, while in a dVCS these
operations are separate. In many projects the workflow is that you
*record* your changes in a local repo, then you *push* them to a
shared repo. AFAIK all of the contenders do call the record operation
"commit", and the publish operation "push". (Other workflows you
will hear about are based on mailing patches -- eg, Linux, while yet
others are based on gatekeepers pulling from your repo -- Arch
developers tend to favor this. Don't worry about them; these are not
going to be relevant to Python for a while.)
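The commit/push split can be sketched in a few lines of Python. Again
this is a toy model under stated assumptions: SharedRepo,
CentralizedCheckout and LocalRepo are made-up names for illustration,
not part of any real tool.

```python
# Toy model only; all class names here are hypothetical.

class SharedRepo:
    """The repository other developers can see."""
    def __init__(self):
        self.revisions = []


class CentralizedCheckout:
    """CVS/Subversion style: committing *is* publishing."""
    def __init__(self, server):
        self.server = server

    def commit(self, change):
        self.server.revisions.append(change)


class LocalRepo:
    """dVCS style: commit records locally, push publishes later."""
    def __init__(self):
        self.revisions = []

    def commit(self, change):
        self.revisions.append(change)

    def push(self, shared):
        for rev in self.revisions:
            if rev not in shared.revisions:
                shared.revisions.append(rev)


shared = SharedRepo()
local = LocalRepo()
local.commit("fix typo")
local.commit("add test")
print(shared.revisions)   # nothing has been published yet

local.push(shared)
print(shared.revisions)   # both recorded changes are now visible to others
```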
From a participating developer's point of view, Eric Raymond's
in-progress survey captures this aspect pretty well as a distinction
between "update before record" and "merge after record" (my terms, not
his). The point is that in CVS or Subversion, you *cannot* commit to
mainline if you haven't merged in all concurrent changes to mainline.
In a distributed VCS, on the other hand, you record your changes
locally at will, then merge revisions at your convenience, finally
pushing them upstream. It sounds like a small difference, but it's
actually amazingly liberating.
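Here is a rough Python sketch of the two orderings. It is a deliberate
simplification under my own assumptions (Central, Local and OutOfDate
are hypothetical names, history is a flat list, and "merge" just picks
up the upstream log); real tools track this per file or per revision
graph.

```python
# Toy model only; Central, Local and OutOfDate are hypothetical names.

class OutOfDate(Exception):
    pass


class Central:
    """Update before record: a stale checkout may not commit."""
    def __init__(self):
        self.log = []

    def commit(self, base_len, change):
        if base_len != len(self.log):
            raise OutOfDate("merge concurrent changes before committing")
        self.log.append(change)


class Local:
    """Merge after record: commit at will, merge and push later."""
    def __init__(self, upstream_log):
        self.base = list(upstream_log)   # upstream as last seen
        self.mine = []                   # locally recorded changes

    def commit(self, change):
        self.mine.append(change)         # never refused

    def merge(self, upstream_log):
        self.base = list(upstream_log)   # pick up concurrent work

    def push(self, upstream_log):
        if self.base != upstream_log:
            raise OutOfDate("merge before pushing")
        upstream_log.extend(self.mine)
        self.base = list(upstream_log)
        self.mine = []


central = Central()
base = len(central.log)                  # my checkout is at this point
central.commit(len(central.log), "someone else's change")
try:
    central.commit(base, "my change")    # refused: my checkout is stale
except OutOfDate as exc:
    print("centralized:", exc)

upstream = ["earlier change"]
local = Local(upstream)
upstream.append("concurrent change")     # mainline moves on without me
local.commit("my change 1")              # recording still works
local.commit("my change 2")
local.merge(upstream)                    # merge when convenient
local.push(upstream)
print(upstream)
```

In the centralized half the merge is forced before the commit; in the
distributed half it is only forced before the push, so the recording of
your own work is never blocked by what others have done.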
Otherwise, it doesn't need to be any different from your current
workflow, and I doubt that it will be until the most active Python
developers start to feel a need for it.
> It seems to me that it has the potential for leading to anarchy,
git is *designed* around Linus's ability to manage chaos, but because
of Linus, there is no anarchy. The same will be true for Python
(although the personalities of the leading developers are different,
so the details of why no anarchy will differ). A more optimistic way
to put your point is that dVCSes have a great potential for
In general, you'll always have a pretty good idea where you want to
pull from, and the gatekeepers will tell you whether and where you may push.
 Darcs uses "record" for the record operation, but Darcs is highly
unlikely to become the Python dVCS of choice.