[Python-Dev] Fwd: Distributed RCS
Guido van Rossum
gvanrossum at gmail.com
Sun Aug 14 02:11:05 CEST 2005
Another fwd, describing how Steve Alexander's group uses bazaar.
--Guido van Rossum (home page: http://www.python.org/~guido/)
---------- Forwarded message ----------
From: Steve Alexander <steve at canonical.com>
Date: Aug 12, 2005 4:00 PM
Subject: Re: Distributed RCS
To: Guido van Rossum <gvanrossum at gmail.com>
Cc: Mark Shuttleworth <mark at canonical.com>, Martin Pool
<mbp at canonical.com>, Fredrik Lundh <fredrik at pythonware.com>
I'm not going to post to python-dev just now, because I'm leaving on 1.5
weeks vacation tomorrow, and I'd rather be absent than unable to answer
Martin Pool will be around next week, and will be able to take part in
discussions on the list.
Feel free to post all or part of Mark's or my emails to the python lists.
> I hope that puts bazaar into perspective for you. Give it a spin - the
> development 2.x codebase is robust enough now to handle a line of
> development and do basic merging, we are switching our own development
> to the pre-release 2.x line in October, and we will switch over all the
> public archives we maintain in around March next year.
A large part of the internal development at Canonical is the Launchpad
system. This is about 30-40 kloc of Python code, including various
Twisted services, cron scripts, a Zope 3 web application, database
It's being worked on by 20 software developers. Everyone uses bazaar
1.4 or 1.5, and around October, we'll be switching to use bazaar 2.x.
I'll describe how we work on Launchpad using Bazaar. This is all from
the Bazaar 1.x perspective, and some things will become simpler when we
change to using Bazaar 2.x.
I've left the description quite long, as I hope it will give you some of
the flavour of working with a distributed RCS.
== Two modes of working: shared branches and PQM ==
Bazaar supports two different modes of working for a group like the
1. There's a shared read/write place that all the developers have access
to. This contains the branches we release from, and represents the
"trunk" of the codebase.
2. A "virtual person" called the "patch queue manager" (PQM) has
exclusive write access to a collection of branches. PQM takes
instructions as GPG signed emails from launchpad developers, to merge
their code into PQM's branches.
We use the latter mode because we have PQM configured not only to accept
requests to merge code into PQM's codebase, but to run all the tests
first and refuse to merge if any test fails.
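The test-gated merge described above can be sketched roughly as follows. This is only an illustration and not PQM's actual code: the function name, the list-of-changes model of a branch, and the run_tests callback are all assumptions made for the example.

```python
from typing import Callable, List


def gated_merge(trunk: List[str], branch: List[str],
                run_tests: Callable[[List[str]], bool]) -> List[str]:
    """Return the merged trunk if its tests pass, otherwise the old trunk.

    Hypothetical model: a branch is just an ordered list of change names.
    """
    # Stand-in for a real merge: append the changes the trunk lacks.
    candidate = trunk + [change for change in branch if change not in trunk]
    # Run the whole test suite on the merged result before accepting it.
    if run_tests(candidate):
        return candidate
    # Any failure means the merge is refused and the trunk is unchanged.
    return trunk
```

The point of the design is that the trunk only ever moves to a state that has already passed the full test suite.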
== The typical flow of work on Launchpad ==
Say I want to work on some new feature for Launchpad. What do I do?
1. I use 'baz switch' to change my working tree from whatever I was
working on last, and make it become PQM's latest code.
baz switch rocketfuel at canonical.com/launchpad--devel--0
"rocketfuel" is the code-name for the branches we release our
code from. PQM manages the rocketfuel branches. In Bazaar 1.x,
collections of branches are called "archives" and are identified
by an email address plus some other optional information.
So, "rocketfuel at canonical.com" is PQM's email address.
"launchpad--devel--0" is simply the name of the main launchpad
branch. The format of branch names is very strict in Bazaar 1.x.
It is much more open in Bazaar 2.x.
2. I use 'baz branch' to create my own branch of this code that I can
commit changes to.
baz branch steve.alexander at canonical.com/launchpad--ImproveLogins--0
My archive is called "steve.alexander at canonical.com". The branch
will be used to work on the login functionality of Launchpad, so
I have named the branch "launchpad--ImproveLogins--0".
3. I hack on the code, and from time to time commit my changes. I need
to 'baz add' new files and directories, and 'baz rm' to remove files,
and 'baz mv' to move files around.
# hack hack hack
baz commit -s "Refactored the whatever.py module."
# hack hack hack
baz del whatever_deprecated.py
baz commit -s "Removed deprecated whatevers."
# hack hack hack
4. Let's say I hacked on some stuff, but I didn't commit it. I don't
like what I did, and I want to start again.
# hack hack hack
baz undo
'baz undo' puts the source code back into the state it was in after the
last commit, and puts the changes somewhere. If I change my mind again,
I can say 'baz redo', and get my changes back.
5. All this hacking and committing has been happening on my own
workstation, without a connection to the internet. Perhaps I've been on
a plane or at a cafe. When I have a connection again, I can make my
work available for others to see by mirroring my code to a central
location. Each Launchpad developer has a mirror of the archive they use
for Launchpad work on a central machine at the Canonical data centre.
In our case, the mirror command uses sftp to copy the latest changes I
have made into the mirror on this central server.
6. Because we have a strict code review process for Launchpad
development, I can't (or rather, shouldn't) submit my changes to PQM
yet. I should get it reviewed. But, let's say Andrew wants to do some
work that depends on my work, before my work has made its way into PQM's
rocketfuel "Trunk". He can simply merge from me.
# in Andrew's working tree, on his workstation.
baz merge steve.alexander at canonical.com/launchpad--ImproveLogins--0
baz commit -s "Merged steve's ImproveLogins work."
When Andrew eventually gets his work reviewed, and sends it on to PQM
to be merged into Rocketfuel, the Bazaar merging algorithms will work
out that Andrew merged from me, and will sort things out. Of course,
there can be conflicts when people have worked in divergent ways on the
same code. These are resolved in a similar way to CVS or SVN.
7. I want to get my code reviewed by a member of the review team. I add
the details of my branch to the PendingReviews page on the launchpad
development wiki. This wiki is publicly readable.
There is a script that periodically reads the PendingReviews page,
attempts to merge the branches listed there into rocketfuel (just as PQM
would do), and produces a diff for use by the review team. The diff
represents what changes would be made to the rocketfuel Trunk were the
branch in question to be sent to PQM. This diff is often enough for the
reviewers to work with. If they need to see more context, they can
simply check out the branch in question using 'baz get branchname'.
The script also highlights whether there were any conflicts that would
prevent a merge, and gives an indication of the size of the change.
The script's output is accessible only to Launchpad developers.
However, I've made a couple of screenshots to give you some idea of what
it looks like.
This is the summary page, that uses information taken from the
PendingReviews wiki page.
This is a typical diff representing what is to be merged.
The reviewer sends an email to the author of the code, cc the
launchpad-reviews mailing list. The review email typically has sections
of code included, each line prefixed with '> ', with comments, questions
and requests for improvement beneath each section of code. The reviewer
will either approve the code for merging, approve the code providing
certain remedial actions are taken, or reject the code, requiring a new
8. My code has been successfully reviewed by JamesH, so I send a signed
mail to PQM asking to merge my work into rocketfuel.
submit-merge "r=JamesH, Improvements to logging in." pqm at pqm.ubuntu.com
PQM checks that each merge request has r=someone in the message, as a
reminder that launchpad developers need to have their code reviewed.
The submit-merge script takes the archive name, the branch name,
and the "patch level" that the branch is at, composes an email saying
"pqm, please merge
steve.alexander at canonical.com/launchpad--ImproveLogins--0--patch-18
into rocketfuel at canonical.com/launchpad--devel--0."
It then signs the email with my GPG key and mails it.
Some time later, once PQM has merged the code and successfully run all
the launchpad tests, an email will go out to me, and to a pqm-commits
mailing list, saying that the merge was successful. If it was
unsuccessful, I get an email with the error output.
An irc robot listens to the pqm-commits mailing list, and announces new
landings to the rocketfuel Trunk on irc.
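The r=someone check and the merge request from step 8 can be sketched roughly like this. It is not the real submit-merge script or PQM code: both function names are hypothetical, the request wording simply follows the paraphrase given in step 8, and GPG signing and mailing are left out.

```python
import re


def has_reviewer(message: str) -> bool:
    """True if the commit message names a reviewer as r=<name>."""
    return re.search(r"\br=\w+", message) is not None


def compose_merge_request(archive: str, branch: str,
                          patch_level: int, target: str) -> str:
    """Build the body of a merge request (hypothetical wording)."""
    # The source revision is archive/branch plus the branch's patch level.
    source = f"{archive}/{branch}--patch-{patch_level}"
    return f"pqm, please merge {source} into {target}."
```

A request whose message lacks r=someone would be turned away before any merging or testing is attempted.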
== Naming branches ==
The Launchpad team is distributed around the world. To cope with this,
and also to get our community of users involved in the development of
the software, Launchpad development emphasises writing specifications
and proposals, and implementing features based on these proposals.
You can read all the launchpad proposals on the launchpad development wiki.
So, we usually name branches after the specification that is being
implemented on that branch. The branch is named near the top of the
specification, so someone reading the specification who has access to
the source code can see what's happening with the implementation.
Branches are also often named after bugs. For example,
The use of '--' in branch names, and the '--0' thing at the end is
occasionally useful, but more of a hangover from the 'tla' system that
bazaar is based on. This strict branch naming format is not being
carried over into bazaar 2.
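To make the strict Bazaar 1.x naming format concrete, here is a small sketch that splits a full branch name into its parts. The field names (category, branch, version) and the parse_branch helper are labels chosen for this example, not part of baz itself, and the accepted shapes are only the ones that appear in this email.

```python
from typing import NamedTuple, Optional


class BranchName(NamedTuple):
    archive: str
    category: str
    branch: str
    version: str
    patch: Optional[int]


def parse_branch(name: str) -> BranchName:
    """Split 'archive/category--branch--version[--patch-N]' into fields."""
    archive, sep, rest = name.partition("/")
    if not sep:
        raise ValueError("missing archive name: %r" % name)
    parts = rest.split("--")
    if len(parts) == 3:
        patch = None
    elif len(parts) == 4 and parts[3].startswith("patch-"):
        # Raises ValueError if the patch level is not a number.
        patch = int(parts[3][len("patch-"):])
    else:
        raise ValueError("unexpected branch name format: %r" % name)
    category, branch, version = parts[:3]
    return BranchName(archive, category, branch, version, patch)
```

For example, parse_branch("rocketfuel at canonical.com/launchpad--devel--0") gives the archive "rocketfuel at canonical.com", the category "launchpad", the branch "devel" and the version "0", with no patch level.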
== External contributors ==
The source code to Launchpad is not available at this time. We intend
to make it open source at some point in the future, but I'm not sure
when that will be.
Let's consider what would happen if we decided to make the Launchpad
code fully open source tomorrow.
Someone from outside of Canonical could get a copy of the main launchpad
"rocketfuel" branch, make their own branch by branching from the
rocketfuel branch, do a bunch of work, mirror it to their own website,
and email a Canonical launchpad developer to ask that it be reviewed, or
merged into that launchpad developer's branch.
This way, even though an outside contributor doesn't have rights with
PQM, they could still make fine-grained commits, merge from a variety of
places, and participate at the same level as someone employed by Canonical.