[Python-Dev] Mercurial migration: progress report (PEP 385)

Sat Jul 4 05:20:04 CEST 2009

Dirkjan Ochtman writes:
 > On Fri, Jul 3, 2009 at 15:29, Stephen J. Turnbull<stephen at xemacs.org> wrote:
 > > I'll have to try them again now that 1.3 is out, but I found Mercurial
 > > named branches fundamentally unsuited to release management.
 > 
 > Can you explain why, please? It's not clear from what you say
 > below.

Well, the main problem I had was one that you say has improved: the
various issues of confusion due to the presence of multiple active
heads in a single repository.

IME, Mercurial strongly encourages a non-branching style.  Although I
can't fully explain in concrete terms what makes me feel that way,
it's certainly consistent with your own inclination to advise "subset
branches".  Part of it comes from the fact that you can't have a
single revision on two branches.  I would really like the node of a
release branch to be on both the branch and the mainline so that the
tag appears in the history of both, but that's not possible.

Another issue (which you say has improved) is the handling of multiple
heads.  With unnamed heads, it's just too painful to leave them
hanging around, so you merge them immediately (in fact, all XEmacs
committers currently active use either "pull -u" or the fetch
extension).

With named branches, the additional heads do tend to hang around.  I
found that named branches tended to get inadvertently pushed, and
worse, they'd end up being the tip, which Mercurial treats specially.
In one case a completely private branch inadvertently got pushed into
my "pristine" clone of the public repo, and from there into the public
repo, where (since it had been the tip in my private workspace) it
ended up as the tip in public.  Embarrassing, to say the least.
Fortunately, the branch was more or less ready to be pushed anyway, but
one of my colleagues ended up working on that branch without realizing
that he wasn't on (his) mainline any more, and wondered why some
previously done work suddenly disappeared.  A good time was had by
nobody involved.

I don't know if that has been fixed in hg; the experience was painful
enough that my workflow adjusted immediately.

 > > Ditto named branches.  The problem is that (unless the internal
 > > implementation has changed very recently) a Mercurial revision can be
 > > on exactly one named branch (or on the trunk).
 > 
 > That's still true.
 > 
 > > Which defeats the purpose of having named branches, really.  (I mean
 > > the version control purpose; obviously it still can save disk space.)
 > 
 > Why does it defeat the purpose? What, in your opinion, is the
 > purpose?

I use named branches to collect a sequence of revisions as a named
object, for viewing and manipulation, as differentiated from some
other sequence, for *various* values of "some other sequence".
Suppose you have a branch A off the trunk, and you then (several
revisions down the line of development) branch B off A.  Now A
meanders off and runs into "not ready for prime time" problems, while
B just swims along.  The problem is that you can't easily find the
history of B relative to the trunk, because much of its history (since
forking from the *trunk*) is labeled A.
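
To make the A/B scenario concrete, here is a minimal sketch of a toy
history in which every revision carries exactly one branch label.  It
is not Mercurial's data model or API; the Rev class, the HISTORY list,
and both helper functions are hypothetical names invented for the
illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rev:
    """Toy revision: one parent, and exactly one branch label."""
    rev_id: int
    parent: Optional[int]
    branch: str


# Hypothetical history: A forks from trunk at 1, B forks from A at 3.
HISTORY = [
    Rev(0, None, "trunk"),
    Rev(1, 0, "trunk"),
    Rev(2, 1, "A"),
    Rev(3, 2, "A"),
    Rev(4, 3, "B"),
    Rev(5, 4, "B"),
    Rev(6, 3, "A"),  # A keeps going after B forked off
]


def by_label(history, branch):
    """Revisions whose single label equals `branch`."""
    return [r.rev_id for r in history if r.branch == branch]


def since_trunk(history, head):
    """Walk parents from `head` until a trunk-labeled revision is hit."""
    by_id = {r.rev_id: r for r in history}
    path = []
    node = by_id[head]
    while node.branch != "trunk":
        path.append(node.rev_id)
        node = by_id[node.parent]
    return list(reversed(path))


label_view = by_label(HISTORY, "B")       # only the revisions labeled B
ancestry_view = since_trunk(HISTORY, 5)   # everything B did since trunk
```

In this toy model the label view of B contains only revisions 4 and 5,
while the ancestry view also includes 2 and 3, which are labeled A.
That gap is the part of B's history I can't get at by name.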

 > > Unless you're really short on space, though, that's not a big deal.
 > > What would be more important to me (not that I matter for the purpose
 > > of Python, but in XEmacs -- also a Mercurial shop -- I do :-) would be
 > > the other way around: pulling an external branch into a named branch.
 > > I have a feeling that working with such a repository with others would
 > > be a little difficult.
 > 
 > Can you give an example?

No, I haven't tried it.  What concerns me is that I suspect that the
branch name will end up as part of the revision's internal identifier,
and that means that if you and I separately create the same named
branch we have to choose exactly the same name or the branches won't
be recognized as the same, resulting in the mother of all spurious
merge conflicts.
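
As a rough illustration of that suspicion only: if the branch name
were hashed into the revision identifier, two otherwise identical
commits would get different identifiers whenever the names differ.
The sketch below is a toy model, not Mercurial's actual changeset
format; toy_node_id and the branch names are hypothetical.

```python
import hashlib


def toy_node_id(parent_id: str, content: str, branch: str) -> str:
    """Toy identifier: SHA-1 over parent, branch name, and content.

    Assumption for illustration: the branch name is part of the
    hashed payload.  This is not Mercurial's real on-disk format.
    """
    payload = "\n".join([parent_id, branch, content]).encode("utf-8")
    return hashlib.sha1(payload).hexdigest()


parent = "0" * 40
change = "fix typo in README"

mine = toy_node_id(parent, change, "release-1.0")
yours = toy_node_id(parent, change, "release_1.0")  # different spelling
again = toy_node_id(parent, change, "release-1.0")  # same spelling

print(mine == yours)  # identifiers differ when only the name differs
print(mine == again)  # identical inputs give the same identifier
```

Under that assumption, the same change committed under two spellings
of the branch name would look like two unrelated revisions.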

 > > As others (MvL, I think) have commented, this isn't really relevant to
 > > Python which generally has four mainlines going at once.  I don't see
 > > why the requirements are going to change with the shift to hg, and I
 > > see no reason why hg won't handle the existing workflow just fine.
 > 
 > It will handle it, for sure, but I think it would all go easier if we
 > could work with stricter subset branches (and leave the effective
 > cherrypicking for the occasional problem).

Sure, but what do you propose?  That we nuke Python 3.1, 3.2, and 2.6?
They're all pretty divergent from 2.7 by now, as well as from each
other.  Not to mention the "security branches" that are still around.
And individual developers are for sure going to do interesting things
in their private workspaces.

I see that Georg Brandl and Martin von Loewis seem to be accommodating
your viewpoint, but I don't get the impression that either you (as the
hg migration proponent) or they (as core committers) realize how far
apart your assumptions are.  You are talking about the Mercurial
project, which has *one* line of development.  There are many such
projects; the ones I'm most familiar with are XEmacs and Scheme48,
which have adopted a subset branch approach different from the
Mercurial project's (XEmacs is modeled on Scheme48).  Works well enough (when
I'm wearing my release manager hat; it's a little constraining as a
developer used to the insane flexibility of git).

But Python is *not* such a project.  The problem is not coordinating
concurrent development of a close-knit group of committers all working
on the same mainline.  In Python, there are *four* mainlines with
rather different purposes, and a diverse group of developers: some
work on only one line of development, others work on several, and
a few accept the role of coordinating across them.