[Python-ideas] Using only patches for pulling changes in hg.python.org

Thu Jul 8 01:22:00 CEST 2010

Georg Brandl writes:
 > Am 06.07.2010 07:16, schrieb Stephen J. Turnbull:
 > > Georg Brandl writes:
 > >  > Am 04.07.2010 17:26, schrieb Antoine Pitrou:
 > >  > > On Sun, 04 Jul 2010 15:46:53 +0200
 > >  > > Dirkjan Ochtman <dirkjan at ochtman.nl> wrote:
 > >  > >> 
 > >  > >> Fourth, one-patch-per-issue is too restrictive. Small
 > >  > >> commits are useful because they're way easier to
 > >  > >> review. Concatenate several small commits leading up to a
 > >  > >> single issue fix into a single patch and it gets much
 > >  > >> harder to read.
 > >  > > 
 > >  > > I don't agree with that. The commits obviously won't be
 > >  > > independent because they will be motivated by each other (or
 > >  > > even dependent on each other), therefore you have to
 > >  > > remember what the other commits do when reviewing one of
 > >  > > them. What's more, when reading "hg log" months or years
 > >  > > later, it is hard to make sense of a single commit because
 > >  > > you don't really know what issue it was meant to contribute
 > >  > > to fix.
 > >  > > 
 > >  > > I know that's how Mercurial devs do things, but I don't
 > >  > > really like it.
 > >  > 
 > >  > I think the best of both worlds is to encourage contributors
 > >  > to send more complicated patches in a series of easy-to-review
 > >  > steps, but when committing to Python, make one changeset out
 > >  > of them.
 > > 
 > > I don't see how this addresses Antoine's problem of connecting
 > > commits to issues at all.
 > 
 > I wasn't addressing Antoine's original problem, rather his reply to
 > Dirkjan.

Huh?  Are you referring to something other than the part of his post
that you quoted?  Antoine writes "you have to remember what the other
commits do when reviewing them" and "it is hard to make sense of a
single commit [in a series] because you don't know what issue it was
meant to fix".

I admit I'm not really sure what his issue is.  It seems to me that
connecting commits is what a feature branch (in conjunction with
rebase) is designed to achieve.  If you don't like rebase, you can
either work fast enough that your whole sequence is done before the
mainline moves on significantly, or you can refrain from updating
until done (and have a potentially messy merge), or you can use MQ
(which is really just a way of rebasing without the shame ;-).

I'll have to test it, but AFAIK in all of the above strategies, as
long as you don't push to the public repo until done, the logs of the
commits on the feature branch should all be adjacent in the natural
order of hg log.  That seems to me to be the optimal strategy, in
combination with reading long parts of history in a graphical DAG
browser.

Of course, that assumes that random pieces of the fix aren't dispersed
among commits.  In that case the logs will still be hard to read and
understand, as will the diffs.  People who like to commit early and
often should indeed be encouraged to edit their feature branches to
make each individual commit make sense to reviewers.  (MQ helps to
address this, as does Bazaar's loom feature or StGit.)  Feature
branches don't automatically organize commits in an intelligible way,
that requires an intelligence driving the process.  But they do make
it possible.

Once you have feature branches, then there's a question of the
external issue.  Here reviewers should pay attention to the log
message, and make sure it describes the problem well, and includes
cross references to any documentation (tracker issue or ML thread).
But that's no different from the current process.

I think that in many cases the process of coming up with coherent
changesets that are reviewable will indeed result in a single commit
to address the whole issue.  But there will also be multicommit
patterns that make sense, such as "refactor API and update current
clients -> use new feature in a few places".  The thing to remember is
that DVCSes not only record a frozen view of history accurately, but
can also be used to flexibly reorganize the presentation of that
history "as it should have happened".

I think of these workflows as opportunities to *improve* the quality
of information presented by the history.  But they aren't mandated by
adopting hg.  Contributors and reviewers who are satisfied with the
current process should continue to refine a set of changes to a single
commit.  hg is certainly flexible enough to allow that, with several
different workflows.  And Antoine's worries (AIUI) are not unfounded.
Eg, we should not allow people to be lazy and submit a feature branch
with changes randomly assigned to different commits and log messages
like "Lunch time! commit progress to date."  But that's a social
problem; I think that conventions will quickly evolve *from* the one
patch per issue workflow to a *well-organized* feature branch per
issue (as appropriate) because python-dev reviewers will demand it.