[Python-ideas] Using only patches for pulling changes in hg.python.org
Dirkjan Ochtman
dirkjan at ochtman.nl
Sun Jul 4 15:46:53 CEST 2010
On 2010-07-04 12:47, Tarek Ziadé wrote:
> Once CPython itself is in mercurial, we will probably have the same
> problem when people are pulling contributions. If you use a "hg pull"
> command it will get all commits from the third party, even if some of
> those commits are unnecessary noise, like
> "I have removed this file. OOps I am putting the file back in..".
>
> And it's not so easy to edit the incoming changesets once they are
> committed. It's not easy either to use "hg incoming" because most of
> the time, the third party clone has many unrelated changes. I think we
> should work with queues and patches everywhere to solve this.
>
> The idea is to have contributors handling hg patches in
> bugs.python.org, one patch per feature. They can use mq for that, and
> the benefit will be to have a very clean history in all repositories.
> A good thing about hg patches is that unlike simple diffs, the
> contributor name and comment appear in the final changelog.
Hmm, I don't think I agree with what you're saying.
First, a changeset is a changeset is a changeset. Whether you exchange
them as patches or in some other way (by pulling or pushing or whatever)
shouldn't really matter. This is one of the things a DVCS is good at: you
can move csets around between clones in many ways, and all clones are
created equal.
Second, a noisy history is never good. So yes, pulling some kind of
messy history and pushing it to a central repo as-is is not a good idea.
People should polish their changesets so that each changeset can stand
on its own. That is, somewhere between the messy history of actual
development and the push to a central repo, the history should be cleaned up.
Ideally, the original author should do that, but if he's not in a
position to do so, the committer should do it.
Third, if the result of cleaning up is a single cset, it should probably
be rebased before getting pushed to a central repo. The same goes for two
or three csets. On the other hand, if it's 10 csets, actually doing an
explicit merge makes sense. The idea is not to clutter up the history
with a merge every other cset, but if the merge is hard/non-trivial it
can make sense to leave it explicit.
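
The rule of thumb above could be written down roughly as follows. This is
only a sketch of one reading of it: the function name and the two cutoffs
(small_series=3, large_series=10) are assumptions lifted from the examples
in this message, and what to do in between is a judgment call that is
approximated here by asking whether the merge is non-trivial.

```python
def integration_strategy(num_csets, merge_is_nontrivial=False,
                         small_series=3, large_series=10):
    """Suggest how a cleaned-up series should land in a central repo.

    small_series and large_series are assumed cutoffs taken from the
    examples in the message (1-3 csets: rebase, 10 csets: merge).
    """
    if num_csets < 1:
        raise ValueError("need at least one changeset")
    if num_csets <= small_series:
        # A handful of csets: rebase so history is not cluttered with merges.
        return "rebase"
    if num_csets >= large_series:
        # A long series: an explicit merge makes sense.
        return "merge"
    # In between (assumption): keep the merge explicit only if it was hard.
    return "merge" if merge_is_nontrivial else "rebase"
```

For example, integration_strategy(2) suggests a rebase, while
integration_strategy(10) suggests an explicit merge.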
Fourth, one-patch-per-issue is too restrictive. Small commits are useful
because they're way easier to review. Concatenate several small commits
leading up to a single issue fix into a single patch and it gets much
harder to read. Easy reviews are important, because a lot of valuable
time is spent reviewing. The simple example is a chain like
refactor-refactor-fix (which is IME quite common). Ideally each stage
keeps the test suite passing and is internally consistent, while moving
towards a common goal (fixing the issue).
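
A reviewer could check the "each stage keeps the test suite passing"
property with something like the sketch below. Everything in it is
hypothetical: first_broken_stage is not part of any tool, and passes_tests
stands in for whatever actually checks out a changeset and runs the test
suite.

```python
def first_broken_stage(stages, passes_tests):
    """Return (index, stage) for the first stage whose tests fail.

    stages is an ordered series of changeset labels; passes_tests is a
    caller-supplied callable (hypothetical) that returns True when the
    test suite passes at that stage. Returns None if every stage passes.
    """
    for index, stage in enumerate(stages):
        if not passes_tests(stage):
            return index, stage
    return None


# Stand-in results for a refactor-refactor-fix chain (made up for the demo).
chain = ["refactor-1", "refactor-2", "fix"]
results = {"refactor-1": True, "refactor-2": True, "fix": True}
broken = first_broken_stage(chain, results.get)
```

With the made-up results above, broken is None, meaning every stage in the
chain can be reviewed and applied on its own.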
So, I find your proposed policy somewhat vague and also not that
attractive. Cleaning up the history is certainly a good thing, but I
don't think we have to mandate a way for things to get into the repo.
Mandating the use of issues as a reference for each fix or enhancement
could be useful, but seems unnecessary.
Cheers,
Dirkjan