[IPython-dev] Pull request workflow...

Fernando Perez fperez.net at gmail.com
Sun Oct 10 17:01:03 EDT 2010

Hi all,

We're starting to get into a very good grove with github-based pull
requests.  With a cleanly constructed request, I can pretty much
review it, merge it and push with about *1 minute* total worth of
tool-based overhead.  That is, the time I need now for a review is
pretty much whatever the actual code requires in terms of
reading/thinking/testing/discussing, plus a negligible amount to apply
the merge, close the ticket and push.  Emphasis on 'cleanly
constructed'.  If the review isn't cleanly constructed, the overhead
balloons to an unbounded amount.  I recently had to review some code
on the numpy datarray project where I spent *multiple hours* wading
through a complex review.  In that case I did it because the original
author had already kindly waited for too long on me, so I decided to
take the hit and not impose further work on his part, but what should
have taken at most 30 minutes burned multiple hours this week.

So from this experience, I'd like to summarize what I've learned so
far, so that we can all hit a good stride in terms of workflow, that
balances the burden on the person offering the contribution and the
time it takes to accept it.  My impression is that if you want to
propose a pull request that will go down very easily and will be
merged with minimal pain on all sides, you want to:

- keep the work on your branch completely confined to one specific
topic, bugfix or feature implementation.  Git branches are cheap and
easy to make, do *not* mix in one branch more than one topic.  If you
do, you force the reviewer to disentangle unrelated functionality
scattered across multiple commits.  This doesn't mean that a branch
can't touch multiple files or have many commits, simply that all the
work in a branch should be related to one specific 'task', be it
refactoring, cleanup, bugfix, feature implementation, whatever.  But:
'one task, one branch'.

- name your branches sensibly: the merge commit message will have the
name of the branch by default.  It's better to read 'merge branch
john-print-support' or 'merge branch john-fix-gh-123' than 'merge
branch john-master'.

- *Never* merge back from trunk into your feature branch.  When you
merge from trunk (sometimes repeatedly), it makes the history graph
extremely, and unnecessarily, complicated.  For example for this pull
request (and I'm *not* picking on Thomas, he did a great job, this is
just part of the learning process for everybody):


his repeated merge commits simply created a lot of unnecessary noise
and complexity in the graph.  I was able to rebase his branch on top
of trunk and then merge it with a clean graph, but that step would
have been saved by Thomas not merging from trunk in the first place.

Obviously you want to make sure that your work will merge into trunk
cleanly before submitting it, but you can test that in your local repo
by simply merging your work into your personal master (or any
throw-away that tracks trunk) and testing.  But don't publish in your
request your merges *from* trunk.  Since your branch is meant to go
back *into* trunk, doing that will just create a criss-cross mess.

If you absolutely need to merge something from trunk (because it has
fixes you need for your own work), then rebase on top of trunk before
making your pull request, so that your branch applies cleanly on top
of trunk as a self-contained unit without criss-crossing.

For reviewers:

- edit the merge message before you push, by adding a short
explanation of what the branch did, along with a final 'Closes
gh-XXX.' string, so the pull request is auto-closed and the ticket and
closing commit get automatically linked.

Feedback on these notes is welcome, and even more welcome will be when
someone updates our dev documents with a clean pull request that
include our collective wisdom, guiding by example :)



More information about the IPython-dev mailing list