[Python-Dev] We should be using a tool for code reviews

Thu Sep 30 18:53:00 CEST 2010

On Thu, Sep 30, 2010 at 9:33 AM, Barry Warsaw <barry at python.org> wrote:
> On Sep 30, 2010, at 10:47 AM, Jesse Noller wrote:
>
>>Not to mention; there's a lot to be learned from doing them on both
>>sides. At work, I learn about chunks of code I might not have
>>otherwise known about or approaches to a problem I'd never considered.
>>I sort of drank the kool-aid.
>
> Tools aside, I completely agree.
>
> Many projects that I contribute to have policies such as "nothing lands
> without reviewer approval".  For some that means one reviewer must approve it,
> for others two +1s and no -1s, or a coding approval and a ui approval, etc.
>
> When I was the review team lead on Launchpad, I had a goal that every
> developer would also eventually be a reviewer.  We started with a small number
> of experienced developers, then ran a mentor program to train new reviewers
> (who we called "mentats").  A mentat approval was not enough to land a branch,
> but the mentor could basically say "yes, i agree with the review" and it would
> land.  Eventually, by mutual consent a mentat would graduate to full
> reviewership (and hopefully be a mentor to new reviewers).
>
> This was hugely successful among many dimensions.  It got everyone on the same
> page as to coding standards, it equalized the playing field, it got everyone
> to think collaboratively as a team, folks learned about parts of the system
> they didn't have day-to-day intimate knowledge about, and it got changes
> landed much more quickly.
>
> Here are a few things we learned along the way.  Their applicability to Python
> will vary of course, and we need to find what works for us.
>
> * Keep branches *small*.  We had a limit of 800 lines of diff, with special
>  explicit permission from the person reviewing your change to exceed.  800
>  lines is about the maximum that a person can review in a reasonable amount
>  of time without losing focus.
>
> * The *goal* was 5 minutes review, but the reality was that most reviews took
>  about 15-20 minutes.  If it's going longer, you weren't doing it right.
>  This meant that there was a level of trust between the reviewer and the dev,
>  so that the basic design had been previously discussed and agreed upon (we
>  mandated pre-implementation chats), etc.  A reviewer was there to sanity
>  check the implementation, watch for obvious mistakes, ensure test coverage
>  for the new code, etc.  If you were questioning the basic design, you
>  weren't doing a code review.  It was okay to reject a change quickly if you
>  found fatal problems.
>
> * The primary purpose of a code review was to learn and teach, and in a sense,
>  the measurable increase in quality was a side-effect.  What I mean by that
>  is that the review process cannot catch all bugs.  It can reduce them, but
>  it's much more valuable to share expertise on how to do things.  E.g. "Did
>  you know that if X happens, you won't be decref'ing Y?  We like to use goto
>  statements to ensure that all objects are properly refcounted even in the
>  case of exceptional conditions."  That's a teaching moment that happens to
>  improve quality.
>
> * Reviews are conversations, and it's a two way street.  Many times a dev
>  pushed back on one of my suggestions, and we'd have weekly reviewer meetings
>  to hash out coding standards differences.  E.g. Do you check for empty
>  sequences with "if len(foo) == 0" or "if not foo"?  The team would make
>  those decisions and you'd live by them even if you didn't agree.  It's also
>  critical to document those decisions, and a wiki page style guide works very
>  well (especially when you can point to PEP 8 as your basic style guide for
>  example).
>
> * Reviews are collaborations.  You're both there to get the code landed so
>  work together, not at odds.  Try to reach consensus, and don't be afraid to
>  compromise.  All code is ultimately owned by everyone and you never know who
>  will have to read it 2 years from now, so keep things simple, clear, and
>  well commented.  These are all aesthetics that a reviewer can help with.
>
> * As a reviewer ASK QUESTIONS.  The best reviews were the ones that asked lots
>  of questions, such as "have you thought about the race conditions this might
>  introduce?" or "what happens when foo is None?"  A reviewer doesn't
>  necessarily have to guess or work out every detail.  If you don't understand
>  something, ask a question and move on.  Let the coder answer it to your
>  satisfaction.  As a reviewer, once all your questions are answered, you know
>  you can approve the change.
>
> * Keep reviews fast, easy, and fun.  I think this is especially true for
>  Python, where we're all volunteers.  Keeping it fast, easy, and fun greatly
>  improves the odds that code will be reviewed for the good of the project.
>
> * Have a dispute resolution process.  If a reviewer and coder can't agree,
>  appeal to a higher authority.  As review team leader, I did occasionally
>  have to break ties.
>
> * Record the reviewer in the commit messages.  We had highly structured commit
>  messages that included the reviewer name, e.g.
>
>  % commit -m"[r=barry] Bug 12345; fix bad frobnification in Foo.bar()"
>
>  thus recording in the revision history both the coder and the reviewer, so
>  that we could always ask someone about the change.
>
> * Don't let changes get stale.  We had a goal that changes would go from
>  ready-to-code (i.e. design and implementation strategy have already been
>  worked out) to landed in 2 days.  The longer a change goes before landing,
>  the more stale it gets, the more conflicts you can have, and the less
>  relevant the code becomes.
>
> I know this sounds like a lot of process, but it really was fairly lightweight
> in practice.  And that's the most important!  Keep things fast, easy, and fun
> and it'll get done.  Also, these ideas evolved after 3 years of
> experimentation with various approaches, so let it take some time to evolve.
> And don't be so married to process that you're afraid to ditch steps that are
> wasteful and don't contribute value to the project.
>
> Certainly some of our techniques won't be relevant for Python.  For example,
> we assigned people to do nothing but reviews for one day out of the week (we
> call it "on-call reviewers").  This worked for us because team velocity was
> much more important than individual velocity.  Even though at first blush, it
> seemed like you personally were going to be 20% less productive, team
> productivity skyrocketed because changes were landing much faster, with much
> less waste built in.  This probably won't work for Python where our
> involvement is not governed by paycheck, but by the whims of our real jobs and
> life commitments.

Extremely well put. Could this kind of process be put in place for the
code sprints Jesse's interested in? Seems like an ideal testbed.

Geremy Condra