[Python-Dev] XXX do we need a new policy?
glyph at divmod.com
glyph at divmod.com
Tue Nov 4 11:43:13 CET 2008
On 3 Nov, 11:44 pm, guido at python.org wrote:
>On Mon, Nov 3, 2008 at 3:39 PM, Benjamin Peterson
><musiccomposition at gmail.com> wrote:
>>Grepping through Python's sources tells me that we have over 2,000
>>So, I propose that we adopt a policy similar to Twisted's: All "XXX"
>>comments must have an issue in the tracker and an accompanying link
>>with the source code.
>That seems excessively draconian. (...) personally, I
>don't see XXX comments necessarily as "to be resolved" -- merely as
>flags for someone perusing the code looking to change it or digging
>for the cause of some problem to pay special attention.
For what it's worth, in my experience this is one of the least draconian
areas of Twisted policy ;-). I'll describe our reasoning for the policy
and the effect that it's had.
If a developer working on a feature is inclined to type
# somebody, do something to something else
then the intent is typically to say "this is a confusing snippet of
code, it bears explanation beyond what is clearly readable in the code
itself". Documenting the "why" and "what" happens in the docstring, so
we only have comments in areas of code where the "how" or the "who"
(i.e. "only JP and Glyph understand this") is difficult to understand.
In Twisted, this is almost always some wicked busted operating system
API, like how you get undocumented errnos out of send(), recv() or
select() on some platform.
However, if they're inclined to type
# TODO: somebody should really do more stuff here
most commonly, what this means is "before I merge this feature, I really
want to fix this potential issue so we don't need to deal with it in the
future". The policy applies to "XXX" because in my personal experience
it's always used as a synonym for "TODO".
My subjective impression is that in 75% of the cases where a comment
like this is caught in a code review, the original author of the branch
had actually forgotten they had added a comment like this and will
eagerly go and fix it, to avoid dealing with the issues that it might
raise in a release in the future. I can't recall a case where an author
(myself included) was not happy to be alerted to a comment like this.
In a significant minority of cases, although the author has still
usually forgotten (if they had remembered they would have filed a ticket
preemptively, after all), the task in question is actually too big or
too esoteric to bother with in the same branch, so it gets deferred.
Filing a ticket is the natural extension of this, so that we don't lose
track of the task in question.
The consequence of this policy has been higher quality bugfixes (because
contributors don't forget what they're doing), and increased visibility
into problems in our codebase(s) (since we use this on Divmod projects
too) which directly translates into less duplicate grumbling about the
same issue over and over again in pointless comments.
Also, I hear a lot of kvetching about Twisted policy, especially review
latency (I hear you guys are familiar with that problem as well), but as
I recall nobody has ever complained that this was too much work. When
we instituted the policy, as with most of our policies, there is a
grandfather clause: we didn't go back and trawl through all of the
existing XXX and TODO comments, we just made it so you couldn't add
_new_ XXX and TODO comments without filing and linking to a ticket.
I haven't seen "XXX" used as a way to say "pay special attention"
because our use of comments is extremely spare. If there's a comment at
all that means the code is sufficiently wacky that it warrants special
attention. Also, we don't write much code in C, so the preponderance of
"sufficiently wacky" code is small ;).
Looking at the Python codebase, though, (specifically the 2.6 release),
there are definitely "XXX" comments which do not appear to be actionable
tasks by their authors. Guido's certainly right about that: not all XXX
comments in Python are "to be resolved". But, a subjective assessment
of the first couple of pages of such comments indicates that a majority
of them are. A lot of them are cryptic, too. One of the benefits of
review scrutiny of comments is a chance for the reviewer to pipe up and
say "This comment is incomprehensible. Can you explain it (on a ticket)
in a way that someone else might fix it, rather than a note to
yourself?" (with the implication being "a note to yourself that nobody,
including you six months later, is ever going to understand").
I think that a Twisted reviewer encountering a comment like /* XXX From
here until type is allocated, "return NULL" leaks bases! */ would
probably say something like "XXX should probably be something more
informative, like 'WARNING:' or 'CAREFUL:'". Clearly if XXX is being
used in this way there's no point in filing a ticket, and the comment
here is perfectly clear; the policy is not intended to cover this case.
In fact it might be good to have a whole special comment convention for
notes about the constantly horrific tightrope-walk over lava that is
memory-management and object lifecycles in C. If anything, "XXX" is a
bit too easy to glaze over in a codebase with >2000 "XXX" comments. But
I don't write this type of code too often, so maybe to a more fluent
speaker of C "/* XXX memory junk */" is an idiomatic thing to see.
Personally I'd recommend that Python adopt such a policy, but mainly for
your own sanity. It's not something that I think would make a huge
difference to anyone outside the core development group, except perhaps
to generate smaller, easier issues for new contributors to work on for
practice. Certainly not something I'm going to push for - my intent
here is just to share our experience.
More information about the Python-Dev