[Python-Dev] XXX do we need a new policy?

glyph at divmod.com glyph at divmod.com
Tue Nov 4 11:43:13 CET 2008

On 3 Nov, 11:44 pm, guido at python.org wrote:
>On Mon, Nov 3, 2008 at 3:39 PM, Benjamin Peterson
><musiccomposition at gmail.com> wrote:
>>Grepping through Python's sources tells me that we have over 2,000
>>"XXX" comments.

>>So, I propose that we adopt a policy similar to Twisted's: All "XXX"
>>comments must have an issue in the tracker and an accompanying link
>>with the source code.

>That seems excessively draconian. (...) personally, I
>don't see XXX comments necessarily as "to be resolved" -- merely as
>flags for someone perusing the code looking to change it or digging
>for the cause of some problem to pay special attention.

For what it's worth, in my experience this is one of the least draconian 
areas of Twisted policy ;-).  I'll describe our reasoning for the policy 
and the effect that it's had.

If a developer working on a feature is inclined to type

    # somebody, do something to something else

then the intent is typically to say "this is a confusing snippet of 
code, it bears explanation beyond what is clearly readable in the code 
itself".  Documenting the "why" and "what" happens in the docstring, so 
we only have comments in areas of code where the "how" or the "who" 
(i.e. "only JP and Glyph understand this") is difficult to understand. 
In Twisted, this is almost always some wicked busted operating system 
API, like how you get undocumented errnos out of send(), recv() or 
select() on some platform.

However, if they're inclined to type

    # TODO: somebody should really do more stuff here

most commonly, what this means is "before I merge this feature, I really 
want to fix this potential issue so we don't need to deal with it in the 
future".  The policy applies to "XXX" because in my personal experience 
it's always used as a synonym for "TODO".

My subjective impression is that in 75% of the cases where a comment 
like this is caught in a code review, the original author of the branch 
had actually forgotten they had added a comment like this and will 
eagerly go and fix it, to avoid dealing with the issues that it might 
raise in a release in the future.  I can't recall a case where an author 
(myself included) was not happy to be alerted to a comment like this.

In a significant minority of cases, although the author has still 
usually forgotten (if they had remembered they would have filed a ticket 
preemptively, after all), the task in question is actually too big or 
too esoteric to bother with in the same branch, so it gets deferred. 
Filing a ticket is the natural extension of this, so that we don't lose 
track of the task in question.

The consequence of this policy has been higher quality bugfixes (because 
contributors don't forget what they're doing), and increased visibility 
into problems in our codebase(s) (since we use this on Divmod projects 
too) which directly translates into less duplicate grumbling about the 
same issue over and over again in pointless comments.

Also, I hear a lot of kvetching about Twisted policy, especially review 
latency (I hear you guys are familiar with that problem as well), but as 
I recall nobody has ever complained that this was too much work.  When 
we instituted the policy, as with most of our policies, there is a 
grandfather clause: we didn't go back and trawl through all of the 
existing XXX and TODO comments, we just made it so you couldn't add 
_new_ XXX and TODO comments without filing and linking to a ticket.

I haven't seen "XXX" used as a way to say "pay special attention" 
because our use of comments is extremely spare.  If there's a comment at 
all that means the code is sufficiently wacky that it warrants special 
attention.  Also, we don't write much code in C, so the preponderance of 
"sufficiently wacky" code is small ;).

Looking at the Python codebase, though, (specifically the 2.6 release), 
there are definitely "XXX" comments which do not appear to be actionable 
tasks by their authors.  Guido's certainly right about that: not all XXX 
comments in Python are "to be resolved".  But, a subjective assessment 
of the first couple of pages of such comments indicates that a majority 
of them are.  A lot of them are cryptic, too.  One of the benefits of 
review scrutiny of comments is a chance for the reviewer to pipe up and 
say "This comment is incomprehensible.  Can you explain it (on a ticket) 
in a way that someone else might fix it, rather than a note to 
yourself?" (with the implication being "a note to yourself that nobody, 
including you six months later, is ever going to understand").

I think that a Twisted reviewer encountering a comment like /* XXX From 
here until type is allocated, "return NULL" leaks bases! */ would 
probably say something like "XXX should probably be something more 
informative, like 'WARNING:' or 'CAREFUL:'".  Clearly if XXX is being 
used in this way there's no point in filing a ticket, and the comment 
here is perfectly clear; the policy is not intended to cover this case.

In fact it might be good to have a whole special comment convention for 
notes about the constantly horrific tightrope-walk over lava that is 
memory-management and object lifecycles in C.  If anything, "XXX" is a 
bit too easy to glaze over in a codebase with >2000 "XXX" comments.  But 
I don't write this type of code too often, so maybe to a more fluent 
speaker of C "/* XXX memory junk */" is an idiomatic thing to see.

Personally I'd recommend that Python adopt such a policy, but mainly for 
your own sanity.  It's not something that I think would make a huge 
difference to anyone outside the core development group, except perhaps 
to generate smaller, easier issues for new contributors to work on for 
practice.  Certainly not something I'm going to push for - my intent 
here is just to share our experience.

More information about the Python-Dev mailing list