Please read/comment/vote. This circulated as a pre-PEP proposal
submitted to c.l.py on August 10, but has changed quite a bit since
then. I'm reposting this since it is now "Open (under consideration)"
at http://www.python.org/peps/pep-0350.html.
Thanks!
--
Micah Elliott <mde at tracos.org>
PEP: 350
Title: Codetags
Version: $Revision: 1.2 $
Last-Modified: $Date: 2005/09/26 19:56:53 $
Author: Micah Elliott <mde at tracos.org>
Status: Draft
Type: Informational
Content-Type: text/x-rst
Created: 27-Jun-2005
Post-History: 10-Aug-2005, 26-Sep-2005
Abstract
========
This informational PEP aims to provide guidelines for consistent use
of *codetags*, which would enable the construction of standard
utilities to take advantage of the codetag information, as well as
making Python code more uniform across projects. Codetags also
represent a very lightweight programming micro-paradigm and become
useful for project management, documentation, change tracking, and
project health monitoring. This is submitted as a PEP because its
ideas are thought to be Pythonic, although the concepts are not unique
to Python programming. Herein are the definition of codetags, the
philosophy behind them, a motivation for standardized conventions,
some examples, a specification, a toolset description, and possible
objections to the Codetag project/paradigm.
This PEP is also living as a wiki_ for people to add comments.
What Are Codetags?
==================
Programmers widely use ad-hoc code comment markup conventions to serve
as reminders of sections of code that need closer inspection or
review. Examples of markup include ``FIXME``, ``TODO``, ``XXX``,
``BUG``, but there are many more in wide use in existing software. Such
markup will henceforth be referred to as *codetags*. These codetags
may show up in application code, unit tests, scripts, general
documentation, or wherever suitable.
Codetags have been under discussion and in use (hundreds of codetags
in the Python 2.4 sources) in many places (e.g., c2_) for many years.
See References_ for further historic and current information.
Philosophy
==========
If you subscribe to most of these values, then codetags will likely be
useful for you.
1. As much information as possible should be contained **inside the
source code** (application code or unit tests). This along with
use of codetags impedes duplication. Most documentation can be
generated from that source code; e.g., by using help2man, man2html,
docutils, epydoc/pydoc, ctdoc, etc.
2. Information should be almost **never duplicated** -- it should be
recorded in a single original format and all other locations should
be automatically generated from the original, or simply be
referenced. This is famously known as the Single Point Of
Truth (SPOT) or Don't Repeat Yourself (DRY) rule.
3. Documentation that gets into customers' hands should be
**auto-generated** from single sources into all other output
formats. People want documentation in many forms. It is thus
important to have a documentation system that can generate all of
these.
4. The **developers are the documentation team**. They write the code
and should know the code the best. There should not be a
dedicated, disjoint documentation team for any non-huge project.
5. **Plain text** (with non-invasive markup) is the best format for
writing anything. All other formats are to be generated from the
plain text.
Codetag design was influenced by the following goals:
A. Comments should be short whenever possible.
B. Codetag fields should be optional and of minimal length. Default
values and custom fields can be set by individual code shops.
C. Codetags should be minimalistic. The quicker it is to jot
something down, the more likely it is to get jotted.
D. The most common use of codetags will only have zero to two fields
specified, and these should be the easiest to type and read.
Motivation
==========
* **Various productivity tools can be built around codetags.**
See Tools_.
* **Encourages consistency.**
Historically, a subset of these codetags has been used informally in
the majority of source code in existence, whether in Python or in
other languages. Tags have been used in an inconsistent manner with
different spellings, semantics, format, and placement. For example,
some programmers might include datestamps and/or user identifiers,
limit to a single line or not, spell the codetag differently than
others, etc.
* **Encourages adherence to SPOT/DRY principle.**
E.g., generating a roadmap dynamically from codetags instead of
keeping TODOs in sync with separate roadmap document.
* **Easy to remember.**
All codetags must be concise, intuitive, and semantically
non-overlapping with others. The format must also be simple.
* **Use not required/imposed.**
If you don't use codetags already, there's no obligation to start,
and no risk of affecting code (but see Objections_). A small subset
can be adopted and the Tools_ will still be useful (a few codetags
have probably already been adopted on an ad-hoc basis anyway). Also
it is very easy to identify and remove (and possibly record) a
codetag that is no longer deemed useful.
* **Gives a global view of code.**
Tools can be used to generate documentation and reports.
* **A logical location for capturing CRCs/Stories/Requirements.**
The XP community often does not electronically capture Stories, but
codetags seem like a good place to locate them.
* **Extremely lightweight process.**
Creating tickets in a tracking system for every thought degrades
development velocity. Even if a ticketing system is employed,
codetags are useful for simply containing links to those tickets.
Examples
========
This shows a simple codetag as commonly found in sources everywhere
(with the addition of a trailing ``<>``)::
# FIXME: Seems like this loop should be finite. <>
while True: ...
The following contrived example demonstrates a typical use of
codetags. It uses some of the available fields to specify the
assignees (a pair of programmers with initials *MDE* and *CLE*), the
Date of expected completion (*Week 14*), and the Priority of the item
(*2*)::
# FIXME: Seems like this loop should be finite.
Micah Elliott wrote:
``FIXME (XXX, DEBUG, BROKEN, REFACTOR, REFACT, RFCTR, OOPS, SMELL, NEEDSWORK, INSPECT)`` *Fix me*: Areas of problematic or ugly code needing refactoring or cleanup.
I think the standard should not have codetags that are synonyms. This is Python and there should be only one way to do it. One problem with synonyms is that they make it harder to search using tools like grep.

  Neil
On 9/26/05, Neil Schemenauer wrote:
Micah Elliott wrote:
``FIXME (XXX, DEBUG, BROKEN, REFACTOR, REFACT, RFCTR, OOPS, SMELL, NEEDSWORK, INSPECT)`` *Fix me*: Areas of problematic or ugly code needing refactoring or cleanup.
I think the standard should not have codetags that are synonyms. This is Python and there should be only one way to do it. One problem with synonyms is that they make it harder to search using tools like grep.
It has always been my choice to *only* use XXX. I hope there aren't any developers contributing to the Python core who use any others? I honestly don't see much of a point for distinguishing different types; these are for humans to read and review, and different tags just make it harder to grep.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum wrote:
It has always been my choice to *only* use XXX. I hope there aren't any developers contributing to the Python core who use any others?
[Python-2.4.1]$ grep FIXME */*.c */*.py | wc -l
12
[Python-2.4.1]$ grep TODO */*.c */*.py | wc -l
17
[Python-2.4.1]$ grep XXX */*.c */*.py | wc -l
525
I honestly don't see much of a point for distinguishing different types; these are for humans to read and review, and different tags just make it harder to grep.
I tend to use FIXME for smelly code, and a collection of TODO:s at the top of the file for things that someone should work on some day... (which explains some, but not all, of the non-XXX:es above...)

</F>
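The grep tallies in this message can also be sketched in Python, which may help where a shell is not available. This is only an illustration and not part of the PEP: the function name `count_tags` and its default arguments are assumptions. It counts matching lines, as `grep ... | wc -l` does, but unlike the `*/*.c` globs above it walks the whole tree recursively.

```python
from collections import Counter
from pathlib import Path


def count_tags(root, patterns=("*.c", "*.py"), tags=("FIXME", "TODO", "XXX")):
    """Count lines mentioning each tag, roughly like ``grep TAG files | wc -l``."""
    counts = Counter({tag: 0 for tag in tags})
    for pattern in patterns:
        # rglob descends into every subdirectory (an assumption of this sketch).
        for path in Path(root).rglob(pattern):
            if not path.is_file():
                continue
            text = path.read_text(errors="replace")
            for line in text.splitlines():
                for tag in tags:
                    if tag in line:  # plain substring match, as grep does
                        counts[tag] += 1
    return counts
```

Calling `count_tags("Python-2.4.1")` on an unpacked source tree would give one count per tag; the numbers will differ from the ones quoted above because of the recursive walk.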
On Monday 26 September 2005 05:35 pm, Micah Elliott wrote:
Please read/comment/vote. This circulated as a pre-PEP proposal submitted to c.l.py on August 10, but has changed quite a bit since then. I'm reposting this since it is now "Open (under consideration)" at http://www.python.org/peps/pep-0350.html.
Overall, it looks good, but my objection would be:
:Objection: I aesthetically dislike for the comment to be terminated with <> in the empty field case.
:Defense: It is necessary to have a terminator since codetags may be followed by non-codetag comments. Or codetags could be limited to a single line, but that's prohibitive. I can't think of any single-character terminator that is appropriate and significantly better than <>. Maybe ``@`` could be a terminator, but then most codetags will have an unnecessary @.
The <> terminator is evil. People will hate that. If there are no fields, you should just be able to leave it off. This will have an additional advantage in that many will already have compliant codetags if you leave off this requirement.

You worry over the need to detect the end of the block, but wouldn't '\n\n' be a much more natural delimiter? I.e.:

# TODO: This is a multi-line todo tag.
# You see how I've gone to the next line.

# This, on the other hand is an unrelated comment. You can tell it's not
# related, because there is an intervening blank line. I think people
# do this naturally when writing comments (I know I do -- I'm fairly
# certain I've seen other people do it).
#
# Whereas, as you can see, a mere paragraph break can be represented by
# a blank comment line.
#
# Whitespace formatting, after all, is VERY PYTHONIC. ;-)
# Delimiters on the other hand -- well, we prefer not to mention
# the sort of languages that use those, right? ;-)

Another possibility is to recognize lines like:

#---------------------------------------
#***************************************
#=======================================

I.e. a comment mark followed by a line composed of repeating characters as an alternative separator. These are also in pretty common use.

Cheers,
Terry

--
Terry Hancock ( hancock at anansispaceworks.com )
Anansi Spaceworks http://www.anansispaceworks.com
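The blank-line rule proposed in this message is simple enough to sketch. The following is a minimal illustration, not anything specified by PEP 350: the names `TAG_START`, `COMMENT`, and `extract_codetags` are assumptions, and a codetag is assumed to start with `# TAG:` in capitals. A tag ends at the first line that is not a comment; a bare `#` line is kept as a paragraph break inside the tag.

```python
import re

# Assumed shape of a tag's first line: "# TAG: text" with a capitalised tag.
TAG_START = re.compile(r"^\s*#\s*(?P<tag>[A-Z]+):\s*(?P<text>.*)$")
# Any whole-line comment, including a bare "#".
COMMENT = re.compile(r"^\s*#\s?(?P<text>.*)$")


def extract_codetags(source):
    """Return (tag, text) pairs; each tag runs until the first non-comment line."""
    found = []
    tag, lines = None, []

    def flush():
        nonlocal tag, lines
        if tag is not None:
            found.append((tag, "\n".join(lines).strip()))
        tag, lines = None, []

    for line in source.splitlines():
        start = TAG_START.match(line)
        comment = COMMENT.match(line)
        if start:
            flush()  # a new tag closes any tag still open
            tag, lines = start.group("tag"), [start.group("text")]
        elif tag is not None and comment:
            lines.append(comment.group("text"))  # continuation line
        else:
            flush()  # blank line or code ends the tag
    flush()
    return found
```

With this rule the unrelated comment after the blank line is never attached to the TODO, so no explicit terminator is needed in the empty-field case.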
Terry Hancock wrote:
On Monday 26 September 2005 05:35 pm, Micah Elliott wrote:
Please read/comment/vote. This circulated as a pre-PEP proposal submitted to c.l.py on August 10, but has changed quite a bit since then. I'm reposting this since it is now "Open (under consideration)" at http://www.python.org/peps/pep-0350.html.
Overall, it looks good, but my objection would be:
:Objection: I aesthetically dislike for the comment to be terminated with <> in the empty field case.
:Defense: It is necessary to have a terminator since codetags may be followed by non-codetag comments. Or codetags could be limited to a single line, but that's prohibitive. I can't think of any single-character terminator that is appropriate and significantly better than <>. Maybe ``@`` could be a terminator, but then most codetags will have an unnecessary @.
The <> terminator is evil. People will hate that. If there are no fields, you should just be able to leave it off. This will have an additional advantage in that many will already have compliant codetags if you leave off this requirement.
You worry over the need to detect the end of the block, but wouldn't '\n\n' be a much more natural delimiter? I.e.:
# TODO: This is a multi-line todo tag.
# You see how I've gone to the next line.

# This, on the other hand is an unrelated comment. You can tell it's not
# related, because there is an intervening blank line. I think people
# do this naturally when writing comments (I know I do -- I'm fairly
# certain I've seen other people do it).
#
# Whereas, as you can see, a mere paragraph break can be represented by
# a blank comment line.
#
# Whitespace formatting, after all, is VERY PYTHONIC. ;-)
# Delimiters on the other hand -- well, we prefer not to mention
# the sort of languages that use those, right? ;-)
+1
Another possibility is to recognize lines like:
#---------------------------------------
#***************************************
#=======================================
I.e. a comment mark followed by a line composed of repeating characters as an alternative separator. These are also in pretty common use.
-0

regards
 Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC www.holdenweb.com
PyCon TX 2006 www.pycon.org
At 03:35 PM 9/26/2005 -0700, Micah Elliott wrote:
Please read/comment/vote. This circulated as a pre-PEP proposal submitted to c.l.py on August 10, but has changed quite a bit since then. I'm reposting this since it is now "Open (under consideration)" at http://www.python.org/peps/pep-0350.html.
This seems a little weird to me. On the one hand, seems like a cool idea if you aren't using Eclipse or another IDE that tracks this stuff, but still need some kind of tracking system. But, if that is the case, the notation seems a little bit overkill, especially with respect to tracking who's responsible - i.e., just you.

If you have a team that can agree to use the tools, I suppose it might be useful, but then I wonder, why not use something like Trac?

Finally, I think there should be something besides just a comment to distinguish these things; like starting with a symbol (e.g. # !FIXME), so that documentation extraction tools can distinguish code tags from other documentation that just happens to start with a CAPITALIZED word.

Overall, I'm kind of -0.5. It seems like a spec in search of an application. The Motivation is sorely lacking - it reads like, "hey, it's optional and you can do stuff", where the stuff you can do is deferred to a later section, and is mostly stuff that could easily be done in other ways. For example, FIT-style acceptance test documents, or Python doctest files go a long way towards documenting stories in association with tests, and they don't require you to cram things into comments. (And because they're executable tests, they get kept up-to-date.) Tracking bugfixes with code history is handled nicely by tools like Trac. There are lots of Python documentation tools already. And so on.

Really, it reads to me like you came up with the features to sell the format, instead of designing the format to implement specific features. My suggestion: implement some tools, use them for a while, and come back with more focused use cases to show why only this format can work, and why the Python core developers should therefore use it.

I'm not saying that you can't have an informational PEP unless it should be used in the stdlib, mind you. Just pointing out that if you can't convince the core developers it's useful, I'm thinking you'll have a hard time convincing the community at large to actually use it. You need to actually have a better mousetrap to present before you ask people to move their cheese. :)
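The `# !FIXME` suggestion in this message can be illustrated with a short check. This is a sketch only: the exact syntax of the sigil was not specified in the thread, so the pattern `SIGIL_TAG` and the helper `is_codetag` are assumptions (an optional colon after the tag is accepted here).

```python
import re

# Hypothetical convention: a "!" directly before the tag marks a real codetag.
SIGIL_TAG = re.compile(r"^\s*#\s*!(?P<tag>[A-Z]+):?\s*(?P<text>.*)$")


def is_codetag(line):
    """True only for comment lines that opt in with the "!" sigil."""
    return SIGIL_TAG.match(line) is not None
```

The point of the sigil is that an ordinary comment such as `# NOTE that this is prose` is never mistaken for a tag, at the cost of existing `# FIXME:` comments no longer being recognised.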
"Phillip J. Eby"
At 03:35 PM 9/26/2005 -0700, Micah Elliott wrote:
Please read/comment/vote. This circulated as a pre-PEP proposal submitted to c.l.py on August 10, but has changed quite a bit since then. I'm reposting this since it is now "Open (under consideration)" at http://www.python.org/peps/pep-0350.html.
This seems a little weird to me. On the one hand, seems like a cool idea if you aren't using Eclipse or another IDE that tracks this stuff, but still need some kind of tracking system. But, if that is the case, the notation seems a little bit overkill, especially with respect to tracking who's responsible - i.e., just you.
There are various Python editors which have had support for a similar style of tags for quite a while. Some allow #<anything>: <comment>, others allow # <alphanumeric and spaces> : <comment>, even others allow more or less. Some even count exclamation points as an indicator of severity.

Personally, though I use tags in some of the code I write, and though the editor I use (and wrote) supports tags, I'm of the opinion that an unofficial spec is sufficient. See koders.com and search for 'fixme' to see some common variants.

 - Josiah
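One of the looser editor conventions mentioned in this message might look roughly like the sketch below. It does not reproduce any particular editor: `LOOSE_TAG` and `parse_loose_tag` are hypothetical names, and counting exclamation points in the comment text as severity is an assumption about how such editors behave.

```python
import re

# "# <alphanumeric and spaces> : <comment>", in any letter case.
LOOSE_TAG = re.compile(r"^\s*#\s*(?P<tag>[A-Za-z0-9 ]+?)\s*:\s*(?P<text>.*)$")


def parse_loose_tag(line):
    """Return (tag, text, severity) or None if the line has no tag."""
    match = LOOSE_TAG.match(line)
    if match is None:
        return None
    text = match.group("text")
    # Assumption: severity is the number of "!" characters in the text.
    return match.group("tag"), text, text.count("!")
```

A pattern this permissive also matches ordinary comments that happen to contain a colon, which is part of why the variants found in the wild are hard to process uniformly.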
Thanks to all who have read and/or provided feedback. There have been
some great ideas and admonitions that hadn't crossed my mind. I'll
paraphrase some of the responses for the sake of brevity; I don't mean
to misquote anyone.
Tom> ISO 8601 includes a week notation.
That's great. Thanks for pointing it out; seems like it should be the
preferred week format. I might also want to allow a simple W42 (say)
since it's so much shorter, and I'll consider myself generally in
trouble if I wrap around on the year for due items.
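For reference, Python's `datetime` module already understands ISO 8601 week numbering, so both the full form and the short `W42` form could be handled along these lines. This is a sketch under assumptions: the helper names `iso_week` and `week_start` are made up here, and a bare `W42` is assumed to mean a week in a caller-supplied default year.

```python
import datetime
import re


def iso_week(day):
    """Format a date as an ISO 8601 week, e.g. 2005-W39."""
    year, week, _weekday = day.isocalendar()
    return f"{year}-W{week:02d}"


def week_start(spec, default_year):
    """Return the Monday of 'W42' or '2005-W42'; short forms use default_year."""
    match = re.fullmatch(r"(?:(\d{4})-?)?W(\d{1,2})", spec)
    if match is None:
        raise ValueError(f"not a week spec: {spec!r}")
    year = int(match.group(1)) if match.group(1) else default_year
    # fromisocalendar needs Python 3.8 or later; day 1 is Monday.
    return datetime.date.fromisocalendar(year, int(match.group(2)), 1)
```

The short form is ambiguous across a year boundary, which is the wrap-around risk acknowledged above.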
Terry> Shorter forms such as DO, FIX, DEF, DOC, etc. are better.
I like the short proposals, so I'll add, and possibly even canonize
them. My proposed canons were based on popular use and also my taste.
I had originally wanted to state that only the canonical forms would
be supported and use of the synonyms should be deprecated. That would
have simplified things a bit (human and tool parsing). But based on
people's ideas about what is canonical, that would never have flown.
Instead, it seems useful to just list everything that's ever been used
as long as it *fits one of the categories*. And the canon is mostly
arbitrary/unnecessary; we'd never settle that peacefully anyway. The
main point is that official categorization enables construction of
productivity tools.
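The categorization idea can be pictured as a lookup table from every accepted spelling to its category. The sketch below uses only the FIXME category and the synonyms quoted earlier in this thread; the names `FIXME_SYNONYMS`, `CANONICAL`, and `canonical_tag` are hypothetical, and the other categories are left out rather than guessed.

```python
# Synonyms for the FIXME category, as quoted from the PEP in this thread.
FIXME_SYNONYMS = (
    "XXX", "DEBUG", "BROKEN", "REFACTOR", "REFACT",
    "RFCTR", "OOPS", "SMELL", "NEEDSWORK", "INSPECT",
)

# Map every spelling, including the canonical one, to its category.
CANONICAL = {name: "FIXME" for name in ("FIXME",) + FIXME_SYNONYMS}


def canonical_tag(name):
    """Return the category for a tag spelling, or None if it is not listed."""
    return CANONICAL.get(name.upper())
```

A tool built this way can accept whatever spelling a programmer prefers while still reporting by category, which also answers the grep concern: one search over the table's keys replaces many separate searches.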
Terry> IDEXXX isn't vim/emacs. Offer an x:"comment" field for a
Terry> completed item.
Bengt> Later a tool can strip this out to the devlog.txt or DONE
Bengt> file, when the tool sees an added progress line like
Bengt> # ---: woohoo, completed ;-)
At 09:10 AM 9/28/2005 -0700, Micah Elliott wrote:
I agree that proof of value is necessary. Without a spec though it will be hard to get people to know about a convention/toolset, so it's a bit of a chicken-egg problem -- I can't have a pep until the tools are in use, but the tools won't be used until programmers have means/motivation to use them, a pep.
My point about the lack of motivation was that there was little reason shown why this should be a PEP instead of either:

1. Documentation for a specific tool, or group of tools
2. A specific project's process documentation

Are you proposing that this format be used by the Python developers for Python itself? A process spec like this seems orthogonal to Python-the-language.

To put it another way, this seems like writing a PEP on how to do eXtreme Programming, or perhaps a PEP on how the blogging "trackback" protocol works. Certainly you might implement those things using Python, but the spec itself seems entirely orthogonal to Python. I don't really see why it's a PEP, as opposed to just a published spec on your own website, unless you intend for say, the Python stdlib to conform to it.
On 9/29/05, Phillip J. Eby wrote:
My point about the lack of motivation was that there was little reason shown why this should be a PEP instead of either:
1. Documentation for a specific tool, or group of tools
2. A specific project's process documentation
That's what I feel as well. I hadn't commented on the PEP as I had simply intended to ignore it totally in my own projects...

Paul.
Micah Elliott wrote:
Josiah> an unofficial spec is sufficient. See koders.com and search
Josiah> for 'fixme' to see some common variants.
But that's the problem -- there are already a bunch of "unofficial" specs, which don't serve much purpose as such. It's a cool site. I spent some time browsing and I do see a lot of codetags in use (many thousands in Python code alone; I'm not sure if the number represented strings or files). But I'll argue that this just reinforces the need for an *official* spec/guideline, so that the tools can do something with them.
Defining a spec for code tags doesn't mean that people will start using them. Why? Because it is a documentation spec. From my experience, documentation specs are only adhered to by the organizations (companies, groups, etc.) which the code is produced by and for, and they generally define the code specs for their organization.

Further, even if it becomes a spec, it doesn't guarantee implementation in Python editors (which you are shooting for). Take a wander through current implementations of code tags in various editors to get a feel for what they support. I've read various posts about what code tags could support, but not what editors which implement code tags and/or its variants actually currently support; which is a better indication of what people want than an informal survey via email of python-dev, python-list, and/or the PEP submission process.

 - Josiah
Josiah Carlson wrote:
Further, even if it becomes a spec, it doesn't guarantee implementation in Python editors (which you are shooting for). Take a wander through current implementations of code tags in various editors to get a feel for what they support. I've read various posts about what code tags could support, but not what editors which implement code tags and/or its variants actually currently support; which is a better indication of what people want than an informal survey via email of python-dev, python-list, and/or the PEP submission process.
An approach to this area that would make sense to me is:

1. Defer PEP 350
2. Publish a simple Python module for finding and processing code tags in a configurable fashion
3. Include a default configuration in the module that provides the behaviour described in PEP 350
4. After this hypothetical code tag processing module has been out in the wild for a while, re-open PEP 350 with an eye to including the module in the standard library

The idea is that it should be possible to tailor the processing module in order to textually scan a codebase (possibly C or C++ rather than Python) in accordance with a project-specific system of code tagging, rather than requiring that the project necessarily use the default style included in the processing module (although using a system other than the default one may result in reduced functionality, naturally).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://boredomandlaziness.blogspot.com
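To make the idea of a configurable processing module a little more concrete, here is a minimal sketch of what its core might look like. No such module exists in the thread; `TagStyle`, `scan`, and the default tag set are all assumptions, and the default shown here is far simpler than the field syntax PEP 350 describes.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class TagStyle:
    """A project-specific tagging convention (hypothetical configuration object)."""
    tags: tuple = ("FIXME", "TODO", "XXX")
    comment_prefix: str = "#"

    def pattern(self):
        names = "|".join(re.escape(tag) for tag in self.tags)
        return re.compile(
            re.escape(self.comment_prefix)
            + r"\s*(?P<tag>" + names + r")\b:?\s*(?P<text>.*)"
        )


def scan(text, style=TagStyle()):
    """Return (line number, tag, text) for every tagged comment in text."""
    pattern = style.pattern()
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = pattern.search(line)
        if match:
            hits.append((lineno, match.group("tag"), match.group("text").strip()))
    return hits
```

Scanning C or C++ sources would then be a matter of passing a different style, for example `TagStyle(tags=("HACK",), comment_prefix="//")`, rather than changing the scanner.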
On 9/30/05, Nick Coghlan wrote:
An approach to this area that would make sense to me is:
1. Defer PEP 350
2. Publish a simple Python module for finding and processing code tags in a configurable fashion
3. Include a default configuration in the module that provides the behaviour described in PEP 350
4. After this hypothetical code tag processing module has been out in the wild for a while, re-open PEP 350 with an eye to including the module in the standard library
The idea is that it should be possible to tailor the processing module in order to textually scan a codebase (possibly C or C++ rather than Python) in accordance with a project-specific system of code tagging, rather than requiring that the project necessarily use the default style included in the processing module (Although using a system other than the default one may result in reduced functionality, naturally).
Maybe I'm just an old fart, but this all seems way over-engineered. Even for projects the size of Python, a simple grep+find is sufficient.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum wrote:
On 9/30/05, Nick Coghlan wrote:
An approach to this area that would make sense to me is:
1. Defer PEP 350
2. Publish a simple Python module for finding and processing code tags in a configurable fashion
3. Include a default configuration in the module that provides the behaviour described in PEP 350
4. After this hypothetical code tag processing module has been out in the wild for a while, re-open PEP 350 with an eye to including the module in the standard library
The idea is that it should be possible to tailor the processing module in order to textually scan a codebase (possibly C or C++ rather than Python) in accordance with a project-specific system of code tagging, rather than requiring that the project necessarily use the default style included in the processing module (Although using a system other than the default one may result in reduced functionality, naturally).
Maybe I'm just an old fart, but this all seems way over-engineered.
Even for projects the size of Python, a simple grep+find is sufficient.
I expect many people would agree with you, but Micah was interested enough in the area to write a PEP about it. The above was just a suggestion for a different way of looking at the problem, so that writing a PEP would actually make sense. At the moment, if the tags used are project-specific, and the method used to find them is a simple grep+find, then I don't see a reason for the idea to be a *Python* Enhancement Proposal.

Further, I see some interesting possibilities for automation if such a library exists. For example, a cron job that scans the checked in sources, and automatically converts new TODO's to RFE's in the project tracker, and adds a tracker cross-link into the source code comment. The job could similarly create bug reports for FIXME's. If the project tracker was one that supported URL links, and the project had a URL view of the source tree, then the cross-links between the code tag and the tracker could be actual URL references to each other.

However, the starting point for exploring any such ideas would be a library that made it easier to work with code tags.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://boredomandlaziness.blogspot.com
On 9/30/05, Nick Coghlan wrote:
Further, I see some interesting possibilities for automation if such a library exists. For example, a cron job that scans the checked in sources, and automatically converts new TODO's to RFE's in the project tracker, and adds a tracker cross-link into the source code comment. The job could similarly create bug reports for FIXME's. If the project tracker was one that supported URL links, and the project had a URL view of the source tree, then the cross-links between the code tag and the tracker could be actual URL references to each other.
With all respect for the OP, that's exactly the kind of enthusiastic over-engineering that I'm afraid the PEP will encourage. I seriously doubt that any of that work will contribute towards a project's success (compared to simply having a convention of putting XXX in the code).

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
participants (11)
- Fredrik Lundh
- Guido van Rossum
- Josiah Carlson
- Micah Elliott
- Micah Elliott
- Neil Schemenauer
- Nick Coghlan
- Paul Moore
- Phillip J. Eby
- Steve Holden
- Terry Hancock