
Please read/comment/vote.  This circulated as a pre-PEP proposal
submitted to c.l.py on August 10, but has changed quite a bit since
then.  I'm reposting this since it is now "Open (under consideration)"
at <http://www.python.org/peps/pep-0350.html>.

Thanks!

--
Micah Elliott <mde at tracos.org>

PEP: 350
Title: Codetags
Version: $Revision: 1.2 $
Last-Modified: $Date: 2005/09/26 19:56:53 $
Author: Micah Elliott <mde at tracos.org>
Status: Draft
Type: Informational
Content-Type: text/x-rst
Created: 27-Jun-2005
Post-History: 10-Aug-2005, 26-Sep-2005


Abstract
========

This informational PEP aims to provide guidelines for consistent use
of *codetags*, which would enable the construction of standard
utilities to take advantage of the codetag information, as well as
making Python code more uniform across projects.  Codetags also
represent a very lightweight programming micro-paradigm and become
useful for project management, documentation, change tracking, and
project health monitoring.  This is submitted as a PEP because its
ideas are thought to be Pythonic, although the concepts are not unique
to Python programming.  Herein are the definition of codetags, the
philosophy behind them, a motivation for standardized conventions,
some examples, a specification, a toolset description, and possible
objections to the Codetag project/paradigm.

This PEP is also living as a wiki_ for people to add comments.


What Are Codetags?
==================

Programmers widely use ad-hoc code comment markup conventions to serve
as reminders of sections of code that need closer inspection or
review.  Examples of markup include ``FIXME``, ``TODO``, ``XXX``,
``BUG``, but there are many more in wide use in existing software.
Such markup will henceforth be referred to as *codetags*.  These
codetags may show up in application code, unit tests, scripts, general
documentation, or wherever suitable.

Codetags have been under discussion and in use (hundreds of codetags
in the Python 2.4 sources) in many places (e.g., c2_) for many years.
See References_ for further historic and current information.


Philosophy
==========

If you subscribe to most of these values, then codetags will likely be
useful for you.

1. As much information as possible should be contained **inside the
   source code** (application code or unit tests).  This along with
   use of codetags impedes duplication.  Most documentation can be
   generated from that source code; e.g., by using help2man, man2html,
   docutils, epydoc/pydoc, ctdoc, etc.

2. Information should be almost **never duplicated** -- it should be
   recorded in a single original format and all other locations should
   be automatically generated from the original, or simply be
   referenced.  This is famously known as the Single Point Of Truth
   (SPOT) or Don't Repeat Yourself (DRY) rule.

3. Documentation that gets into customers' hands should be
   **auto-generated** from single sources into all other output
   formats.  People want documentation in many forms.  It is thus
   important to have a documentation system that can generate all of
   these.

4. The **developers are the documentation team**.  They write the code
   and should know the code the best.  There should not be a
   dedicated, disjoint documentation team for any non-huge project.

5. **Plain text** (with non-invasive markup) is the best format for
   writing anything.  All other formats are to be generated from the
   plain text.

Codetag design was influenced by the following goals:

A. Comments should be short whenever possible.

B. Codetag fields should be optional and of minimal length.  Default
   values and custom fields can be set by individual code shops.

C. Codetags should be minimalistic.  The quicker it is to jot
   something down, the more likely it is to get jotted.

D. The most common use of codetags will only have zero to two fields
   specified, and these should be the easiest to type and read.


Motivation
==========

* **Various productivity tools can be built around codetags.**

  See Tools_.

* **Encourages consistency.**

  Historically, a subset of these codetags has been used informally in
  the majority of source code in existence, whether in Python or in
  other languages.  Tags have been used in an inconsistent manner with
  different spellings, semantics, format, and placement.  For example,
  some programmers might include datestamps and/or user identifiers,
  limit to a single line or not, spell the codetag differently than
  others, etc.

* **Encourages adherence to SPOT/DRY principle.**

  E.g., generating a roadmap dynamically from codetags instead of
  keeping TODOs in sync with separate roadmap document.

* **Easy to remember.**

  All codetags must be concise, intuitive, and semantically
  non-overlapping with others.  The format must also be simple.

* **Use not required/imposed.**

  If you don't use codetags already, there's no obligation to start,
  and no risk of affecting code (but see Objections_).  A small subset
  can be adopted and the Tools_ will still be useful (a few codetags
  have probably already been adopted on an ad-hoc basis anyway).  Also
  it is very easy to identify and remove (and possibly record) a
  codetag that is no longer deemed useful.

* **Gives a global view of code.**

  Tools can be used to generate documentation and reports.

* **A logical location for capturing CRCs/Stories/Requirements.**

  The XP community often does not electronically capture Stories, but
  codetags seem like a good place to locate them.

* **Extremely lightweight process.**

  Creating tickets in a tracking system for every thought degrades
  development velocity.  Even if a ticketing system is employed,
  codetags are useful for simply containing links to those tickets.


Examples
========

This shows a simple codetag as commonly found in sources everywhere
(with the addition of a trailing ``<>``)::

    # FIXME: Seems like this loop should be finite. <>
    while True: ...

The following contrived example demonstrates a typical use of
codetags.  It uses some of the available fields to specify the
assignees (a pair of programmers with initials *MDE* and *CLE*), the
Date of expected completion (*Week 14*), and the Priority of the item
(*2*)::

    # FIXME: Seems like this loop should be finite. <MDE,CLE d:14w p:2>
    while True: ...

This codetag shows a bug with fields describing author, discovery
(origination) date, due date, and priority::

    # BUG: Crashes if run on Sundays.
    # <MDE 2005-09-04 d:14w p:2>
    if day == 'Sunday': ...

Here is a demonstration of how not to use codetags.  This has many
problems: 1) Codetags cannot share a line with code; 2) Missing colon
after mnemonic; 3) A codetag referring to codetags is usually useless,
and worse, it is not completable; 4) No need to have a bunch of fields
for a trivial codetag; 5) Fields with unknown values (``t:XXX``)
should not be used::

    i = i + 1   # TODO Add some more codetags.
    # <JRNewbie 2005-04-03 d:2005-09-03 t:XXX d:14w p:0 s:inprogress>


Specification
=============

This describes the format: syntax, mnemonic names, fields, and
semantics, and also the separate DONE File.

General Syntax
--------------

Each codetag should be inside a comment, and can be any number of
lines.  It should not share a line with code.  It should match the
indentation of surrounding code.  The end of the codetag is marked by
a pair of angle brackets ``<>`` containing optional fields, which must
not be split onto multiple lines.  It is preferred to have a codetag
in ``#`` comments instead of string comments.  There can be multiple
fields per codetag, all of which are optional.

.. NOTE: It may be reasonable to allow fields to fit on multiple
   lines, but it complicates parsing and defeats minimalism if you
   use this many fields.

In short, a codetag consists of a mnemonic, a colon, commentary text,
an opening angle bracket, an optional list of fields, and a closing
angle bracket.  E.g., ::

    # MNEMONIC: Some (maybe multi-line) commentary. <field field ...>

Mnemonics
---------

The codetags of interest are listed below, using the following format:

| ``recommended mnemonic (& synonym list)``
| *canonical name*: semantics

``TODO (MILESTONE, MLSTN, DONE, YAGNI, TBD, TOBEDONE)``
   *To do*: Informal tasks/features that are pending completion.

``FIXME (XXX, DEBUG, BROKEN, REFACTOR, REFACT, RFCTR, OOPS, SMELL, NEEDSWORK, INSPECT)``
   *Fix me*: Areas of problematic or ugly code needing refactoring or
   cleanup.

``BUG (BUGFIX)``
   *Bugs*: Reported defects tracked in bug database.

``NOBUG (NOFIX, WONTFIX, DONTFIX, NEVERFIX, UNFIXABLE, CANTFIX)``
   *Will Not Be Fixed*: Problems that are well-known but will never be
   addressed due to design problems or domain limitations.

``REQ (REQUIREMENT, STORY)``
   *Requirements*: Satisfactions of specific, formal requirements.

``RFE (FEETCH, NYI, FR, FTRQ, FTR)``
   *Requests For Enhancement*: Roadmap items not yet implemented.

``IDEA``
   *Ideas*: Possible RFE candidates, but less formal than RFE.

``??? (QUESTION, QUEST, QSTN, WTF)``
   *Questions*: Misunderstood details.

``!!! (ALERT)``
   *Alerts*: In need of immediate attention.

``HACK (CLEVER, MAGIC)``
   *Hacks*: Temporary code to force inflexible functionality, or
   simply a test change, or work around a known problem.

``PORT (PORTABILITY, WKRD)``
   *Portability*: Workarounds specific to OS, Python version, etc.

``CAVEAT (CAV, CAVT, WARNING, CAUTION)``
   *Caveats*: Implementation details/gotchas that stand out as
   non-intuitive.

``NOTE (HELP)``
   *Notes*: Sections where a code reviewer found something that needs
   discussion or further investigation.

``FAQ``
   *Frequently Asked Questions*: Interesting areas that require
   external explanation.

``GLOSS (GLOSSARY)``
   *Glossary*: Definitions for project glossary.

``SEE (REF, REFERENCE)``
   *See*: Pointers to other code, web link, etc.

``TODOC (DOCDO, DODOC, NEEDSDOC, EXPLAIN, DOCUMENT)``
   *Needs Documentation*: Areas of code that still need to be
   documented.

``CRED (CREDIT, THANKS)``
   *Credits*: Accreditations for external provision of enlightenment.

``STAT (STATUS)``
   *Status*: File-level statistical indicator of maturity of this
   file.

``RVD (REVIEWED, REVIEW)``
   *Reviewed*: File-level indicator that review was conducted.

File-level codetags might be better suited as properties in the
revision control system, but might still be appropriately specified in
a codetag.

Some of these are temporary (e.g., ``FIXME``) while others are
persistent (e.g., ``REQ``).  A mnemonic was chosen over a synonym
using three criteria: descriptiveness, length (shorter is better),
commonly used.

Choosing between ``FIXME`` and ``XXX`` is difficult.  ``XXX`` seems to
be more common, but much less descriptive.  Furthermore, ``XXX`` is a
useful placeholder in a piece of code having a value that is unknown.
Thus ``FIXME`` is the preferred spelling.  `Sun says`__ that ``XXX``
and ``FIXME`` are slightly different, giving ``XXX`` higher severity.
However, with decades of chaos on this topic, and too many millions of
developers who won't be influenced by Sun, it is easy to rightly call
them synonyms.

__ http://java.sun.com/docs/codeconv/html/CodeConventions.doc9.html#395

``DONE`` is always a completed ``TODO`` item, but this should probably
be indicated through the revision control system and/or a completion
recording mechanism (see `DONE File`_).

It may be a useful metric to count ``NOTE`` tags: a high count may
indicate a design (or other) problem.  But of course the majority of
codetags indicate areas of code needing some attention.

An ``FAQ`` is probably more appropriately documented in a wiki where
users can more easily view and contribute.

Fields
------

All fields are optional.  The proposed standard fields are described
in this section.  Note that upper case field characters are intended
to be replaced.

The *Originator/Assignee* and *Origination Date/Week* fields are the
most common and don't usually require a prefix.

.. NOTE: the colon after the prefix is a new addition that became
   necessary when it was pointed out that a "codename" field (with no
   digits) such as "cTiger" would be indistinguishable from a
   username.  <MDE 2005-9-24>

.. NOTE: This section started out with just assignee and due week.  It
   has grown into a lot of fields by request.  It is still probably
   best to use a tracking system for any items that deserve it, and
   not duplicate everything in a codetag (field).  <MDE>

This lengthy list of fields is liable to scare people (the intended
minimalists) away from adopting codetags, but keep in mind that these
only exist to support programmers who either 1) like to keep ``BUG``
or ``RFE`` codetags in a complete form, or 2) are using codetags as
their complete and only tracking system.  In other words, many of
these fields will be used very rarely.  They are gathered largely from
industry-wide conventions, and example sources include `GCC
Bugzilla`__ and `Python's SourceForge`__ tracking systems.

.. ???: Maybe codetags inside packages (__init__.py files) could have
   special global significance.  <MDE>

__ http://gcc.gnu.org/bugzilla/
__ http://sourceforge.net/tracker/?group_id=5470

``AAA[,BBB]...``
   List of *Originator* or *Assignee* initials (the context determines
   which unless both should exist).  It is also okay to use usernames
   such as ``MicahE`` instead of initials.  Initials (in upper case)
   are the preferred form.

``a:AAA[,BBB]...``
   List of *Assignee* initials.  This is necessary only in (rare)
   cases where a codetag has both an assignee and an originator, and
   they are different.  Otherwise the ``a:`` prefix is omitted, and
   context determines the intent.  E.g., ``FIXME`` usually has an
   *Assignee*, and ``NOTE`` usually has an *Originator*, but if a
   ``FIXME`` was originated (and initialed) by a reviewer, then the
   assignee's initials would need an ``a:`` prefix.

``YYYY[-MM[-DD]]`` or ``WW[.D]w``
   The *Origination Date* indicating when the comment was added, in
   `ISO 8601`_ format (digits and hyphens only).  Or *Origination
   Week*, an alternative form for specifying an *Origination Date*.
   A day of the week can be optionally specified.  The ``w`` suffix is
   necessary for distinguishing from a date.

``d:YYYY[-MM[-DD]]`` or ``d:WW[.D]w``
   *Due Date (d)* target completion (estimate).  Or *Due Week (d)*,
   an alternative to specifying a *Due Date*.

``p:N``
   *Priority (p)* level.  Range (N) is from 0..3 with 3 being the
   highest.  0..3 are analogous to low, medium, high, and
   showstopper/critical.  The *Severity* field could be factored into
   this single number, and doing so is recommended since having both
   is subject to varying interpretation.  The range and order should
   be customizable.  The existence of this field is important for any
   tool that itemizes codetags.  Thus a (customizable) default value
   should be supported.

``t:NNNN``
   *Tracker (t)* number corresponding to associated Ticket ID in
   separate tracking system.

The following fields are also available but expected to be less
common.

``c:AAAA``
   *Category (c)* indicating some specific area affected by this item.

``s:AAAA``
   *Status (s)* indicating state of item.  Examples are "unexplored",
   "understood", "inprogress", "fixed", "done", "closed".  Note that
   when an item is completed it is probably better to remove the
   codetag and record it in a `DONE File`_.

``i:N``
   Development cycle *Iteration (i)*.  Useful for grouping codetags
   into completion target groups.

``r:N``
   Development cycle *Release (r)*.  Useful for grouping codetags into
   completion target groups.

.. NOTE: SourceForge does not recognize a severity and I think that
   *Priority* (along with separate RFE codetags) should encompass and
   obviate *Severity*.  <MDE>

.. NOTE: The tools will need an ability to sort codetags in order of
   targeted completion.  I feel that *Priority* should be a unique,
   lone indicator of that addressability order.  Other categories such
   as *Severity*, *Customer Importance*, etc. are related to business
   logic and should not be recognized by the codetag tools.  If some
   groups want to have such logic, then it is best factored
   (externally) into a single value (priority) that can determine an
   ordering of actionable items.  <MDE>

To summarize, the non-prefixed fields are initials and origination
date, and the prefixed fields are: assignee (a), due (d), priority
(p), tracker (t), category (c), status (s), iteration (i), and release
(r).

It should be possible for groups to define or add their own fields,
and these should have upper case prefixes to distinguish them from the
standard set.  Examples of custom fields are *Operating System (O)*,
*Severity (S)*, *Affected Version (A)*, *Customer (C)*, etc.

DONE File
---------

Some codetags have an ability to be *completed* (e.g., ``FIXME``,
``TODO``, ``BUG``).  It is often important to retain completed items
by recording them with a completion date stamp.  Such completed items
are best stored in a single location, global to a project (or maybe a
package).  The proposed format is most easily described by an example,
say ``~/src/fooproj/DONE``::

    # TODO: Recurse into subdirs only on blue
    # moons. <MDE 2003-09-26>
    [2005-09-26 Oops, I underestimated this one a bit.  Should have
    used Warsaw's First Law!]

    # FIXME: ...
    ...

You can see that the codetag is copied verbatim from the original
source file.  The date stamp is then entered on the following line
with an optional post-mortem commentary.  The entry is terminated by a
blank line (``\n\n``).

It may sound burdensome to have to delete codetag lines every time one
gets completed.  But in practice it is quite easy to set up a Vim or
Emacs mapping to auto-record a codetag deletion in this format (sans
the commentary).


Tools
=====

Currently, programmers (and sometimes analysts) typically use *grep*
to generate a list of items corresponding to a single codetag.
However, various hypothetical productivity tools could take advantage
of a consistent codetag format.  Some example tools follow.

.. NOTE: Codetag tools are mostly unimplemented (but I'm getting
   started!)  <MDE>

Document Generator
   Possible docs: glossary, roadmap, manpages

Codetag History
   Track (with revision control system interface) when a ``BUG`` tag
   (or any codetag) originated/resolved in a code section

Code Statistics
   A project Health-O-Meter

Codetag Lint
   Notify of invalid use of codetags, and aid in porting to codetags

Story Manager/Browser
   An electronic means to replace XP notecards.  In MVC terms, the
   codetag is the Model, and the Story Manager could be a graphical
   Viewer/Controller to do visual rearrangement, prioritization, and
   assignment, milestone management.

Any Text Editor
   Used for changing, removing, adding, rearranging, recording
   codetags.

There are some tools already in existence that take advantage of a
smaller set of pseudo-codetags (see References_).  There is also an
example codetags implementation under way, known as the `Codetag
Project`__.

__ http://tracos.org/codetag


Objections
==========

:Objection: Extreme Programming argues that such codetags should not
    ever exist in code since the code is the documentation.

:Defense: Maybe you should put the codetags in the unit test files
    instead.  Besides, it's tough to generate documentation from
    uncommented source code.

----

:Objection: Too much existing code has not followed proposed
    guidelines.

:Defense: [Simple] utilities (*ctlint*) could convert existing codes.

----

:Objection: Causes duplication with tracking system.

:Defense: Not really, unless fields are abused.  If an item exists in
    the tracker, a simple ticket number in the codetag tracker field
    is sufficient.  Maybe a duplicated title would be acceptable.
    Furthermore, it's too burdensome to have a ticket filed for every
    item that pops into a developer's mind on-the-go.  Additionally,
    the tracking system could possibly be obviated for simple or small
    projects that can reasonably fit the relevant data into a codetag.

----

:Objection: Codetags are ugly and clutter code.

:Defense: That is a good point.  But I'd still rather have such info
    in a single place (the source code) than various other documents,
    likely getting duplicated or forgotten about.  The completed
    codetags can be sent off to the `DONE File`_, or to the bit
    bucket.

----

:Objection: Codetags (and all comments) get out of date.

:Defense: Not so much if other sources (externally visible
    documentation) depend on their being accurate.

----

:Objection: Codetags tend to only rarely have estimated completion
    dates of any sort.  OK, the fields are optional, but you want to
    suggest fields that actually will be widely used.

:Defense: If an item is inestimable don't bother with specifying a
    date field.  Using tools to display items with order and/or color
    by due date and/or priority, it is easier to make estimates.
    Having your roadmap be a dynamic reflection of your codetags makes
    you much more likely to keep the codetags accurate.

----

:Objection: Named variables for the field parameters in the ``<>``
    should be used instead of cryptic one-character prefixes.  I.e.,
    <MDE p:3> should rather be <author=MDE, priority=3>.

:Defense: It is just too much typing/verbosity to spell out fields.  I
    argue that ``p:3 i:2`` is as readable as ``priority=3,
    iteration=2`` and is much more likely to be typed and remembered
    (see bullet C in Philosophy_).  In this case practicality beats
    purity.  There are not many fields to keep track of so one letter
    prefixes are suitable.

----

:Objection: Synonyms should be deprecated since it is better to have a
    single way to spell something.

:Defense: Many programmers prefer short mnemonic names, especially in
    comments.  This is why short mnemonics were chosen as the primary
    names.  However, others feel that an explicit spelling is less
    confusing and less prone to error.  There will always be two camps
    on this subject.  Thus synonyms (and complete, full spellings)
    should remain supported.

----

:Objection: It is cruel to use [for mnemonics] opaque acronyms and
    abbreviations which drop vowels; it's hard to figure these things
    out.  On that basis I hate: MLSTN RFCTR RFE FEETCH, NYI, FR, FTRQ,
    FTR WKRD RVDBY

:Defense: Mnemonics are preferred since they are pretty easy to
    remember and take up less space.  If programmers didn't like
    dropping vowels we would be able to fit very little code on a
    line.  The space is important for those who write comments that
    often fit on a single line.  But when using a canon everywhere it
    is much less likely to get something to fit on a line.

----

:Objection: It takes too long to type the fields.

:Defense: Then don't use (most or any of) them, especially if you're
    the only programmer.  Terminating a codetag with ``<>`` is a small
    chore, and in doing so you enable the use of the proposed tools.
    Editor auto-completion of codetags is also useful: You can
    program your editor to stamp a template (e.g. ``# FIXME . <MDE
    {date}>``) with just a keystroke or two.

----

:Objection: *WorkWeek* is an obscure and uncommon time unit.

:Defense: That's true but it is a highly suitable unit of granularity
    for estimation/targeting purposes, and it is very compact.  The
    `ISO 8601`_ is widely understood but allows you to only specify
    either a specific day (restrictive) or month (broad).

----

:Objection: I aesthetically dislike for the comment to be terminated
    with <> in the empty field case.

:Defense: It is necessary to have a terminator since codetags may be
    followed by non-codetag comments.  Or codetags could be limited to
    a single line, but that's prohibitive.  I can't think of any
    single-character terminator that is appropriate and significantly
    better than <>.  Maybe ``@`` could be a terminator, but then most
    codetags will have an unnecessary @.

----

:Objection: I can't use codetags when writing HTML, or less
    specifically, XML.  Maybe ``@fields@`` would be better than
    ``<fields>`` as the delimiters.

:Defense: Maybe you're right, but ``<>`` looks nicer whenever
    applicable.  XML/SGML could use ``@`` while more common
    programming languages stick to ``<>``.


References
==========

Some other tools have approached defining/exploiting codetags.
See http://tracos.org/codetag/wiki/Links.

.. _wiki: http://tracos.org/codetag/wiki/Pep
.. _ISO 8601: http://en.wikipedia.org/wiki/ISO_8601
.. _c2: http://c2.com/cgi/wiki?FixmeComment

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   End:
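
To make the General Syntax and Fields sections concrete, here is a rough Python sketch of how a tool might recognize codetags as the PEP describes them: a mnemonic, a colon, commentary, and a trailing ``<...>`` holding optional fields. It is only an illustration, not the Codetag Project's implementation. The names ``find_codetags`` and ``parse_fields`` and the ``who``/``date`` keys are made up for this example, and an unprefixed field is guessed to be a date or week purely from its shape.

```python
import re

# A codetag starts on a comment line: mnemonic, colon, commentary.
START = re.compile(r"^\s*#\s*(?P<mnemonic>[A-Z]+|\?\?\?|!!!):\s*(?P<text>.*)$")
# It ends with <...> holding zero or more space-separated fields.
END = re.compile(r"^(?P<text>.*?)\s*<(?P<fields>[^<>]*)>\s*$")
DATE = re.compile(r"^\d{4}(-\d{1,2}(-\d{1,2})?)?$")  # YYYY[-MM[-DD]]
WEEK = re.compile(r"^\d{1,2}(\.\d)?w$")               # WW[.D]w


def parse_fields(raw):
    """Split the text between < and > into a dict (hypothetical layout)."""
    fields = {}
    for token in raw.split():
        if len(token) > 2 and token[1] == ":":
            fields[token[0]] = token[2:]       # prefixed field, e.g. p:2
        elif DATE.match(token) or WEEK.match(token):
            fields["date"] = token             # origination date or week
        else:
            fields["who"] = token.split(",")   # initials or usernames
    return fields


def find_codetags(source):
    """Return one dict per terminated codetag found in the source text."""
    tags = []
    mnemonic, parts = None, []
    for line in source.splitlines():
        stripped = line.strip()
        if mnemonic is None:
            m = START.match(line)
            if not m:
                continue
            mnemonic, parts = m.group("mnemonic"), []
            body = m.group("text")
        elif stripped.startswith("#"):
            body = stripped[1:].strip()        # continuation comment line
        else:
            mnemonic = None                    # code or blank line before <...>
            continue
        end = END.match(body)
        if end is None:
            parts.append(body)
            continue
        parts.append(end.group("text"))
        tags.append({
            "mnemonic": mnemonic,
            "text": " ".join(p for p in parts if p),
            "fields": parse_fields(end.group("fields")),
        })
        mnemonic = None
    return tags
```

Run over the PEP's own examples, this picks up the ``FIXME`` and ``BUG`` tags and skips the "how not to" example, since that one shares a line with code and lacks the colon. A real tool would also need the synonym table and a policy for unterminated tags, which this sketch silently drops.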

Micah Elliott <mde@micah.elliott.name> wrote:
``FIXME (XXX, DEBUG, BROKEN, REFACTOR, REFACT, RFCTR, OOPS, SMELL, NEEDSWORK, INSPECT)`` *Fix me*: Areas of problematic or ugly code needing refactoring or cleanup.
I think the standard should not have codetags that are synonyms. This is Python and there should be only one way to do it. One problem with synonyms is that they make it harder to search using tools like grep.

  Neil

On 9/26/05, Neil Schemenauer <nas@arctrix.com> wrote:
Micah Elliott <mde@micah.elliott.name> wrote:
``FIXME (XXX, DEBUG, BROKEN, REFACTOR, REFACT, RFCTR, OOPS, SMELL, NEEDSWORK, INSPECT)`` *Fix me*: Areas of problematic or ugly code needing refactoring or cleanup.
I think the standard should not have codetags that are synonyms. This is Python and there should be only one way to do it. One problem with synonyms is that they make it harder to search using tools like grep.
It has always been my choice to *only* use XXX. I hope there aren't any developers contributing to the Python core who use any others? I honestly don't see much of a point for distinguishing different types; these are for humans to read and review, and different tags just make it harder to grep.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum wrote:
It has always been my choice to *only* use XXX. I hope there aren't any developers contributing to the Python core who use any others?
[Python-2.4.1]$ grep FIXME */*.c */*.py | wc -l
12
[Python-2.4.1]$ grep TODO */*.c */*.py | wc -l
17
[Python-2.4.1]$ grep XXX */*.c */*.py | wc -l
525
I honestly don't see much of a point for distinguishing different types; these are for humans to read and review, and different tags just make it harder to grep.
I tend to use FIXME for smelly code, and a collection of TODO:s at the top of the file for things that someone should work on some day... (which explains some, but not all, of the non-XXX:es above...) </F>
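
The grep concern raised in this subthread can be softened by a tool that folds synonyms into one canonical mnemonic before counting. The following is a minimal Python sketch of that idea, not an existing utility: the ``SYNONYMS`` table copies only three rows from the PEP's mnemonic list, and ``count_tags`` is a name made up for this example. It matches whole upper-case words anywhere on a line, so it will over-count words such as DEBUG or DONE that appear outside comments.

```python
import re
from collections import Counter

# Three rows of the PEP's mnemonic table, for illustration only.
SYNONYMS = {
    "TODO": ["MILESTONE", "MLSTN", "DONE", "YAGNI", "TBD", "TOBEDONE"],
    "FIXME": ["XXX", "DEBUG", "BROKEN", "REFACTOR", "REFACT", "RFCTR",
              "OOPS", "SMELL", "NEEDSWORK", "INSPECT"],
    "BUG": ["BUGFIX"],
}

# Map every spelling, canonical or synonym, to its canonical mnemonic.
CANONICAL = {canon: canon for canon in SYNONYMS}
CANONICAL.update({syn: canon for canon, syns in SYNONYMS.items() for syn in syns})

# Longest spellings first so the alternation never stops at a shorter prefix.
PATTERN = re.compile(
    r"\b(" + "|".join(sorted(CANONICAL, key=len, reverse=True)) + r")\b"
)


def count_tags(text):
    """Count codetag spellings in text, grouped by canonical mnemonic."""
    counts = Counter()
    for line in text.splitlines():
        for spelling in PATTERN.findall(line):
            counts[CANONICAL[spelling]] += 1
    return counts
```

With such a table, a single query for FIXME would also surface the XXX comments, at the cost of maintaining the table itself.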

On Monday 26 September 2005 05:35 pm, Micah Elliott wrote:
Please read/comment/vote. This circulated as a pre-PEP proposal submitted to c.l.py on August 10, but has changed quite a bit since then. I'm reposting this since it is now "Open (under consideration)" at <http://www.python.org/peps/pep-0350.html>.
Overall, it looks good, but my objection would be:
:Objection: I aesthetically dislike for the comment to be terminated with <> in the empty field case.
:Defense: It is necessary to have a terminator since codetags may be followed by non-codetag comments. Or codetags could be limited to a single line, but that's prohibitive. I can't think of any single-character terminator that is appropriate and significantly better than <>. Maybe ``@`` could be a terminator, but then most codetags will have an unnecessary @.
The <> terminator is evil. People will hate that. If there are no fields, you should just be able to leave it off. This will have an additional advantage in that many will already have compliant codetags if you leave off this requirement.

You worry over the need to detect the end of the block, but wouldn't '\n\n' be a much more natural delimiter? I.e.:

# TODO: This is a multi-line todo tag.
# You see how I've gone to the next line.

# This, on the other hand is an unrelated comment. You can tell it's not
# related, because there is an intervening blank line. I think people
# do this naturally when writing comments (I know I do -- I'm fairly
# certain I've seen other people do it).
#
# Whereas, as you can see, a mere paragraph break can be represented by
# a blank comment line.
#
# Whitespace formatting, after all, is VERY PYTHONIC. ;-)
# Delimiters on the other hand -- well, we prefer not to mention
# the sort of languages that use those, right? ;-)

Another possibility is to recognize lines like:

#---------------------------------------
#***************************************
#=======================================

I.e. a comment mark followed by a line composed of repeating characters as an alternative separator. These are also in pretty common use.

Cheers,
Terry

--
Terry Hancock ( hancock at anansispaceworks.com )
Anansi Spaceworks http://www.anansispaceworks.com
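
Terry's alternative can be sketched in a few lines of Python: a codetag runs until the first line that is not a comment (typically a blank line), or until a comment made only of a repeated separator character. This is just one reading of the proposal; the function name ``codetag_blocks`` and the choice of ``-``, ``*`` and ``=`` as separator characters are assumptions for the example.

```python
import re

# Start of a codetag: comment mark, upper-case mnemonic, colon, text.
START = re.compile(r"^\s*#\s*([A-Z]+):\s*(.*)$")
# A comment made only of one repeated character, e.g. #------ or #======
RULE = re.compile(r"^\s*#\s*([-*=])\1{2,}\s*$")


def codetag_blocks(source):
    """Return (mnemonic, text) pairs, ending each tag at a blank line,
    a code line, or a separator comment instead of a <> terminator."""
    result = []
    mnemonic, parts = None, []
    # The extra empty string acts as a final blank line that flushes
    # a codetag sitting at the very end of the file.
    for line in source.splitlines() + [""]:
        stripped = line.strip()
        if mnemonic is not None:
            if stripped.startswith("#") and not RULE.match(line):
                parts.append(stripped[1:].strip())  # still inside the tag
                continue
            result.append((mnemonic, " ".join(p for p in parts if p)))
            mnemonic, parts = None, []
        m = START.match(line)
        if m:
            mnemonic, parts = m.group(1), [m.group(2)]
    return result
```

On the example above this yields a single TODO entry and leaves the unrelated comment alone, because "This," is not an upper-case mnemonic followed by a colon. The cost is that a trailing remark written directly under a codetag, without a blank line, gets absorbed into it.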

Terry Hancock wrote:
On Monday 26 September 2005 05:35 pm, Micah Elliott wrote:
Please read/comment/vote. This circulated as a pre-PEP proposal submitted to c.l.py on August 10, but has changed quite a bit since then. I'm reposting this since it is now "Open (under consideration)" at <http://www.python.org/peps/pep-0350.html>.
Overall, it looks good, but my objection would be:
:Objection: I aesthetically dislike for the comment to be terminated with <> in the empty field case.
:Defense: It is necessary to have a terminator since codetags may be followed by non-codetag comments. Or codetags could be limited to a single line, but that's prohibitive. I can't think of any single-character terminator that is appropriate and significantly better than <>. Maybe ``@`` could be a terminator, but then most codetags will have an unnecessary @.
The <> terminator is evil. People will hate that. If there are no fields, you should just be able to leave it off. This will have an additional advantage in that many will already have compliant codetags if you leave off this requirement.
You worry over the need to detect the end of the block, but wouldn't '\n\n' be a much more natural delimiter? I.e.:
# TODO: This is a multi-line todo tag.
# You see how I've gone to the next line.

# This, on the other hand is an unrelated comment. You can tell it's not
# related, because there is an intervening blank line. I think people
# do this naturally when writing comments (I know I do -- I'm fairly
# certain I've seen other people do it).
#
# Whereas, as you can see, a mere paragraph break can be represented by
# a blank comment line.
#
# Whitespace formatting, after all, is VERY PYTHONIC. ;-)
# Delimiters on the other hand -- well, we prefer not to mention
# the sort of languages that use those, right? ;-)
+1
Another possibility is to recognize lines like:
#---------------------------------------
#***************************************
#=======================================
I.e. a comment mark followed by a line composed of repeating characters as an alternative separator. These are also in pretty common use.
-0

regards
 Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC www.holdenweb.com
PyCon TX 2006 www.pycon.org

At 03:35 PM 9/26/2005 -0700, Micah Elliott wrote:
Please read/comment/vote. This circulated as a pre-PEP proposal submitted to c.l.py on August 10, but has changed quite a bit since then. I'm reposting this since it is now "Open (under consideration)" at <http://www.python.org/peps/pep-0350.html>.
This seems a little weird to me. On the one hand, seems like a cool idea if you aren't using Eclipse or another IDE that tracks this stuff, but still need some kind of tracking system. But, if that is the case, the notation seems a little bit overkill, especially with respect to tracking who's responsible - i.e., just you.

If you have a team that can agree to use the tools, I suppose it might be useful, but then I wonder, why not use something like Trac?

Finally, I think there should be something besides just a comment to distinguish these things; like starting with a symbol (e.g. # !FIXME), so that documentation extraction tools can distinguish code tags from other documentation that just happens to start with a CAPITALIZED word.

Overall, I'm kind of -0.5. It seems like a spec in search of an application. The Motivation is sorely lacking - it reads like, "hey, it's optional and you can do stuff", where the stuff you can do is deferred to a later section, and is mostly stuff that could easily be done in other ways. For example, FIT-style acceptance test documents, or Python doctest files go a long way towards documenting stories in association with tests, and they don't require you to cram things into comments. (And because they're executable tests, they get kept up-to-date.) Tracking bugfixes with code history is handled nicely by tools like Trac. There are lots of Python documentation tools already. And so on. Really, it reads to me like you came up with the features to sell the format, instead of designing the format to implement specific features.

My suggestion: implement some tools, use them for a while, and come back with more focused use cases to show why only this format can work, and why the Python core developers should therefore use it. I'm not saying that you can't have an informational PEP unless it should be used in the stdlib, mind you. Just pointing out that if you can't convince the core developers it's useful, I'm thinking you'll have a hard time convincing the community at large to actually use it. You need to actually have a better mousetrap to present before you ask people to move their cheese. :)

"Phillip J. Eby" <pje@telecommunity.com> wrote:
At 03:35 PM 9/26/2005 -0700, Micah Elliott wrote:
Please read/comment/vote. This circulated as a pre-PEP proposal submitted to c.l.py on August 10, but has changed quite a bit since then. I'm reposting this since it is now "Open (under consideration)" at <http://www.python.org/peps/pep-0350.html>.
This seems a little weird to me. On the one hand, seems like a cool idea if you aren't using Eclipse or another IDE that tracks this stuff, but still need some kind of tracking system. But, if that is the case, the notation seems a little bit overkill, especially with respect to tracking who's responsible - i.e., just you.
There are various Python editors which have had support for a similar style of tags for quite a while. Some allow #<anything>: <comment>, others allow # <alphanumeric and spaces> : <comment>, even others allow more or less. Some even count exclamation points as an indicator of severity. Personally, though I use tags in some of the code I write, and though the editor I use (and wrote) supports tags, I'm of the opinion that an unofficial spec is sufficient. See koders.com and search for 'fixme' to see some common variants. - Josiah
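The two editor conventions Josiah describes can be approximated with regular expressions. The sketch below is only an illustration: the pattern names, the `parse_tag` helper, and the choice to count every `!` on the line as severity are assumptions, not the behaviour of any particular editor.

```python
import re

# Assumed approximation of the "#<anything>: <comment>" style.
LOOSE = re.compile(r"^\s*#(?P<tag>[^:]+):\s*(?P<comment>.*)$")

# Assumed approximation of the "# <alphanumeric and spaces> : <comment>" style.
STRICT = re.compile(r"^\s*#\s*(?P<tag>[A-Za-z0-9 ]+?)\s*:\s*(?P<comment>.*)$")


def parse_tag(line, pattern=STRICT):
    """Return (tag, comment, severity) for a tagged comment line, else None.

    Severity is simply the number of exclamation points on the line,
    which is one possible reading of "count exclamation points".
    """
    match = pattern.match(line)
    if match is None:
        return None
    tag = match.group("tag").strip()
    comment = match.group("comment")
    return tag, comment, line.count("!")
```

Both patterns will also match ordinary comments that happen to contain a colon, which is the ambiguity Phillip raised earlier in the thread.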

Thanks to all who have read and/or provided feedback. There have been some great ideas and admonitions that hadn't crossed my mind. I'll paraphrase some of the responses for the sake of brevity; I don't mean to misquote anyone.

Tom> ISO 8601 includes a week notation.

That's great. Thanks for pointing it out; seems like it should be the preferred week format. I might also want to allow a simple W42 (say) since it's so much shorter, and I'll consider myself generally in trouble if I wrap around on the year for due items.

Terry> Shorter forms such as DO, FIX, DEF, DOC, etc. are better.

I like the short proposals, so I'll add, and possibly even canonize them. My proposed canons were based on popular use and also my taste. I had originally wanted to state that only the canonical forms would be supported and use of the synonyms should be deprecated. That would have simplified things a bit (human and tool parsing). But based on people's ideas about what is canonical, that would never have flown. Instead, it seems useful to just list everything that's ever been used as long as it *fits one of the categories*. And the canon is mostly arbitrary/unnecessary; we'd never settle that peacefully anyway. The main point is that official categorization enables construction of productivity tools.

Terry> IDEXXX isn't vim/emacs. Offer an x:"comment" field for a
Terry> completed item.

Bengt> Later a tool can strip this out to the devlog.txt or DONE
Bengt> file, when the tool sees an added progress line like
Bengt> # ---: woohoo, completed ;-) <WHO 2005-10-11 04:56:12>

I wish we could rely on everyone to have/use cron. These are both great ideas. I'd like to allow/have both.

Bengt> 7) A way of appending an incremental progress line to an existing code
Bengt> tag line, e.g.,
Bengt> # FIXME: This will take a while: rework foo and bar <WHO 2005-09-25>
Bengt> # ...: test_foo for new foo works! <WHO 2005-09-26 01:23:45-0700>
Bengt> # ...: vacation <WHO 2005-10-01 d:2005-10-08>

Status updates?
Nice!! Great syntax too.

Bengt> time, embedded in strings, scope, no DONE, same line as code...

Your pennies are gold! Thanks! Another thing that came to mind recently: As with docstrings, the first sentence of a multiline codetag should be a meaningful summary. So for multiline codetags I'll suggest that the first line be something that could show up in say a manpage or a BUGFIX file.

Terry> Terminator <> is evil. Whitespace is good.

Bruno> Or if the codetag is immediately followed by code it's
Bruno> terminated.

Yes, I'd actually forgotten that it is also not-equal! And I agree that \n\n (or code) is a good terminator. I had been in the practice of putting some TODOs together near the tops of my modules, but a white line between would probably look cleaner anyway.

Phillip> there should be something besides just a comment to
Phillip> distinguish these things; like starting with a symbol (e.g.
Phillip> # !FIXME), so that documentation extraction tools can
Phillip> distinguish code tags from other documentation that just
Phillip> happens to start with a CAPITALIZED word.

That might be necessary. But if the extraction tools are aware of all official codetags, then it becomes less important. It might even be useful for lint tools to comment when seeing a line that begins with say "# FOO:" but isn't a codetag. Most such uses probably fall under one of the proposed categories anyway.

pythondev> It has always been my choice to *only* use XXX. I hope there
pythondev> aren't any developers contributing to the Python core who
pythondev> use any others?

    $ csrcs=$(find ~/archive/Python-2.4.1 -name '*.c')
    $ for tag in '!!!' '\?\?\?' XXX WARN CAUTION \
        TBD FIXME TODO BUG HACK Note NOTE RFE IMPORTANT; do
        echo -n "$tag: "; egrep "\b$tag" $csrcs | wc -l
      done
    !!!: ~1
    \?\?\?: ~12 [most of these part of XXXs]
    XXX: 365
    WARN: ~4
    CAUTION: 16
    TBD: ~2
    FIXME: 12
    TODO: 12
    BUG: 0
    HACK: 0
    Note: ~306
    NOTE: ~9
    RFE: 0
    IMPORTANT: ~6 [some overlap with NOTEs]

I subtracted most of the false positives but I think the model is being implicitly used to a small degree already. It's just hard to know that they're in the code. I'm impressed there are so few in 365 KLOC. I also notice some WHO: initials, as well as Hmmm:, bad:, Reference:, Obscure:, NB:, Bah:, and others.

pythondev> I honestly don't see much of a point for
pythondev> distinguishing different types; these are for humans to read
pythondev> and review, and different tags just makes it harder to grep.

Yes, they are hard to grep! And so are XXXs if multi-line. You'd have to do something like "$EDITOR `grep -l XXX $csrcs`" to really grok them. That's why a different tool is needed for inspection. Even if the codetag paradigm is despised for a given project, something (pychecker/pylint) needs to do a proper scan to address/remove/alert them. I won't argue that the interpreter should adopt codetags, but it would at least benefit from lint recognition.

Phillip> You still need a tracking system.

Agreed, for most projects, and I think Trac is great. But some might want to use codetags as a way to track anything that is not a full-blown bug. And how many times have you seen small projects that say, "We don't have a bug tracker yet. Please just send bugs to <bugs@anothersmallproject.org>"?

Josiah> Some even count exclamation points as an indicator of severity.

Michael> I prefer TODO SMELL STINK STENCH VOMIT to indicate TODO priority.

These seem useful. But I personally prefer a single TODO with a numeric priority, and I feel that the p: field is the simplest way to get that (easier to remember numbers 0..3 than contaminations or !-counts).
I think the example you gave could be done with a "# FIXME: <p:2>..." and still be obvious, or even "# FIXME: <p:Stench>...", assuming you have mapped your bletcherosity level to numbers. This also assumes Terry's whitespace idea is used so the fields could show up at the front. Note that the PEP has separated TODO from FIXME semantics.

Josiah> an unofficial spec is sufficient. See koders.com and search
Josiah> for 'fixme' to see some common variants.

But that's the problem -- there are already a bunch of "unofficial" specs, which don't serve much purpose as such. It's a cool site. I spent some time browsing and I do see a lot of codetags in use (many thousands in Python code alone; I'm not sure if the number represented strings or files). But I'll argue that this just reinforces the need for an *official* spec/guideline, so that the tools can do something with them.

Paul> Such a PEP should not be approved unless there's
Paul> already an implementation (e.g. PyChecker patch)

Phillip> implement some tools, use them for a while, and come back
Phillip> with more focused use cases

Phillip> It seems like a spec in search of an application. The
Phillip> Motivation is sorely lacking

My two main motivations are avoiding duplication (for documentation) and organizing tasks. I'm presently using it on a smallish project (5 KLOC) to generate manpage sections (fed to help2man) for BUGS, GLOSSARY, and RFE. These should arguably be dumped to a BUGS/BUGFIX/ChangeLog file. I haven't yet figured out how to make Trac or SourceForge do a nice creation of such a file, though it's essential IMO to provide one with a source package. BUGS files are also non-standardized, though I've seen some pretty nice (yet wildly different) ones, and a tool could help here.

The other current use (for me) is as a grep replacement. The tools (just ctdoc right now) are limited (pre-alpha) but really help me address the right tasks in the right order. See <http://tracos.org/codetag/wiki/ScreenShots> for a little comparison to grepping. I do think that the health-o-meter is also valuable (I see that pylint already does a nice job of this).

I agree that proof of value is necessary. Without a spec though it will be hard to get people to know about a convention/toolset, so it's a bit of a chicken-egg problem -- I can't have a PEP until the tools are in use, but the tools won't be used until programmers have means/motivation to use them, a PEP.

But now that I have your feedback/ideas I (and maybe the lint folks) can do a better job of expanding flexible tools that can prove this paradigm useful (or useless). I will continue development on the tools and encourage anyone interested in using a standard set of codetags for documentation and tracking purposes to give them a try (and provide more feedback!) as they mature.

-- Micah Elliott
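The numeric `p:` field and the ISO 8601 week label discussed in the message above could be handled along these lines. This is a rough sketch, not the ctdoc implementation: the function names are hypothetical, the name-to-number mapping (and which end of 0..3 is most urgent) is an assumption, and fields are assumed to be space-separated inside angle brackets.

```python
import datetime
import re

# Assumed mapping of Michael's named levels onto the 0..3 range.
PRIORITY_NAMES = {"smell": 0, "stink": 1, "stench": 2, "vomit": 3}

# Any <...> group on the line is treated as a block of fields.
FIELD_BLOCK = re.compile(r"<([^<>]*)>")


def priority_of(codetag_line, default=None):
    """Return the numeric priority from a p: field, or `default` if absent."""
    for block in FIELD_BLOCK.findall(codetag_line):
        for field in block.split():
            if field.startswith("p:"):
                value = field[2:]
                if value.isdigit():
                    return int(value)
                return PRIORITY_NAMES.get(value.lower(), default)
    return default


def iso_week_label(day):
    """Format a date as an ISO 8601 week label such as '2005-W42'."""
    year, week, _weekday = day.isocalendar()
    return f"{year}-W{week:02d}"
```

A tool could then sort codetags by `priority_of(line)` and compare `iso_week_label(datetime.date.today())` against a due week. The short `W42` form would need a rule for choosing the year, which is left open here.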

At 09:10 AM 9/28/2005 -0700, Micah Elliott wrote:
I agree that proof of value is necessary. Without a spec though it will be hard to get people to know about a convention/toolset, so it's a bit of a chicken-egg problem -- I can't have a pep until the tools are in use, but the tools won't be used until programmers have means/motivation to use them, a pep.
My point about the lack of motivation was that there was little reason shown why this should be a PEP instead of either:
1. Documentation for a specific tool, or group of tools
2. A specific project's process documentation

Are you proposing that this format be used by the Python developers for Python itself? A process spec like this seems orthogonal to Python-the-language.

To put it another way, this seems like writing a PEP on how to do eXtreme Programming, or perhaps a PEP on how the blogging "trackback" protocol works. Certainly you might implement those things using Python, but the spec itself seems entirely orthogonal to Python. I don't really see why it's a PEP, as opposed to just a published spec on your own website, unless you intend for say, the Python stdlib to conform to it.

On 9/29/05, Phillip J. Eby <pje@telecommunity.com> wrote:
My point about the lack of motivation was that there was little reason shown why this should be a PEP instead of either:
1. Documentation for a specific tool, or group of tools
2. A specific project's process documentation
That's what I feel as well. I hadn't commented on the PEP as I had simply intended to ignore it totally in my own projects... Paul.

Micah Elliott <mde@tracos.org> wrote:
Josiah> an unofficial spec is sufficient. See koders.com and search Josiah> for 'fixme' to see some common variants.
But that's the problem -- there are already a bunch of "unofficial" specs, which don't serve much purpose as such. It's a cool site. I spent some time browsing and I do see a lot of codetags in use (many thousands in Python code alone; I'm not sure if the number represented strings or files). But I'll argue that this just reinforces the need for an *official* spec/guideline, so that the tools can do something with them.
Defining a spec for code tags doesn't mean that people will start using them. Why? Because it is a documentation spec. From my experience, documentation specs are only adhered to by the organizations (companies, groups, etc.) which the code is produced by and for, and they generally define the code specs for their organization.

Further, even if it becomes a spec, it doesn't guarantee implementation in Python editors (which is what you are shooting for). Take a wander through current implementations of code tags in various editors to get a feel for what they support. I've read various posts about what code tags could support, but not what editors which implement code tags and/or its variants actually currently support; which is a better indication of what people want than an informal survey via email of python-dev, python-list, and/or the PEP submission process.

- Josiah

Josiah Carlson wrote:
Further, even if it becomes a spec, it doesn't guarantee implementation in Python editors (which is what you are shooting for). Take a wander through current implementations of code tags in various editors to get a feel for what they support. I've read various posts about what code tags could support, but not what editors which implement code tags and/or its variants actually currently support; which is a better indication of what people want than an informal survey via email of python-dev, python-list, and/or the PEP submission process.
An approach to this area that would make sense to me is:
1. Defer PEP 350
2. Publish a simple Python module for finding and processing code tags in a configurable fashion
3. Include a default configuration in the module that provides the behaviour described in PEP 350
4. After this hypothetical code tag processing module has been out in the wild for a while, re-open PEP 350 with an eye to including the module in the standard library

The idea is that it should be possible to tailor the processing module in order to textually scan a codebase (possibly C or C++ rather than Python) in accordance with a project-specific system of code tagging, rather than requiring that the project necessarily use the default style included in the processing module (Although using a system other than the default one may result in reduced functionality, naturally).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://boredomandlaziness.blogspot.com
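Steps 2 and 3 of Nick's suggestion could look roughly like the sketch below: a small finder whose comment prefix and tag vocabulary are configurable, with a default that is swapped out for a C-style project. All names (`TagConfig`, `Found`, `find_tags`) are hypothetical, and the default tag list is only a subset of tags mentioned in this thread, not the full PEP 350 vocabulary.

```python
import re
from dataclasses import dataclass
from typing import Iterator, NamedTuple, Optional, Tuple


class Found(NamedTuple):
    lineno: int
    tag: str
    text: str


@dataclass(frozen=True)
class TagConfig:
    # Illustrative defaults only; a real default would follow PEP 350.
    comment_prefix: str = "#"
    tags: Tuple[str, ...] = ("TODO", "FIXME", "XXX", "BUG", "HACK", "NOTE", "RFE")

    def pattern(self) -> "re.Pattern[str]":
        names = "|".join(re.escape(tag) for tag in self.tags)
        prefix = re.escape(self.comment_prefix)
        # Prefix, optional spaces, a known tag, optional colon, then the text.
        return re.compile(rf"{prefix}\s*(?P<tag>{names})\b:?\s*(?P<text>.*)")


def find_tags(source: str, config: Optional[TagConfig] = None) -> Iterator[Found]:
    """Yield one Found per line that carries a configured tag."""
    pattern = (config or TagConfig()).pattern()
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = pattern.search(line)
        if match:
            yield Found(lineno, match.group("tag"), match.group("text").strip())
```

For a C or C++ tree, a project might pass `TagConfig(comment_prefix="//", tags=("FIXME",))`. The scan is purely textual, so a tag inside a string literal would also be reported.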

On 9/30/05, Nick Coghlan <ncoghlan@gmail.com> wrote:
An approach to this area that would make sense to me is:
1. Defer PEP 350
2. Publish a simple Python module for finding and processing code tags in a configurable fashion
3. Include a default configuration in the module that provides the behaviour described in PEP 350
4. After this hypothetical code tag processing module has been out in the wild for a while, re-open PEP 350 with an eye to including the module in the standard library
The idea is that it should be possible to tailor the processing module in order to textually scan a codebase (possibly C or C++ rather than Python) in accordance with a project-specific system of code tagging, rather than requiring that the project necessarily use the default style included in the processing module (Although using a system other than the default one may result in reduced functionality, naturally).
Maybe I'm just an old fart, but this all seems way over-engineered. Even for projects the size of Python, a simple grep+find is sufficient. -- --Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum wrote:
On 9/30/05, Nick Coghlan <ncoghlan@gmail.com> wrote:
An approach to this area that would make sense to me is:
1. Defer PEP 350
2. Publish a simple Python module for finding and processing code tags in a configurable fashion
3. Include a default configuration in the module that provides the behaviour described in PEP 350
4. After this hypothetical code tag processing module has been out in the wild for a while, re-open PEP 350 with an eye to including the module in the standard library
The idea is that it should be possible to tailor the processing module in order to textually scan a codebase (possibly C or C++ rather than Python) in accordance with a project-specific system of code tagging, rather than requiring that the project necessarily use the default style included in the processing module (Although using a system other than the default one may result in reduced functionality, naturally).
Maybe I'm just an old fart, but this all seems way over-engineered.
Even for projects the size of Python, a simple grep+find is sufficient.
I expect many people would agree with you, but Micah was interested enough in the area to write a PEP about it. The above was just a suggestion for a different way of looking at the problem, so that writing a PEP would actually make sense. At the moment, if the tags used are project-specific, and the method used to find them is a simple grep+find, then I don't see a reason for the idea to be a *Python* Enhancement Proposal.

Further, I see some interesting possibilities for automation if such a library exists. For example, a cron job that scans the checked in sources, and automatically converts new TODO's to RFE's in the project tracker, and adds a tracker cross-link into the source code comment. The job could similarly create bug reports for FIXME's. If the project tracker was one that supported URL links, and the project had a URL view of the source tree, then the cross-links between the code tag and the tracker could be actual URL references to each other.

However, the starting point for exploring any such ideas would be a library that made it easier to work with code tags.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://boredomandlaziness.blogspot.com
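The cross-linking job Nick describes might be sketched as follows. Nothing here talks to a real tracker: `file_rfe` is a caller-supplied callable standing in for whatever API a project's tracker offers, and the `[RFE #n]` marker format is an assumption made for illustration.

```python
import re

# A whole-line "# TODO: ..." comment; head keeps the original indentation.
TODO_LINE = re.compile(r"^(?P<head>\s*#\s*TODO:\s*)(?P<text>.*?)\s*$")

# Assumed marker showing that a TODO already has a tracker entry.
LINKED = re.compile(r"\[RFE #\d+\]")


def link_new_todos(source, file_rfe):
    """Return `source` with a tracker reference appended to each new TODO.

    `file_rfe(text)` must create a tracker entry and return its number.
    Lines that already carry a marker are left alone, so the function can
    be run repeatedly (for example from cron) without filing duplicates.
    """
    out = []
    for line in source.splitlines():
        match = TODO_LINE.match(line)
        if match and not LINKED.search(line):
            ticket = file_rfe(match.group("text"))
            line = f"{match.group('head')}{match.group('text')} [RFE #{ticket}]"
        out.append(line)
    return "\n".join(out)
```

A FIXME-to-bug-report variant would differ only in the pattern and the callable. Writing the result back to version control is deliberately left out.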

On 9/30/05, Nick Coghlan <ncoghlan@gmail.com> wrote:
Further, I see some interesting possibilities for automation if such a library exists. For example, a cron job that scans the checked in sources, and automatically converts new TODO's to RFE's in the project tracker, and adds a tracker cross-link into the source code comment. The job could similarly create bug reports for FIXME's. If the project tracker was one that supported URL links, and the project had a URL view of the source tree, then the cross-links between the code tag and the tracker could be actual URL references to each other.
With all respect for the OP, that's exactly the kind of enthusiastic over-engineering that I'm afraid the PEP will encourage. I seriously doubt that any of that work will contribute towards a project's success (compared to simply having a convention of putting XXX in the code). -- --Guido van Rossum (home page: http://www.python.org/~guido/)
participants (11)
- Fredrik Lundh
- Guido van Rossum
- Josiah Carlson
- Micah Elliott
- Micah Elliott
- Neil Schemenauer
- Nick Coghlan
- Paul Moore
- Phillip J. Eby
- Steve Holden
- Terry Hancock