Pre-PEP Proposal: Codetags
mde at micah.elliott.name
Thu Aug 11 08:05:41 CEST 2005
I also have this living as a wiki <http://tracos.org/codetag/wiki/Pep>
if people would like to add comments there. I might try to capture
feedback from this group there anyway. First try at a PEP -- thanks for any
Please add **NOTE:** comments to the bottom of this wiki document
Codetag PEP (*or* Tao of Codetagging)
Author: Micah Elliott <mde at tracos.org>
This informational PEP aims to provide guidelines for consistent use
of Codetags, which would enable the construction of standard utilities
to take advantage of the Codetag information, as well as making Python
code more uniform across projects. Codetags are also a very
lightweight programming micro-paradigm and become useful for project
management, documentation, change tracking, and project health
monitoring. This is submitted as a PEP because I feel its ideas are
Pythonic, although the concepts are not unique to Python programming.
Herein are the definition of a Codetag, a philosophy rant, a
motivation for standardized conventions, a specification, a toolset
description, and possible objections to the Codetag project/paradigm.
What's a Codetag?
Programmers widely use ad-hoc code comment markup conventions to serve
as reminders of sections of code that need closer inspection or
review. Examples of markup include ``FIXME``, ``TODO``, ``XXX``,
``BUG``, but there are many more in wide use in existing software. Such
markup will be henceforth referred to as a *Codetag*. These Codetags
may show up in application code, unit tests, scripts, general
documentation, or wherever suitable.
NOTE: **I'm not certain Philosophy_ belongs in the PEP, but it
somewhat explains the usefulness of Codetags** <mde>
If you subscribe to most of these values, then Codetags will likely be
useful for you.
1. As much information as possible should be contained **inside the
source code** (application code or unit tests). This along with use
of Codetags impedes duplication. Most documentation can be
generated from that source code, eg, by using help2man, man2html,
docutils, epydoc/pydoc, ctdocgen, etc.
2. Information should be almost **never duplicated** -- it should be
recorded in a single original format and all other locations should
be automatically generated from the original, or simply be
referenced. This is the *SPOT* rule.
3. Documentation that gets into customers' hands should be
**auto-generated** from single sources into whatever output formats.
People want documentation in many forms. It is thus important to
have a documentation system that can generate all of these.
4. Whatever information is subject to (and suited for) user
feedback/input should be contained in a **wiki** (or maybe usenet or
maillists). Eg, FAQ, RFC, PEP.
5. There should not be a dedicated, disjoint **documentation team**
for any non-huge project. The developers writing the code know the
code best, and should be the ones to describe it.
6. **Plain text** (with non-invasive markup) is the best form of writing
anything. All other formats are to be generated from the plain text.
7. **Revision control** should be used for almost everything. And
modifications should be checked in at least daily.
**Various productivity tools can be built around Codetags.**
See `Toolset Possibilities`_.
Historically, a subset of these Codetags has been used informally in
the majority of source code in existence, whether Python or some other
language. Tags have been used in an inconsistent manner with
different spellings, semantics, format, and placement. Eg, some
programmers might include datestamps and/or user identifiers, limit to
a single line or not, spell the Codetag differently than others, etc.
**Encourages adherence to SPOT/DRY principle.**
Eg, generating a roadmap dynamically from Codetags instead of keeping
TODOs in sync with a separate roadmap document.
**Easy to remember.**
All Codetags must be concise, intuitive, and semantically
non-overlapping with others. Format is also simple.
**Use not required/imposed.**
If you don't use Codetags already, there's no obligation to start, and
no risk of affecting code (but see Objections_). A small subset can be
adopted and the Tools_ will still be useful (a few are already
implicitly adopted anyway). A Codetag is also very easy to identify and
remove once it is no longer deemed useful. It is then effectively
*completed*, and its removal is recorded by revision control simply by
checking in.
**Gives a global view of code.**
Use tools to generate documentation and reports.
**A logical location for capturing CRCs/Stories/Requirements.**
The XP community often does not electronically capture Stories, but
Codetags seem like a good place to locate them.
**Extremely lightweight process.**
Creating tickets in a tracking system for every thought degrades
development velocity. Even if a ticketing system is employed, Codetags
are useful for simply containing links to those tickets.
This shows a simple Codetag as commonly found in sources everywhere
(with the addition of a trailing ``<>``).
# FIXME: Seems like this loop should be finite. <>
while True: ...
This contrived example demonstrates more common use of Codetagging.
It uses some of the available fields to specify the owners (a pair of
developers with initials *mde* and *cle*), the Work Week of expected
completion (*w14*), and the priority of the item (*p2*).
# FIXME: Seems like this loop should be finite. <mde,cle w14 p2>
This describes the format: parsing layout, mnemonic names, fields,
Each Codetag should be inside a comment, and can be any number of lines.
It should match the indentation of surrounding code. The end of the
Codetag is marked by a ``<...>``, which must not be split onto
There are multiple fields per Codetag, all of which are optional.
To be succinct, a Codetag is a mnemonic, a colon, a commentary, an
opening broket, a list of optional fields, and a closing broket. Ie,::
# MNEMONIC: Some (maybe multi-line) commentary. <field field ...>
.. FIXME: Add completion vs target date?? <mde>
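The layout above (mnemonic, colon, commentary, ``<fields>``) is regular enough that a small regular expression could recognize it. The following is only a rough sketch, not part of the proposal: the name ``parse_codetag``, the set of characters allowed in a mnemonic, and the choice to keep the ``<...>`` part on one line are all assumptions made for illustration.

```python
import re

# Hypothetical pattern for the layout described above:
# a mnemonic, a colon, some commentary, then <fields> closing the tag.
CODETAG_RE = re.compile(
    r"(?P<mnemonic>[A-Z?!]+):\s*"   # e.g. FIXME, TODO, ???
    r"(?P<commentary>.*?)\s*"       # commentary, possibly spanning lines
    r"<(?P<fields>[^<>\n]*)>",      # assumed: fields stay on a single line
    re.DOTALL,
)


def parse_codetag(comment_text):
    """Return (mnemonic, commentary, fields), or None if nothing matches."""
    # Drop the leading '#' of each comment line before matching.
    lines = [line.strip().lstrip("#").strip()
             for line in comment_text.splitlines()]
    match = CODETAG_RE.search("\n".join(lines))
    if match is None:
        return None
    # Collapse line breaks and repeated spaces inside the commentary.
    commentary = " ".join(match.group("commentary").split())
    return match.group("mnemonic"), commentary, match.group("fields").split()


print(parse_codetag(
    "# FIXME: Seems like this loop should be finite. <mde,cle w14 p2>"
))
```

An empty ``<>`` simply yields an empty field list, so the simple and the fielded examples shown earlier go through the same path.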
The Codetags of interest (``recommended mnemonic (synonym list)``, *canon*,
semantics, and **NOTEs**) are as follows.
Some of these are temporary (eg, ``FIXME``) while others are
persistent (eg, ``REQ``). Synonyms should probably be deprecated in
the interest of minimalism and consistency. I chose a mnemonic over a
synonym based on three criteria: descriptiveness, brevity, and common usage trends.
``TODO (TBD, MLSTN, DONE)``
*To Do*, An informal task/feature that is pending completion.
Relevant to roadmap.
NOTE: **DONE would really be a completed TODO item, but these
should probably be done through the revision control system.** <mde>
``FIXME (XXX, DEBUG, BROKEN, RFCTR, OOPS, SMELL)``
*Fix Me*, Problematic or ugly code. Needs refactoring or cleanup.
NOTE: **Choosing between FIXME and XXX is difficult. AFAICT XXX is
more common, but so much less descriptive. Furthermore, XXX is a
useful placeholder in a piece of code having a value that is
unknown. Sun says that XXX and FIXME are slightly different,
giving XXX higher severity.** <mde>
*Requirement*, Satisfaction of a specific, formal requirement.
``RFE (FEETCH, NYI, FR, FTRQ, FTR)``
*Request For Enhancement*, A roadmap item not yet implemented.
*Idea*, Possible ``RFE`` candidate, but less formal than ``RFE``.
``??? (QUEST, WTF, TBD, QSTN)``
*Question*, Misunderstood detail. Product of coincidental programming.
*Hack*, Temporary code to force inflexible functionality, or
simply a test change, or a workaround for a known problem.
*Portability*, Workaround specific to OS, Python version, etc.
*Bug*, Reported defect tracked in bug database.
*Note*, Implementation detail that stands out as non-intuitive. Or
a code reviewer found something that needs discussion or further
NOTE: **Maybe a useful metric where a high count of NOTEs indicates
a problem.** <mde>
*Frequently Asked Question*, Interesting area that requires
NOTE: **This is probably more appropriately documented in a wiki
where users can more easily contribute.** <mde>
*Glossary*, Item definition for project glossary.
*Status*, File-level statistical indicator of work needing done on this
*Reviewed By*, File-level indicator of programmer(s) who performed
recent code review.
*Reference*, Pointer to other code, web link, etc.
NOTE: **File-level Codetags might be better suited as properties in the
revision control system.** <mde>
All fields are optional. It should be possible for groups to
define/add their own, but the proposed standard fields are as follows:
Workweek target completion (estimation). Origination and
completion are freebies with revision control.
List of initials of owners of completion responsibility. There
should be no digits for initials.
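Because initials contain no digits, the field tokens from the earlier example (``mde,cle``, ``w14``, ``p2``) can be told apart by shape alone. Here is a minimal sketch of that idea. The ``w``/``p`` prefixes are inferred from that one example, and the function name and the ``other`` bucket for group-defined fields are assumptions, not part of the proposal.

```python
import re


def classify_fields(fields):
    """Sort raw field tokens into owners, work week, and priority."""
    info = {"owners": [], "week": None, "priority": None, "other": []}
    for token in fields:
        if re.fullmatch(r"w\d+", token):
            # Work week of expected completion, e.g. w14 (assumed prefix).
            info["week"] = int(token[1:])
        elif re.fullmatch(r"p\d+", token):
            # Priority of the item, e.g. p2 (assumed prefix).
            info["priority"] = int(token[1:])
        elif re.fullmatch(r"[A-Za-z]+(,[A-Za-z]+)*", token):
            # Comma-separated initials; letters only, no digits.
            info["owners"].extend(token.split(","))
        else:
            # Anything else is kept for group-defined fields.
            info["other"].append(token)
    return info


print(classify_fields(["mde,cle", "w14", "p2"]))
```

Checking the digit-bearing forms first is what makes a token such as ``w14`` unambiguous even though ``w`` alone would be a valid set of initials.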
Currently, programmers (and sometimes analysts) typically use *grep*
to generate a list of items corresponding to a single Codetag. However,
various hypothetical productivity tools could take advantage of a
consistent Codetag format. Some example tools follow.
NOTE: Codetag tools are mostly unimplemented (but I'm getting started!) <mde>
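As a rough illustration of what goes slightly beyond *grep*, the sketch below lists Codetags with their line numbers and tallies them per mnemonic, which is roughly the raw input a report or health metric would need. The function names and the default set of mnemonics are assumptions chosen for the example, not an existing tool.

```python
import re
from collections import Counter

# Assumed default: a handful of mnemonics mentioned in this document.
DEFAULT_MNEMONICS = ("TODO", "FIXME", "BUG", "NOTE")


def find_codetags(source, mnemonics=DEFAULT_MNEMONICS):
    """Yield (line_number, mnemonic) for each comment that opens a Codetag."""
    names = "|".join(re.escape(name) for name in mnemonics)
    pattern = re.compile(r"#\s*(" + names + r"):")
    for number, line in enumerate(source.splitlines(), start=1):
        match = pattern.search(line)
        if match:
            yield number, match.group(1)


def count_codetags(source, mnemonics=DEFAULT_MNEMONICS):
    """Return a Counter mapping each mnemonic to its number of occurrences."""
    return Counter(name for _, name in find_codetags(source, mnemonics))


sample = (
    "# FIXME: Seems like this loop should be finite. <>\n"
    "while True: ...\n"
    "# TODO: Add a unit test. <mde>\n"
)
print(list(find_codetags(sample)))
print(count_codetags(sample))
```

This matches on text only, so a ``#`` inside a string literal would be picked up as well. A more careful tool would probably lean on the ``tokenize`` module instead.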
Possible docs: glossary, roadmap, manpages
Track (with revision control system interface) when a ``BUG`` tag (or any Codetag)
originated/resolved in a code section
A project Health-O-Meter
Notify of invalid use of Codetags, and aid in porting to Codetag
An electronic means to replace XP notecards. In MVC terms, the
Codetag is the Model, and the Story Manager could be a graphical
Viewer/Controller to do visual rearrangement, prioritization,
assignment, and milestone management.
Any Text Editor
Used for changing, removing, adding, rearranging Codetags.
There are some tools already in existence that take advantage of a
smaller set of pseudo-Codetags (see References_)
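One small piece of a lint-style tool could be porting synonyms to their recommended mnemonics. The sketch below uses the ``FIXME`` and ``RFE`` synonym lists given earlier. The function name and the purely textual approach are assumptions, and this is not how any existing *ctlint* works.

```python
import re

# Synonym -> recommended mnemonic, copied from the FIXME and RFE lists above.
SYNONYMS = {
    "XXX": "FIXME", "DEBUG": "FIXME", "BROKEN": "FIXME",
    "RFCTR": "FIXME", "OOPS": "FIXME", "SMELL": "FIXME",
    "FEETCH": "RFE", "NYI": "RFE", "FR": "RFE", "FTRQ": "RFE", "FTR": "RFE",
}

# Longest names first, and require a following colon, so that FTR does not
# match inside FTRQ and plain words in comments are left alone.
_SYNONYM_RE = re.compile(
    r"(#\s*)(" + "|".join(sorted(SYNONYMS, key=len, reverse=True)) + r")(?=:)"
)


def normalize_mnemonics(source):
    """Rewrite synonym mnemonics in comments to their recommended form."""
    return _SYNONYM_RE.sub(
        lambda m: m.group(1) + SYNONYMS[m.group(2)], source
    )


print(normalize_mnemonics("# XXX: Check this branch. <>\n"))
```

``TBD`` is deliberately left out of the table because it appears in more than one synonym list, so a real tool would need a person to decide.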
**Objection**: Extreme Programming argues that such Codetags should not ever
exist in code since the code is the documentation.
**Defense**: Maybe put the Codetags in the unit test files instead.
Besides, it's tough to generate documentation from uncommented source
**Objection**: Too much existing code has not followed proposed
**Defense**: [Simple] utilities (*ctlint*) could convert existing
**Objection**: Causes duplication with tracking system.
**Defense**: Not really -- If an item exists in the tracker, a simple
ticket number as the Codetag commentary is sufficient. Maybe a
duplicated title would be acceptable. Furthermore, it's too
burdensome to have a ticket filed for every item that pops into a
developer's mind on-the-go.
**Objection**: Codetags are ugly and clutter code.
**Defense**: That is a good point. But I'd still rather have such info
in a single place (the source code) than various other documents,
likely getting duplicated or forgotten about.
**Objection**: Codetags (and all comments) get out of date.
**Defense**: Not so much if other sources (externally visible
documentation) depend on them being accurate.
Some other tools have approached defining/exploiting Codetags.
Please add comments below the following line. Or feel free to comment
inline in above sections with **NOTE:** Codetags. Objections_ might be
a popular area for comments. :-)