[Doc-SIG] field syntax (was re: lists & blank lines)

Edward D. Loper edloper@gradient.cis.upenn.edu
Tue, 17 Apr 2001 12:27:28 EDT


> I think you need to define your concept of "fields" better for us
> here on the SIG (note: assume no previous knowledge of
> JavaDoc). Give a detailed example. Why isn't position significant?
> What about field order? Sounds like you're describing a
> dictionary-like structure associated with each docstring. Can a
> field be used more than once, or must each field be unique per
> docstring?

Sorry, you're right.  If you have time, you can look at the JavaDoc
home page:

  <http://java.sun.com/j2se/javadoc/index.html>

or at a sample of the output of JavaDoc:

  <http://java.sun.com/products/jdk/1.2/docs/api/index.html>

From the JavaDoc page:

    A doc comment is made up of two parts -- a description followed by
    zero or more tags, with a blank line (containing a single asterisk
    "*") between these two sections:

    /** 
     * This is the description part of a doc comment
     *
     * @tag    Comment for the tag
     */

The first part describes the object being documented; the second part
essentially sets up a multi-map from keys to formatted doc strings.

  - Certain tags are parameterized, such as "@param", which takes a
    parameter name and gives a description of it.
  - Some tags can be repeated (e.g., "@see"); others can't (e.g.,
    you can't have two "@param" tags for the same parameter).
  - It is assumed that when these "fields" (= tag + value) are output,
    they will be put in special sections (see the sample output
    of JavaDoc).
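
As a rough sketch of that multi-map idea in Python (the names
``FieldMap``, ``REPEATABLE``, ``add`` and ``get`` are hypothetical,
chosen only for illustration, and the set of repeatable tags is an
assumption), the fields could be stored along these lines:

```python
from collections import defaultdict

# Assumption: tags that may legitimately appear more than once.
REPEATABLE = {"see", "author"}


class FieldMap:
    """Multi-map from (tag, argument) keys to lists of descriptions."""

    def __init__(self):
        self._fields = defaultdict(list)

    def add(self, tag, arg, description):
        """Record one field; reject a repeat of a non-repeatable tag."""
        key = (tag, arg)
        if tag not in REPEATABLE and key in self._fields:
            raise ValueError("duplicate field: @%s %s" % (tag, arg or ""))
        self._fields[key].append(description)

    def get(self, tag, arg=None):
        """Return a copy of the descriptions stored for (tag, arg)."""
        return list(self._fields.get((tag, arg), []))
```

With this sketch, two "@see" fields accumulate in one list, while a
second "@param" for the same parameter raises ``ValueError``.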

An example of a formatted doc string with a field (from the formatted
doc string parser I've been writing) is::

    def _tokenize_literal(lines, start, block_indent, tokens, warnings):
        """
        Construct a C{Token} containing the literal block starting at
        C{lines[start]}, and append it to C{tokens}.  C{block_indent}
        should be the indentation of the literal block.  Any warnings
        generated while tokenizing the literal block will be appended to
        C{warnings}.

        @param lines: The list of lines to be tokenized.
        @param start: The index into C{lines} of the first line of the
            literal block to be tokenized.
        @param block_indent: The indentation of C{lines[start]}.  This is
            the indentation of the literal block.
        @param warnings: A list of the warnings generated by parsing.  
            Any new warnings generated while tokenizing this literal
            block will be appended to this list.
        @return: The line number of the first line following the literal
            block.

        @type lines: C{list} of C{string}
        @type start: C{int}
        @type block_indent: C{int}
        @type warnings: C{list} of C{ParseError}
        @rtype: C{int}
        """

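To make the structure concrete, here is a minimal sketch of how the
fields in a docstring like the one above could be pulled apart.  It is
only an illustration, not the parser mentioned above: the function
name ``split_fields`` and the regular expression are assumptions, and
it handles only the "@tag arg: text" and "@tag: text" forms shown
in the example.

```python
import re

# Assumed field form: "@tag: text" or "@tag arg: text" at the start of a line.
FIELD_RE = re.compile(r"^@(\w+)(?:[ \t]+(\w+))?:[ \t]*(.*)$")


def split_fields(docstring):
    """Return (description, fields) for a docstring.

    fields is a list of (tag, arg, text) tuples in source order;
    arg is None for tags such as @return that take no argument.
    """
    description = []
    fields = []
    for raw in docstring.strip().splitlines():
        line = raw.strip()
        match = FIELD_RE.match(line)
        if match:
            tag, arg, text = match.groups()
            fields.append([tag, arg, text])
        elif fields and line:
            # Continuation line: it belongs to the most recent field.
            fields[-1][2] = (fields[-1][2] + " " + line).strip()
        elif not fields:
            # Everything before the first field is the description.
            description.append(line)
    return "\n".join(description).strip(), [tuple(f) for f in fields]
```

The result keeps the description separate from the fields, and each
field carries its tag, its optional argument, and its text with
continuation lines joined.
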
It doesn't matter to me what syntax we use.  Another alternative
that's been suggested is to do something like::

        ...
        Arguments:
            lines -- The list of lines to be tokenized.
            start -- The index into C{lines} of the first line of the
                literal block to be tokenized.
            block_indent -- The indentation of C{lines[start]}.  This
                is the indentation of the literal block.
            warnings -- A list of the warnings generated by parsing.
                Any new warnings generated while tokenizing this
                literal block will be appended to this list.
            return -- The line number of the first line following the
                literal block.
        ...

But semantically, the idea is to associate a description with each of
a number of pre-defined entities, such as the parameters of a method.
Tags defined by JavaDoc are:

    @see          (a single see-also link; can repeat)
    @author       (an author; can repeat)
    @version      (the object's version)
    @param        (a function/method param; takes an argument)
    @return       (the return value of a function/method)
    @exception    (a description of an exception that a function/
                   method can raise; takes an argument (the exception))
    @since        (minimum version needed to use it)
    @deprecated   (object is deprecated; description of why)

I think there are a few more, but that's probably a representative
sample.

I find that the output you can produce with fields is easier to
read and use than the output you can produce without them (see the
HTML and LaTeX versions of the Java library API).

Of course, we don't really *need* them.  In my mind, the only
necessary features for a formatted docstring language are:
  - paragraphs
  - literal blocks
  - maybe doctest blocks

But I'd like to see them included.  Of course, you don't have to use
them if you don't want to.  But I think that most people will find
them useful if they try using them.

-Edward