[Doc-SIG] field syntax (was re: lists & blank lines)
Edward D. Loper
edloper@gradient.cis.upenn.edu
Tue, 17 Apr 2001 12:27:28 EDT
> I think you need to define your concept of "fields" better for us
> here on the SIG (note: assume no previous knowledge of
> JavaDoc). Give a detailed example. Why isn't position significant?
> What about field order? Sounds like you're describing a
> dictionary-like structure associated with each docstring. Can a
> field be used more than once, or must each field be unique per
> docstring?
Sorry, you're right. If you have time, you can look at the JavaDoc
home page:
<http://java.sun.com/j2se/javadoc/index.html>
or at a sample of the output of JavaDoc:
<http://java.sun.com/products/jdk/1.2/docs/api/index.html>
From the JavaDoc page:
A doc comment is made up of two parts -- a description followed by
zero or more tags, with a blank line (containing a single asterisk
"*") between these two sections:
    /**
     * This is the description part of a doc comment
     *
     * @tag Comment for the tag
     */
The first part describes the object being documented; the second part
essentially sets up a multi-map from keys to formatted doc strings.
- Certain tags are parameterized, such as "@param", which takes a
  parameter name and gives a description of that parameter.
- Some tags can be repeated (e.g., "@see"); others can't (e.g.,
  you can't have two "@param"s with the same parameter).
- It is assumed that when these "fields" (= tag + value) are output,
  they will be put in special sections (see the sample output
  of JavaDoc).
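To make the multi-map idea concrete, here is a rough sketch in Python. It is not the parser I've been writing; the name parse_fields, the regular expression, and the REPEATABLE set are made up for illustration, and it assumes the "@tag arg: text" spelling rather than JavaDoc's own. It only shows how a docstring might split into a description plus a mapping from (tag, argument) to a list of values:

```python
import re

# Assumed field syntax: "@tag: text" or "@tag arg: text" at the start of a line.
FIELD_RE = re.compile(r"^@(\w+)(?:[ \t]+(\w+))?:[ \t]*(.*)$")

# Illustrative choice: tags that may appear more than once.
REPEATABLE = {"see", "author"}


def parse_fields(docstring):
    """Split a docstring into (description, fields).

    fields maps (tag, argument) to a list of values, so a repeated
    tag such as @see keeps every value it was given.
    """
    description = []
    fields = {}
    current = None  # pieces of the field body being read, if any
    for raw in docstring.strip().splitlines():
        line = raw.strip()
        match = FIELD_RE.match(line)
        if match:
            tag, arg, body = match.groups()
            key = (tag, arg)
            if key in fields and tag not in REPEATABLE:
                raise ValueError("field given twice: @%s %s" % (tag, arg or ""))
            current = [body]
            fields.setdefault(key, []).append(current)
        elif current is not None:
            current.append(line)  # continuation line of the current field
        else:
            description.append(line)
    joined = {
        key: [" ".join(part for part in body if part) for body in bodies]
        for key, bodies in fields.items()
    }
    return " ".join(part for part in description if part), joined
```

With this sketch, two "@see" fields end up as two values under the key ("see", None), while a second "@param lines" raises an error.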
An example of a formatted doc string with a field (from the formatted
doc string parser I've been writing) is::
    def _tokenize_literal(lines, start, block_indent, tokens, warnings):
        """
        Construct a C{Token} containing the literal block starting at
        C{lines[start]}, and append it to C{tokens}.  C{block_indent}
        should be the indentation of the literal block.  Any warnings
        generated while tokenizing the literal block will be appended to
        C{warnings}.

        @param lines: The list of lines to be tokenized.
        @param start: The index into C{lines} of the first line of the
            literal block to be tokenized.
        @param block_indent: The indentation of C{lines[start]}.  This is
            the indentation of the literal block.
        @param warnings: A list of the warnings generated by parsing.
            Any new warnings generated while tokenizing this literal
            block will be appended to this list.
        @return: The line number of the first line following the literal
            block.
        @type lines: C{list} of C{string}
        @type start: C{int}
        @type block_indent: C{int}
        @type warnings: C{list} of C{ParseError}
        @rtype: C{int}
        """
It doesn't matter to me what syntax we use. Another alternative
that's been suggested is to do something like::
    ...
    Arguments:
        lines -- The list of lines to be tokenized.
        start -- The index into C{lines} of the first line of the
            literal block to be tokenized.
        block_indent -- The indentation of C{lines[start]}.  This
            is the indentation of the literal block.
        warnings -- A list of the warnings generated by parsing.
            Any new warnings generated while tokenizing this
            literal block will be appended to this list.
        return -- The line number of the first line following the
            literal block.
    ...
But semantically, the idea is to associate a description with each of
a number of pre-defined entities, such as the parameters of a method.
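As a rough illustration that the surface syntax is secondary, the "name -- description" style can be read into the same kind of name-to-description mapping. This is only a sketch: parse_arguments_section and its regular expression are hypothetical, and it assumes every entry starts with a single word followed by " -- ":

```python
import re

# Assumed entry syntax: "name -- description", one entry per starting line.
ITEM_RE = re.compile(r"^(\w+)\s+--\s+(.*)$")


def parse_arguments_section(text):
    """Map each name in a 'name -- description' list to its description."""
    entries = {}
    name = None
    for raw in text.strip().splitlines():
        line = raw.strip()
        match = ITEM_RE.match(line)
        if match:
            name, body = match.groups()
            entries[name] = body
        elif name is not None and line:
            # Continuation line: fold it into the current entry.
            entries[name] += " " + line
    return entries
```

Note that in this style "return" shares a namespace with the parameter names, whereas a separate "@return" tag keeps the two apart.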
Tags defined by Javadoc are:
@see (a single see-also link; can repeat)
@author (an author; can repeat)
@version (the object's version)
@param (a function/method param; takes an argument)
@return (the return value of a function/method)
@exception (a description of an exception that a function/
method can raise; takes an argument (the exception))
@since (minimum version needed to use it)
@deprecated (object is deprecated; description of why)
I think there are a few more, but that's probably a representative
sample.
I find that the output you can produce with fields is easier to
read/use than the output you can produce without them. (See the HTML
and LaTeX versions of the Java library API.)
Of course, we don't really *need* them. In my mind, the only
necessary features for a formatted docstring language are:
- paragraphs
- literal blocks
- maybe doctest blocks
But I'd like to see fields included. Of course, you don't have to use
them if you don't want to. But I think that most people will find
them useful if they try using them.
-Edward