[Doc-SIG] Field lists and label blocks

Tony J Ibbs (Tibs) tony@lsl.co.uk
Thu, 9 Aug 2001 10:31:55 +0100

Markup versus meaning
Almost all of reST is dealing with markup - presentation. This is even
true for the hyperlink stuff - allowing one to click on a link is
*really* a presentation issue.

The exceptions are directives and field lists. Directives, being a
general escape mechanism, are possibly *anything* (so I'll ignore them
here!). Field lists are the one example of attempting to allow "data
mining" within a document stream, by embedding *meaningful* tags such as
Author or Version.

I would contend that we could produce a perfectly reasonable first
version of reST with just the markup elements. Directives seem to be
useful, and the people involved all have affection for them, so they're
in too. But field lists are a different kettle of fish. They're for a
different purpose, and require a separate justification.

Of course, historically I have argued *for* such a purpose, but I
wouldn't particularly mind making them mode dependent - I'll argue this

Field lists
In reStructuredText.txt, David describes Field lists. As I understand
it, the basic concept is to allow the user to enter text of the form::

    <name>: <body>

(the actual syntax being undecided as yet).

The name *will* be RFC822 compliant (which is good).

All of the examples show the <body> as being confined to the rest of the
line - that is, no line breaks allowed. Proposed use examples are
basically those one might expect in email (and thus also PEPs), and some
attempt to allow specifying things like authors, copyright, etc., within
a document.

Label blocks
In http://www.tibsnjoan.co.uk/docutils/fat.html, I describe label blocks
(sorry - the document doesn't have internal labelling, so I can't give a
direct link).

Label blocks are defined as::

    <name>: <body>

where <name> is hand-defined as a case-insensitive Python identifier in
which hyphens are also allowed (is that RFC822?).
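
On the RFC822 question, a rough check is possible. The sketch below is only an illustration: it assumes a label name is an ASCII Python identifier that may also contain hyphens after its first character (the restriction on the first character is an assumption, not something stated above), and compares that with the RFC 822 ``field-name`` rule, which is one or more ASCII characters excluding controls, space and colon. All the function names are invented for the example.

```python
import re

# Label-block name: a Python identifier in which hyphens are also allowed.
# Assumption: a hyphen may not be the first character.
LABEL_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_-]*")

# RFC 822 field-name: any ASCII character except controls, space and ":".
# That is the printable range 0x21-0x7E with 0x3A (":") removed.
RFC822_FIELD_NAME = re.compile(r"[\x21-\x39\x3b-\x7e]+")


def is_label_name(name):
    """True if name fits the label-block rule sketched above."""
    return LABEL_NAME.fullmatch(name) is not None


def is_rfc822_field_name(name):
    """True if name is a syntactically valid RFC 822 field-name."""
    return RFC822_FIELD_NAME.fullmatch(name) is not None


def same_label(first, second):
    """Label names are case-insensitive, so compare them lowercased."""
    return first.lower() == second.lower()
```

Under these assumptions every label name is also a valid RFC 822 field-name, but not the other way round: RFC 822 would accept a name such as ``2nd``, which is not an identifier.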

Label blocks have some implementation details, though:

1. A paragraph is only a label block if <name> is found
   in a dictionary of valid names, otherwise it's just
   a normal paragraph (maybe with a Warning being issued).

2. The "definition" associated with a name indicates
   whether <body> is restricted to the rest-of-line,
   or whether it may be longer (technically, whether
   its DOM tree node may have children). Specific
   examples would be::

        Author: Guido van Rossum

            * Guido van Rossum
            * Tim Peters

3. The "definition" also indicates which forms of DOM
   node the children may be - for instance, bullet list,
   paragraph, etc.
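
The three points above can be sketched as a small lookup table. This is a minimal illustration rather than the actual implementation: the ``LabelDefinition`` class, the ``LABELS`` table, the node-type strings and the ``check_label_block`` function are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabelDefinition:
    # Point 2: may the body extend beyond the rest of the line?
    allows_children: bool = False
    # Point 3: which kinds of child node are acceptable (hypothetical names).
    child_types: frozenset = frozenset()


# Point 1: the dictionary of valid names, keyed in lower case because
# label names are case-insensitive. The entries are examples only.
LABELS = {
    "author": LabelDefinition(),
    "authors": LabelDefinition(True, frozenset({"bullet_list"})),
    "history": LabelDefinition(True, frozenset({"paragraph", "bullet_list"})),
}


def check_label_block(name, child_types=()):
    """Return (is_label_block, problems) for a candidate paragraph."""
    definition = LABELS.get(name.lower())
    if definition is None:
        # Unknown name: just a normal paragraph, perhaps with a warning.
        return False, [f"unknown label {name!r}: treated as a normal paragraph"]
    problems = []
    if child_types and not definition.allows_children:
        problems.append(f"{name!r} is restricted to the rest of the line")
    else:
        for child in child_types:
            if child not in definition.child_types:
                problems.append(f"{name!r} may not contain a {child}")
    return True, problems
```

For instance, ``check_label_block("Authors", ["bullet_list"])`` is accepted with no problems, while ``check_label_block("Author", ["bullet_list"])`` is still a label block but reports that its body must stay on one line.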

In a reST context, I would replace the single colon by two, and I would
add a rule that if children are allowed and present, then a blank line
is not allowed after the <name> - i.e., as in cells within a table, the
<name> line prepares one for a new block.
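
A rough sketch of that rule, assuming the double-colon form and assuming that children are simply the indented lines which start immediately after the <name> line. The ``read_label_block`` function and its return format are hypothetical, and the input is taken to be a list of lines without trailing newlines.

```python
import re

# "<name>:: <body>", with the name as an identifier that may contain hyphens.
NAME_LINE = re.compile(r"(?P<name>[A-Za-z_][A-Za-z0-9_-]*)::[ \t]*(?P<body>.*)")


def read_label_block(lines):
    """Parse a candidate label block starting at lines[0].

    Returns None if the first line is not a <name> line, otherwise a
    dict with the lowercased name, the rest-of-line body and the child lines.
    """
    match = NAME_LINE.fullmatch(lines[0]) if lines else None
    if match is None:
        return None
    children = []
    # Children must begin on the very next line: no blank line is allowed
    # between the <name> line and the block it introduces.
    if len(lines) > 1 and lines[1].strip() and lines[1][0] in " \t":
        for line in lines[1:]:
            if line.strip() and line[0] not in " \t":
                break  # a dedented, non-blank line ends the block
            children.append(line)
        while children and not children[-1].strip():
            children.pop()  # drop trailing blank lines
    return {
        "name": match.group("name").lower(),
        "body": match.group("body").strip(),
        "children": children,
    }
```

With a blank line directly after the <name> line, ``children`` comes back empty, and the indented text that follows is left for whatever ordinary rule applies to it.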

When do we want these ideas?
Field lists and label blocks are both about being able to extract
*meaning* from a document stream, rather than just presentation.

I would contend that they do not, thus, belong in the base document
format at all, but in the particular modes where they are useful. Note
that making this decision means we can get reST *itself* out the door
faster, that we get a simpler reST document (a Good Thing), and that the
decisions about how to do such things can be application dependent, and
thus easier to design and reach agreement on.

Let's consider proposed modes for a minute:

* Email - clearly field lists map well to the initial headers of an
  email. In fact, I think one should say that for email mode, one adopts
  the *rules* for the headers for an email, as they are presented in the
  relevant RFCs.

* PEPs - the start of a PEP is a series of field lists, and they look as
  if they are modelled heavily on email headers. I would suggest the PEP
  headers be supported exactly as they are defined in the PEP.

* HTML and document preparation modes - I'm not at all convinced of the
  need for any such construct here. Certainly an HTML document doesn't
  have such specialised items (oh, there are things like <address>, but
  they aren't really the same). If one wants to do the equivalent in
  HTML, one needs to look at <meta> tags. And if one wants to write
  LaTeX or somesuch, then there is no need for field lists as proposed.

* Python - well, this is the meat of it. I would contend (see below!)
  that field lists are not adequate to do useful things in Python
  docstrings. I would either leave them out entirely (*not* necessarily a
  bad idea), or else provide something more powerful.

If we make field lists or whatever mode dependent, then it is only in
the Python case that we have any decisions to make. Life gets simpler.
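
To illustrate the email mode mentioned above: if one adopts the email header rules wholesale, an existing header parser can do the work. The sketch below uses Python's standard ``email`` package; the sample text and the ``read_header_fields`` helper are invented for the example, and nothing here is part of reST itself.

```python
from email import message_from_string

# Invented sample: header fields, a blank line, then the document body.
SOURCE = """\
Author: Tony Ibbs
Version: 0.1
Subject: Field lists
 and label blocks

Body text starts after the first blank line.
"""


def read_header_fields(text):
    """Return the leading header fields as a list of (name, value) pairs."""
    message = message_from_string(text)
    # A folded (continued) header keeps its line break in the raw value,
    # so collapse runs of whitespace into single spaces.
    return [(name, " ".join(value.split())) for name, value in message.items()]


fields = read_header_fields(SOURCE)
```

Header lookup on the parsed message is case-insensitive, which matches the case-insensitive names discussed for label blocks.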

Why field lists aren't enough
Field lists attack the problem of single line entities, such as
"Author". But they don't adapt well to (obvious) extensions like
"Authors", nor do they provide leverage for things like "History" where
there may be multiple paragraphs (and I, for one, would also like to be
able to do things like "Dedication").

I think that a reformed label block *would* work, allowing examples such as::

    Author:: Tony Ibbs

        * David Goodger
        * Garth T. Kidd

        In memory of my father.

        He would have appreciated the joke that I
        would have gotten a lot further with this
        a lot sooner if he hadn't died.

        ``fred`` is the first argument.

        ``jim`` is the second.

Are we overloading "::" too much? Well, its use in directives is
justified partially on the grounds that we've already got it to hand
(!). And if someone mistakenly inserts a blank line after a <name> line,
then they get a literal block, which is a fairly harmless error.

Enough already,

Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
"How fleeting are all human passions compared with the massive
continuity of ducks." - Dorothy L. Sayers, "Gaudy Night"
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)