[Doc-SIG] RE: directives and fields
Goodger, David
dgoodger@atsautomation.com
Thu, 19 Apr 2001 11:22:52 -0400
[Edward D. Loper]
> Well at least there should be rules in the "generic parser" that say
> when directives end, so that a parser can ignore a directive if it
> doesn't understand it. As I understood your original proposal,
> directives ended with blank lines. I think that they should end with
> a dedent back to the indent they started at, because then they can
> include blank lines..
From the reStructuredText spec, first draft:
"""
A comment/directive block is a text block:
- whose first line begins with '.. ' in column 1,
- whose second and subsequent lines are indented relative to the first, and
- which ends with a blank or unindented line.
...
Actions taken in response to directives and the
interpretation of data in the directive block or subsequent text block(s)
are directive- and implementation-dependent.
"""
I would only change the third list item to 'which ends with an unindented
line'.
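To make the amended rule concrete, here is a rough Python sketch of how a generic parser might find where each block ends. The function name and the return shape are my own invention for illustration, not anything from the spec; it only assumes the three list items above, with the third changed to "ends with an unindented line".

```python
def split_directive_blocks(text):
    """Return a list of (first_line, body_lines) pairs, one per '.. ' block.

    Assumed rule: a block starts with '.. ' in column 1 and runs until
    the next non-blank line that is not indented. Blank lines inside
    the block are kept, so a directive can contain them.
    """
    blocks = []
    current = None
    for line in text.splitlines():
        if current is not None:
            # Blank or indented lines still belong to the open block.
            if line.strip() == "" or line[0] in " \t":
                current[1].append(line)
                continue
            # An unindented, non-blank line closes the block.
            blocks.append(current)
            current = None
        if line.startswith(".. "):
            current = (line[3:].strip(), [])
    if current is not None:
        blocks.append(current)
    # Trailing blank lines are separators, not block content.
    for _, body in blocks:
        while body and body[-1].strip() == "":
            body.pop()
    return blocks
```

A parser that doesn't understand a given directive can still skip it cleanly, since the end of the block never depends on what the directive means.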
> And I think that it should be *possible* to handle directives in a
> second pass.
Sure, if that's what the extension wants to do. The extension itself is
called during parsing. If it's tied to a post-parse process, that's its own
business.
There are essentially two types of directives: extensions, which apply to
their blocks only; and plugins, which may change the behaviour of the parser
for some defined part of the input (may be for the adjacent text block, may
be globally). Justification for plugins: it would be useful to modify the
parser's behaviour on the fly, without having to subclass. For example, a
'fields' plugin could add support for the '@' syntax, allowing
experimentation & testing. Kind of like the 'from __future__ import' hack.
;->
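A toy sketch of the distinction, in Python. Everything here (ToyParser, the handler signature, the 'warning' and 'shout' names) is hypothetical and exists only to show the difference; it is not a proposed API.

```python
class ToyParser:
    """Hypothetical parser; none of these names come from a real package."""

    def __init__(self):
        self.directives = {}    # directive name -> handler(parser, block_text)
        self.inline_rules = []  # callables str -> str, applied to paragraphs

    def register(self, name, handler):
        self.directives[name] = handler

    def run_directive(self, name, block_text):
        handler = self.directives.get(name)
        if handler is None:
            return None         # unknown directive: the parser ignores it
        return handler(self, block_text)

    def render_paragraph(self, text):
        for rule in self.inline_rules:
            text = rule(text)
        return text


def warning_extension(parser, block_text):
    # Extension: acts on its own block only, parser state is untouched.
    return "WARNING: " + parser.render_paragraph(block_text)


def shout_plugin(parser, block_text):
    # Plugin: changes how the parser treats text that comes later.
    parser.inline_rules.append(str.upper)
    return ""
```

Registering 'shout' and running it changes what render_paragraph() does from then on, with no subclassing; 'warning' leaves the parser exactly as it found it.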
> I.e., I don't think we should have any directives that
> change the syntax of subsequent parts of the string, like::
>
> This is *emph*
>
> .. switch-emph-and-literal
>
> This is *literal*
Of course, no such directive would be part of the standard package. Only a
lunatic would play games like this. But it would be a great way for people
to play with alternate syntax.
> Basically, it seems like you should be able to make a "generic" parser
> which outputs a DOM tree for the formatted docstring, with "directive"
> elements containing #CDATA (=character data, i.e., a string) like::
>
> <directive tag="keywords"> ... </directive>
>
> Then a specialized parser could run the generic parser, and then
> replace all the directive elements with some other elements..
If the extension/directive wants to do this, fine. But what if it just wants
to wrap the normal behaviour of the parser with a new tag?
> The only domain I care about is formatted docstrings.
That's a big enough domain with enough controversy to make the feature
necessary. See the archives. See this discussion! :-) It's been going on for
years, you know.
> As for running out of characters to use as syntax, that's one of the
> reasons I don't like *colorizing* `like this`...
Then implement a POD-like language or a JavaDoc-like language or whatever.
This is clearly the dividing line: do you "buy in" to the
Setext/StructuredText concept or not?
> I think that my target is a much more lightweight markup language than
> you're talking about.. or at least less powerful. I really don't see
> the need for most of those things in docstrings.
Again, read through the archives. Everyone has different opinions, everyone
wants different levels of control. If you don't want to use a particular
feature, don't. But someone else does. Please don't limit *me*.
It is my opinion that incomplete, minimal markup schemes are doomed to
failure, because *your* minimal set of features doesn't match *my* set or
*anybody else's*. At least at the discussion level. ;-)
> > Say we add an 'SQL' extension to the parser, which performs a
> > database query and inserts the results.
>
> Wouldn't this totally violate making the docstring readable? And when
> would you ever want to use this when writing a docstring??
Just an example, not a serious proposal. C'mon, lighten up!
> > .. warning::
> >
> > Don't *ever* press the `Self-Destruct` button.
> > If you do, you'll be sorry.
>
> This could be implemented as a field.
Then fields can't be restricted to the ends of docstrings -- I want a
warning in the middle! And what do fields *do*? Seems to me they're simply
descriptive, not functional. Maybe they are all we need, but please come up
with a more complete description!
> I think that external URL
> hyperlinks should be implemented with colorizing, if at all.
They're definitely required. I used readability as the overriding criterion
in making that decision. Which is more readable?
1. A hyperlink in StructuredText, inline::
I love using the "Python":http://www.python.org programming language!
(The URL has to be stuck next to the reference, whether it flows or
not. The raw text looks very different from the processed!)
2. A hyperlink in reStructuredText (based on the Setext style), indirect::
I love using the Python_ programming language!
(Note that the URL can be anywhere: next to the reference, at the
end of the section, or at the end of the document. And the URL can be
referred to multiple times: Python_.)
.. _Python: http://www.python.org
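For what it's worth, resolving the indirect style is not much work for the processor. A minimal Python sketch, assuming single-word reference names and targets written exactly as in the example above; the regexes, the function name, and the "name <url>" output form are illustrative choices of mine, not the spec's.

```python
import re

# '.. _Name: URL' on a line of its own (assumed target form).
TARGET_RE = re.compile(r"^\.\. _([^:\n]+):[ \t]*(\S+)[ \t]*$", re.MULTILINE)
# 'Name_' as a standalone word (assumed single-word reference form).
REFERENCE_RE = re.compile(r"(?<!\w)([A-Za-z][A-Za-z0-9]*)_(?!\w)")


def resolve_links(text):
    """Return (processed_text, targets) for a block of raw text."""
    targets = dict(TARGET_RE.findall(text))
    body = TARGET_RE.sub("", text)  # target lines vanish from the output

    def replace(match):
        name = match.group(1)
        url = targets.get(name)
        if url is None:
            return match.group(0)   # no target defined: leave text alone
        return "%s <%s>" % (name, url)

    return REFERENCE_RE.sub(replace, body).strip(), targets
```

The targets are collected before any reference is rewritten, which is why the URL can sit anywhere in the text and be referred to as many times as you like.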
> I don't
> think that internal hyperlink targets make sense for docstrings.
This comes back to the semantics or usage of docstrings, something that I'm
trying to avoid. How long can a docstring be?
> I don't think that comments are necessary for docstrings. If you really
> want, you can include a Python comment before or after the docstring.
Comments are a freebie from the '.. ' syntax. Not necessary, but useful.
> Alternatively, comments could be done via colorizing..
Please, no.
> > The cornerstone of the Setext/StructuredText-like approach is that
> > the raw text should be as readable as possible, even to the
> > uninitiated.
>
> I don't see how directives win here.
>
> If anything, it seems like they
> will make it harder to read by the uninitiated, given the power of
> directives to use almost arbitrary syntax..
You seem to think that typing '.. some-directive::' will magically make
something happen. Not so. You'd have to first *implement* the directive, not
a trivial task.
By the readability cornerstone, I was referring to '@' and (especially)
'X<>'. OTOH, directives are readable by way of being explicit. If we
want a digibloofer construct, we say '.. digibloofer::' (having paid the
price for such impertinence by implementing the digibloofer-parsing
extension first, of course ;-).
> However, the idea that "raw text should be as readable as possible,
> even to the uninitiated" is a *goal* of mine, but not a cornerstone.
> Perhaps a cornerstone would be::
>
> Raw text should be readable, even by the uninitiated.
I don't see the distinction.
> There are a lot of conflicting goals in designing a markup language,
> and making it as readable as possible is by no means my most
> fundamental goal.
I'd say, for the Setext/StructuredText approach, it *is* the most
fundamental goal. If it's not yours, you'll save yourself a lot of grief by
using XML or TeX.
> In the case of colorizing, I believe that
> colorizing should *never* be necessary to the understanding of a
> docstring.. i.e., you should be able to strip away all colorizing, and
> still understand what it says.
In the Setext/StructuredText approach, you shouldn't *have* to strip away
anything. It should just be obvious, or at least unobtrusive.
> I guess that perhaps what it comes down to is that I am *not*
> necessarily trying to design a Setext/StructuredText-like language.
Aha! :-)
> I'm trying to design a markup language that is optimal for writing
> Python docstrings.
A noble goal. Please use a different name for what you're doing and let's be
done with it. Lots of room for competition (the field's wide open right now!
;-). The more the merrier.
> In my mind, the only advantage of using
> `quotes` over C{curly braces} is that quotes are easier to ignore..
Precisely. Also, `quotes` have the connotation of, well, quoting.
... And a vigorous debate was had by all. Me and Edward, anyway. Thank you,
sir.
/DG