[Doc-SIG] Where to go from here

Tony J Ibbs (Tibs) tony@lsl.co.uk
Tue, 27 Mar 2001 10:58:50 +0100


Edward Loper wrote:
> Guido's input has raised some questions about whether we're going
> in the right direction...

To say the least. One can't help feeling that he might have objected
sooner if he were going to object so much (like, for instance, last time
round the loop when ST was decided on). Grump, moan, whine.

> But he's made it clear that he'll keep an open mind,
> and seriously consider any real specs we come up with.

So long as we gird our loins and don't just give up - I was feeling
pretty dispirited about all of this yesterday (I'm sorry, Guido, but to
be told "this must be bad because it shares part of the name of
something else" is not polite - it's exactly like saying one had a bad
experience with regexp so one doesn't like sre).

Anyway, putting my rational head back on:

> So I propose we do the following:
>
> - continue trying to come up with a concrete, formal spec

Agreed. I *still* don't think we're far off one, Guido's pessimism
despite (and he still hasn't had a chance to *look* at what we've been
doing).

> - drop the idea of maintaining compatibility with STNG, for
>   now.

We'd all like that. It disturbs me a little that we can consider it so
easily, though, given how powerful the arguments for keeping STClassic
and STNG compatibility were in the past.

>    Once STNG sees how much cooler our markup language
>    is, we can convert them. ;)  Maybe we should even come up
>    with a new name, so that other people who have become
>    embittered with STclassic won't take it out on us. :)

and so Guido won't be prejudiced because of the name (sorry, I'll try to
stop grumping).

Of course, the obvious name would be "pydoc", but that's rather taken...

Perhaps we should choose "pytext", by analogy with the grandfather
format, setext.

>  - STminus will focus purely on coming up with a formal
>    description, and drop its goals of unifying STNG/STpy.

Makes sense. So URIs are now delimited by '<..>', yes?

> - Focus on the goal of making a *real* markup language that is
>   *lightweight* and simple to read/write.

But you are not going to get a real markup language (for any sense of
"real" that I understand) if your start and end delimiters are the
same - and I don't see how we can compromise on that.

We still have the problem that if we don't have *really* lightweight
markup, people won't do it, and that something akin to what people do in
email (i.e., something like ST/STpy, I'm afraid) is the best bet for
that - unless you're proposing to start the discussion that started back
in 1997 all over again.

> - Once we have a real specification (hopefully in a couple
>   weeks), we can talk to Guido/others about whether it's
>   acceptable.  It's unreasonable to expect Guido to make
>   judgements when the ST stuff is in the state of flux it's
>   in now.

I agree we need to have a specification - that's what we've been working
towards.
But I think the correct route is still a PEP, and Guido is only one of the
people who vote on PEPs. He's certainly the *most important* person, but
I can't (I refuse to) believe that he doesn't change his mind on
occasion.

I am *very* scared that the last time round the Doc-SIG loop got this
close to having something that worked, and got kiboshed by Spam8. Let's
(please) not let that happen again.

> Dropping STNG compatibility will allow us to consider a number
> of options that I hadn't brought up before..  For example, I think
> we might want to replace '--' with '---' as the description list
> indicator, since people *do* use '--' in text (I know I do, and
> apparently Guido does too).  And I think we should drop 'o' as
> a bullet character.  etc..

I agree with dropping 'o' - can we add '+' as an alternative?

Personally I dislike ' --- ' as the descriptive list delimiter, but not
enough to jump up and down too much - and since you're also wanting to
use '--' as a hyphen (presumably actually an m-dash), I'll go for it.
Descriptive lists *are* meant to stand out, after all.

One problem is that Guido's style guide suggests using ' -- ' for
descriptive lists::

    Keyword arguments:
    real -- the real part (default 0.0)
    imag -- the imaginary part (default 0.0)

so we'd need to get him to change that (presumably not a problem if the
PEP were accepted).
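As a rough illustration of why the choice of delimiter matters, here is a
minimal sketch of splitting a descriptive list item on a configurable
delimiter. The function name `split_description_item` is hypothetical (it is
not part of docutils or STpy), and the handling of whitespace is an
assumption:

```python
def split_description_item(line, delimiter=" --- "):
    """Split 'term --- description' into (term, description).

    Returns None when the line is not a descriptive list item.
    The default delimiter is the proposed ' --- '; pass ' -- ' to get
    the style guide's current convention instead.
    """
    term, sep, description = line.partition(delimiter)
    if not sep or not term.strip() or not description.strip():
        return None
    return term.strip(), description.strip()
```

With the ' --- ' delimiter, a line that merely uses ' -- ' as a dash in
running text is left alone, which is the point of the proposed change.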

> As for colorization, java-mode does just fine colorizing javadoc
> comments, so I don't see how it's a problem *in principle*, just
> a problem of someone figuring out what to tell emacs (I'm sure
> emacs could be told to colorize tripple-quoted-strings correctly
> if someone really wanted to figure out how to..

See http://www.python.org/emacs/python-mode/faq.html for an explanation
of the problems.

> I've just been using the work-around of backslashing
> all double quotes in  tripple-quoted-strings, which
> doesn't affect their value, and makes them colorize
> correctly)

Hmm. Ugly, but worth mentioning as a tip (although single quotes can
cause problems too).

Peter Funk suggested:
> Especially heading recognition in ST sucks.

I intensely dislike headers in STClassic and STNG. We *might* be able to
ignore the problem if we are simply addressing docstrings.
Alternatively, there are two other ways to do it (one lightweight and
hacky, the other heavyweight and, well, different):

  1. Assume that a heading will be underlined.
     Text after a heading need not be indented
     any more than it normally would. I think
     this was suggested by David Goodger.

     For instance::

	 This is heading 1
	 =================

	 This text is within "heading 1"'s section.

	 This is subheading 2!
	 ---------------------

	 Which introduces a subsection.

	 This is subsubheading 3...
	 ~~~~~~~~~~~~~~~~~~~~~~~~~~

	 And that is surely enough depth to satisfy
	 anyone using docstrings...

  2. Provide "proper" sectioning commands - for instance::

	 Section 1: Its title

	 And some text

	 Subsection: It can decide its number

	 One would also provide other appropriate "names"
	 for sections.

Option 1 seems to me more appropriate for docstrings, option 2 for
longer texts - so since we're working on docstrings, I'd go for option
1. Details to be worked out are whether one needs to get the number of
underline characters right or not (!).
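To make the "number of underline characters" question concrete, here is a
minimal sketch of option 1. Everything in it is an assumption for
illustration: the name `find_headings`, the mapping of '=', '-' and '~' to
levels 1 to 3 (taken from the example above), and the `strict` flag that
decides whether the underline must match the title's length:

```python
# Assumed mapping of underline character to heading level.
UNDERLINE_LEVELS = {"=": 1, "-": 2, "~": 3}


def find_headings(lines, strict=False):
    """Return (level, title) pairs for underlined headings.

    A heading is a non-blank line followed by a line made up of a
    single repeated underline character. With strict=True the
    underline must be exactly as long as the title.
    """
    headings = []
    for title, underline in zip(lines, lines[1:]):
        text = title.strip()
        rule = underline.strip()
        if not text or not rule:
            continue
        char = rule[0]
        if char not in UNDERLINE_LEVELS or rule != char * len(rule):
            continue
        # A line that is itself an underline cannot be a title.
        if text[0] in UNDERLINE_LEVELS and text == text[0] * len(text):
            continue
        if strict and len(rule) != len(text):
            continue
        headings.append((UNDERLINE_LEVELS[char], text))
    return headings
```

In the lenient mode a too-short or too-long underline still counts; in the
strict mode it is silently treated as ordinary text, which may or may not be
what a docstring author expects.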

Peter Funk also wrote:
> I think, a description list can be dropped alltogether.
> At least for the time being a bullet list will be enough.
> enumerated lists: ...hmmm...  I think we can also live without
> them for a try.

No and no. I vehemently disagree. And Guido's suggestion that:

> Yes!  They are darn ugly in HTML anyway.

is just plain silly - for a start, that's almost entirely down to using
default settings with poor browsers (OK, IE and Netscape!), and secondly
it's *definitely* controllable by writing extra HTML code, or (horrors)
using style sheets. Let's not let the presentation of one format drive
our whole effort.

I *do* sort-of agree with Guido's point that it is a bad thing to lose
the "number" from an enumeration, though - the reason for my making this
optional in STpy was purely that I think it may be difficult in HTML (and
that's somewhat more than a presentation issue). But I've always worried
about it, because of one's wish to refer back to list items by sequence
number in the surrounding text. I think this one needs thinking about.

Guido said:
> Please do look at the conventions in MoinMoin an another example!

Hmm. Last time I looked at MoinMoin I got no further than the
"traditional" use of multiple quotes to mean different things, and gave
up (despite the fact I rather like how it looks through a browser).

<goes off and looks>

Hmm. It's a mishmash of odds and ends, not designed to be read *as
text*. I'm a bit disturbed that Guido refers us to this as something
worth following up, since it seems to miss the point of what we're
trying to do (which is *not*, in the first instance, at least, to
support a Wiki).

(From a very quick scan:

possibly good ideas: they allow internal indentation in a paragraph to
have meaning. Not sure what *use* it is, but it's fun.

good ideas for a Wiki: they allow one to "optimise" URIs that end in
.jpg or .gif so that the image is included instead. Not so useful in
docstrings)

Peter Funk continued:
> I think we should aim for *very* minimalistic set of features
> and people may than add other things lateron:
>  * emphasizing of *single* words.

Edward had suggested that. I emphasise more than one word too often in
my writing to be happy with that, and I don't see it as being a problem
(if you're working from ideas of how STClassic does it, please don't!).

>  * section headings (marked up through underlining with a line of
>    hyphens or '=' and preceeded by a blank line).

Ah - I hadn't read that far - we agree, more or less.

>  * bullet item lists (which may be nested through indentation).

Modulo the fact that I want all three list types.

>  * References to URLs, to Mailaddresses and to Python objects.

"Mailaddresses"? I assume you mean "mailto:", which is just one form of
URI.

We had been leaving references to Python objects to later as slightly
harder and a "pydoc" type issue (one of the reasons for marking up
Python words inline is to make this easier to get right).

>  * pre formatted paragraphs for code examples, tables and such:
>    (every paragraph with mixed indentation or which starts with
>    the patterns '>>>' or '+--' should be left allone.  Only properly
>    aligned normal text paragraphs should allowed for reformatting.

This is too complex. The concepts we already had in STpy were enough -
i.e., using the '::' idea to introduce literal blocks and '>>>' to
introduce doctest blocks. Trying to "guess" based on mixed indentation is too
error prone (gods, is that error prone!), and I for one refuse to
countenance needing to put some strange characters in front of literal
text just to make it literal - yuck.
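The explicit-marker approach can be sketched in a few lines. This is only an
illustration under stated assumptions: the name `classify_paragraphs` is
hypothetical, paragraphs are taken to be separated by a single blank line,
and a literal block is assumed to be exactly one paragraph (a real
implementation would presumably use indentation to find where it ends):

```python
def classify_paragraphs(text):
    """Label each paragraph as 'text', 'literal' or 'doctest'.

    No guessing is involved: a paragraph is a doctest block if it
    starts with '>>>', and a literal block if the preceding text
    paragraph ended with '::'.
    """
    result = []
    previous_ends_with_marker = False
    for para in text.split("\n\n"):
        stripped = para.strip()
        if not stripped:
            continue
        if stripped.startswith(">>>"):
            kind = "doctest"
        elif previous_ends_with_marker:
            kind = "literal"
        else:
            kind = "text"
        result.append((kind, stripped))
        # Only an ordinary text paragraph can introduce a literal block.
        previous_ends_with_marker = kind == "text" and stripped.endswith("::")
    return result
```

Nothing here depends on how the paragraph happens to be indented or aligned,
so a ragged but perfectly ordinary paragraph is never mistaken for code.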

> Than let's try to implement this minimal set and plug this
> into Ping's pydoc and see what comes out, if running this
> on existing sources.

Well, we just about *have* an implementation of our format (if people
didn't keep arguing/discussing it would probably have been out
already!). And there *is* precedent for changing the existing source
documentation, if we need to (although my own experience with a few
modules so far is that this is not a problem).

*I* would prefer to keep a more complex thing that includes
'[localrefs]' ('cos they're useful - and if we can't handle a PEP we're
not doing too well). The "labelled paragraphs" thing is too useful for
me to want to give up immediately, as well.

I'm undecided about whether we need two forms of emphasis - Edward, what
is your feeling, should we drop '*..*'?

I get left with:

* Structuring is still available by indentation, as before, although its
use is less important (and to those who've only just joined us, don't
worry, it does come out in the wash, honest)
* Blank lines delimit paragraphs, as before
* List items start paragraphs, as before
* Headings are done by underlining (three levels of heading)
* doctest blocks are introduced by >>>
* literal blocks are introduced by :: on the previous paragraph, as
before
* Emphasis is by *..* (dropping **..**, which makes life simpler).
Obviously, one can't nest *..* inside *..*
* Literals are '..' and #..# (as before)
* URIs are delimited by <..>
* URI "text" is done as "some text":<uri> - note that under the new
scheme we can safely allow optional whitespace after the colon, which
allows friendlier typing
* Standalone URIs (e.g., <www.tibsnjoan.co.uk/>) get rendered as they
look
* Local references are like [this], and refer to '..[this]' anchors as
before (which need to be at the start of a paragraph - an anchor at the
start of a line should also start a new paragraph)
* Labelled paragraphs are available - for instance::

	Author: Guido

	Arguments:
	   #fred# --- the fred'th dimension

  as before. I think these are too useful to drop.
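The URI and local reference rules in the list above can be sketched with
regular expressions. This is a minimal illustration, not the docutils
implementation: the pattern names and `find_inline_markup` are hypothetical,
and it assumes that a URI contains no whitespace and that a reference label
contains no whitespace or brackets:

```python
import re

# "some text":<uri>, with optional whitespace after the colon.
LINK_TEXT = re.compile(r'"([^"]+)":\s*<([^<>\s]+)>')
# A standalone <uri>.
BARE_URI = re.compile(r'<([^<>\s]+)>')
# A [localref], but not a '..[anchor]' definition.
LOCAL_REF = re.compile(r'(?<!\.\.)\[([^\[\]\s]+)\]')


def find_inline_markup(text):
    """Return a list of tuples describing links, bare URIs and local refs."""
    found = [("link", m.group(1), m.group(2)) for m in LINK_TEXT.finditer(text)]
    # Remove labelled links so their URIs are not reported a second time.
    remainder = LINK_TEXT.sub(" ", text)
    found += [("uri", m.group(1)) for m in BARE_URI.finditer(remainder)]
    found += [("ref", m.group(1)) for m in LOCAL_REF.finditer(remainder)]
    return found
```

Because the URI pattern refuses whitespace, an aside written in angle
brackets with spaces in it is left as plain text rather than being treated
as a URI.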

Edward - have I forgotten anything? What do *you* think?

You propose continuing with STminus (under a new name - pytext-slim?)
with proper formal definition and minimal markup. I see that as a
continuation of what was being done, but without the ST<whatever>
constraints.

I would propose amending docutils to implement my proposals above
(modulo keeping compatibility with your work), and rewriting STpy.html as
pytext.html (pytext-fat.html?), describing it. Mainly because it's just
about there, and will provide a "test bed" to allow people to think
about which features are actually wanted.

That gives us two strings to our bow, which I think is still a good
idea.

I propose that we still maintain output to a DOM tree, as discussed
elsewhere.

Meanwhile, I think we should have hope that we can convince Guido that
he (honestly) was using a bad tool, and that he shouldn't judge
"superficially similar" things by said tool.

[[[I'm still yea-close to an alpha release of docutils. It's not put
much farther off by the new considerations, so I still propose to put
together a PEP for an altered version. But it's not going to be this
week. I may be able to take some time off next week, which would
help.]]]

Tibs

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
Well we're safe now....thank God we're in a bowling alley.
- Big Bob (J.T. Walsh) in "Pleasantville"
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)