[PYTHON DOC-SIG] Re: Setext followup (Re: Automatic Documentation)

Robin Friedrich friedric@rose.rsoc.rockwell.com
Tue, 30 Jul 1996 11:16:23 -0500

Before we get started, let me comment that setext isn't our idea but
rather an adoption of an existing protocol to accomplish what we
thought was a common goal: namely, to come up with a lightweight,
unobtrusive markup standard which could be used in Python doc strings.
The markup would be sufficient to allow automatic documentation tools to
generate nice output in several common documentation formats.
The setext protocol originated with Tony Sanders several years ago. 
See  http://www.BSDI.COM/setext/

|> From tibbs@geog.gla.ac.uk Tue Jul 30 06:38 CDT 1996
|> X-Mailer: exmh version 1.6.7 5/3/96
|> To: Robin Friedrich <friedric@rose.rsoc.rockwell.com>
|> From: Tony J Ibbs (Tibs) <gaga50@udcf.gla.ac.uk>
|> Reply-To: gaga50@udcf.gla.ac.uk
|> Subject: Re: Setext followup (Re: Automatic Documentation) 
|> Mime-Version: 1.0
|> Date: Tue, 30 Jul 96 12:39:07 +0100
|> Sender: tibbs@geog.gla.ac.uk
|> X-Mts: smtp
|> Well, whilst I'm glad to see this is being worked on, I do have a couple of
|> comments.
|> Doing this backwards, firstly, comments on the proposed typotags.
|> Don't forget that people will want to do more complex lists than you
|> seem to be coping with at the moment.
|> For instance:
|> 	This is an initial list:
|> 		o first item
|> 		o second item
|> 		o third item is a sublist
|> 			- which contains
|> 			- two things
|> 		o last item
|> 	and if that doesn't work, we need another list
|> 		o of which this is the first item
|> 		o and this is the second item
|> 		  and this is its second paragraph
|> This is a comment from experience of doing something similar within the
|> company that pays me - we don't support the sublist, or the multiple
|> paragraphs in a list, or the two lists in one piece of documentation, and
|> all of these are fairly horrible restrictions.

This can be added in the future without difficulty, although I don't expect
doc strings to get that elaborate. Doc strings aren't intended to replace
a formal user's guide in the case of complicated applications.
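
As a rough illustration of why nested lists could be added later, here is a
minimal Python sketch that recovers the nesting from indentation alone. The
bullet markers ("o", "-", "*"), the function name parse_list and the
[text, children] result shape are all assumptions made for this example; none
of them comes from setext or from any existing tool. Continuation paragraphs
are simply skipped here.

```python
import re

# Hypothetical bullet pattern: leading whitespace, a marker, then the item text.
BULLET = re.compile(r"^(\s*)([o*-])\s+(.*)$")

def parse_list(lines):
    """Build nested [text, children] items from the indentation of bullets."""
    root = []
    # Each stack entry is (indent of an item, that item's list of children).
    stack = [(-1, root)]
    for line in lines:
        m = BULLET.match(line.expandtabs(8))
        if not m:
            continue  # non-bullet lines are ignored in this sketch
        indent, text = len(m.group(1)), m.group(3)
        # Leave every item that is indented at least as far as this one.
        while stack[-1][0] >= indent:
            stack.pop()
        item = [text, []]
        stack[-1][1].append(item)
        stack.append((indent, item[1]))
    return root

example = [
    "\t\to first item",
    "\t\to second item",
    "\t\to third item is a sublist",
    "\t\t\t- which contains",
    "\t\t\t- two things",
    "\t\to last item",
]
tree = parse_list(example)
```

In this sketch a list ends implicitly when the indentation drops back, so no
explicit "start a list" or "end a list" tag is needed for the simple cases.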

|> So - the motto is, one needs to (a) say "start a list" and (b) say "end a list"
|> to have proper support.

OK, please propose something specific.

|> You don't say how all those funny characters get inserted into the text when you want them (and not their tag meaning). And this <em>will</em> lead to
|> problems with people forgetting whatever escape sequence/mechanism you choose.

The typotags have fairly rigid regular expressions, and I feel it's unlikely
that they will get confused with ordinary text. I've been using them for
several weeks and it hasn't been a difficulty at all. Time will help in
refining the regex patterns.
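
To give an idea of what "fairly rigid" can mean, here is a minimal Python
sketch of a pattern for the "**" bold tag. The exact pattern, the names BOLD
and mark_bold, and the <b> output are assumptions for illustration, not the
regular expressions actually used by any setext tool. The pattern insists that
the asterisks hug the text and are not glued to a neighbouring word character,
so an expression such as a ** b or 2**3**4 is left alone.

```python
import re

# Hypothetical pattern: "**" must not follow a word character or "*", the
# enclosed text must start and end with a non-space character, and the closing
# "**" must not be followed by a word character or "*".
BOLD = re.compile(r"(?<![\w*])\*\*(\S(?:[^*\n]*\S)?)\*\*(?![\w*])")

def mark_bold(text):
    """Replace **text** with <b>text</b>, leaving other asterisks untouched."""
    return BOLD.sub(r"<b>\1</b>", text)

marked = mark_bold("This is **very important** text")
untouched = mark_bold("x = a ** b ** c and y = 2**3**4")
```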

|> OK - that was the first point - that the system you propose is not complex enough (and that's a response on almost no thought).

Something complex is exactly what we didn't want. I want to encourage 
people to use these tags. The best way of doing that is to make it very easy
and logical; otherwise people won't bother to use it.

|> My second point is actually to argue against this exact approach - this is
|> unfortunately presumably repeating an argument in the SIG (which I was sure I
|> had signed up to ages ago - obviously not, which explains why it has been so
|> silent). I would strongly suggest that you are <em>not</em> going to come up
|> with solutions for all the formatting that users want, and sometimes need.

So? That's not the intent. What's the harm in having a simple, workable system?

|> It is also very likely, with an ad hoc design like this, that one will "paint
|> oneself into a corner" in some way - the obvious example of this is if it is
|> decided later that a new typotag is needed, but, oh dear, choosing any new
|> character will break existing doc strings. (This actually ties in with the
|> comments on the main list recently, saying essentially "I'm not terribly keen
|> on X, Y or Z in Python, but when I try to work round it, I get trouble - gosh,
|> language design is difficult, and Guido is clever".) Designing a new markup
|> language (which is what you are doing) <em>is</em> difficult. Designing one
|> that can be future proof is even harder.

As the preamble states, this is not an ad hoc design. And extending the
typotags in the future shouldn't be an issue. I see no reason why adding a new
format would break anything. Doc strings don't break; they just may not be
picked up correctly by the autogeneration tool(s), that's all. The worst thing
I've seen so far is that setext wasn't marked as expected, but that's hardly a
crisis and is easily fixed.

|> When the company that pays me was designing its documentation comments, we were using TeX for documentation. I strongly argued that we should use TeX commands as the markup, rather than trying to grow our own - after all, there was a lot more experience in there. I lost then.
|> Nowadays, I would recommend HTML - it is simpler than TeX (and how!), easier to use, easier to `read' when looking at source code, and browsers are easier to maintain than TeX. The "everywhereness" of HTML means that most programmers will have written it (especially with so many Python apps seeming to be net related these days!). The only change I might make would be to allow blank lines to signify "<p>" - ie, new paragraph - since this seems reasonably intuitive - but I can argue against that almost equally well.

Setext grew out of the need to avoid the ugliness of HTML. I maintain that
HTML markup is too verbose for doc strings, and clutters the reading of them
in code, especially for hyperlinks. I find the reference approach of setext a
far sight better than placing junk like this in
<A HREF="http://www.python.org/sigs/doc-sig/">doc strings</A>. Instead, I can
use doc_strings_ which flow uninterrupted by the http reference.

.. _doc_strings http://www.python.org/sigs/doc-sig/
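
As a minimal sketch of how a tool might join the two halves of such a
reference, the Python below collects the ".. _name URL" lines and turns each
matching name_ in the text into an HTML anchor. The patterns, the function
name resolve_links, and the choice to show underscores in the name as spaces
are assumptions made for illustration rather than a description of any
existing setext converter.

```python
import re

# Hypothetical patterns for the two halves of a reference.
HREF_DEF = re.compile(r"^\.\. _(\S+)[ \t]+(\S+)[ \t]*$", re.MULTILINE)
HOT_WORD = re.compile(r"\b(\w*[A-Za-z0-9])_(?!\w)")

def resolve_links(doc):
    """Replace name_ with an anchor when a '.. _name URL' line defines it."""
    targets = dict(HREF_DEF.findall(doc))
    body = HREF_DEF.sub("", doc)  # drop the definition lines from the output

    def link(match):
        name = match.group(1)
        url = targets.get(name)
        if url is None:
            return match.group(0)  # no definition: leave the text untouched
        return '<A HREF="%s">%s</A>' % (url, name.replace("_", " "))

    return HOT_WORD.sub(link, body).strip()

sample = (
    "Instead, I can use doc_strings_ which flow.\n"
    "\n"
    ".. _doc_strings http://www.python.org/sigs/doc-sig/\n"
)
html = resolve_links(sample)
```

A name with no matching definition line is left exactly as written, which
fits the point above that doc strings don't break when a tag isn't recognised.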

|> And HTML seems to support the right sort of subset of markup - it shouldn't be hard to translate into n/troff, [La]TeX, Digital Runoff, whatever else one wants. Oh, and it copes with URLs just fine.
|> (it also gives you more than one style of list, especially if you use the Netscape "type" extension).
|> Is it harder to read than the setext proposal? Obviously a matter of opinion (ref the arguments about indentation in Python!) - but many of the setext things are arbitrary anyway  (why not use "*" for `emphasis', not "**"? that would fit with the normal email/ASCII convention? - oh, one can use "@" or  something for lists if one must). And you only give bold or italic - why not give us "emphasis" which is presented in a device independent manner?
|> Please feel free to quote any of this - I will look into trying to join the
|> SIG again...
|> Tibs
|> -------------------------------------------------------------------------------
|> Tony J Ibbs (Tibs)                                      [Funded by Laser-Scan]
|> Dept. of Geog. & Topo. Science, Uni. of Glasgow, GLASGOW  G12 8QQ  Scotland
|> Tel:   (+44)141-330-6649          Fax: (+44)141-330-4894
|> Email: gaga50@udcf.gla.ac.uk      - goes direct to here, but I may not be here
|>        tony@lsl.co.uk             - forwards to where I am, normally works
|>        T.Ibbs@geog.gla.ac.uk      - forwards via gaga50, but is prettier
OK I will :-)

In summary I remain unconvinced that any of the above mentioned issues are
problems with using setext in doc strings.

-Robin Friedrich

DOC-SIG  - SIG for the Python Documentation Project

send messages to: doc-sig@python.org
administrivia to: doc-sig-request@python.org