[PYTHON DOC-SIG] Re: Setext followup (Re: Automatic Documentation)

Jim Fulton jim.fulton@digicool.com
Mon, 05 Aug 1996 16:14:46 -0400

Robin Friedrich wrote:
> Before we get started let me comment that setext isn't our idea but
> rather an adoption of an existing protocol to accomplish what we
> thought was a common goal. Namely, to come up with a lightweight,
> unobtrusive markup standard which could be used in python doc strings.
> They would be sufficient to allow automatic documentation tools to
> generate nice output in several common documentation formats.
> The setext protocol originated with Tony Sanders several years ago.
> See  http://www.BSDI.COM/setext/
> |> From tibbs@geog.gla.ac.uk Tue Jul 30 06:38 CDT 1996
> |> X-Mailer: exmh version 1.6.7 5/3/96
> |> To: Robin Friedrich <friedric@rose.rsoc.rockwell.com>
> |> From: Tony J Ibbs (Tibs) <gaga50@udcf.gla.ac.uk>
> |> Reply-To: gaga50@udcf.gla.ac.uk
> |> Subject: Re: Setext followup (Re: Automatic Documentation)
> |> Mime-Version: 1.0
> |> Date: Tue, 30 Jul 96 12:39:07 +0100
> |> Sender: tibbs@geog.gla.ac.uk
> |> X-Mts: smtp
> |>
> |> Well, whilst I'm glad to see this is being worked on, I do have a couple of
> |> comments.
> |>
> |> Doing this backwards, firstly, comments on the proposed typotags.
> |>
> |> Don't forget that people will want to do more complex lists than you
> |> seem to be coping with at the moment.
> |>
> |> For instance:
> |>
> |>      This is an initial list:
> |>
> |>              o first item
> |>              o second item
> |>              o third item is a sublist
> |>                      - which contains
> |>                      - two things
> |>              o last item
> |>
> |>      and if that doesn't work, we need another list
> |>
> |>              o of which this is the first item
> |>              o and this is the second item
> |>
> |>                and this is its second paragraph
> |>
> |> This is a comment from experience of doing something similar within the company
> |> that pays me - we don't support the sublist or the multiple paragraphs in a list,
> |> or the two lists in one documentation, and all of these are fairly horrible
> |> restrictions.
> This can be added in the future without difficulty. 

Hm.  With setext?

> Although I don't expect
> doc strings to get that elaborate. Doc strings aren't intended to replace
> a formal user's guide in the case of complicated applications.

I disagree.  I'd rather write my documentation in my module, so that I
have *one* source.
> |>
> |> So - the motto is, one needs to (a) say "start a list" and (b) say "end a list"
> |> to have proper support.
> OK please propose something specific.

See my StructuredText module.  Note that with StructuredText, no extra
markup is needed.  Rather, indentation indicates structure, as is most
natural for humans and computers alike.
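
To make the indentation idea concrete, here is a minimal sketch of how
indentation alone can recover nested structure such as the sublist example
quoted earlier. It is an illustration only, not the StructuredText module's
actual code or API; the function name parse_indented and the (text, children)
node shape are assumptions made for this example.

```python
def parse_indented(text):
    """Turn indented lines into a nested tree of (text, children) nodes.

    Hypothetical helper for illustration; not part of StructuredText.
    """
    root = []
    # Stack of (indent, children_list); the root sits below any real indent.
    stack = [(-1, root)]
    for raw in text.expandtabs().splitlines():
        if not raw.strip():
            continue  # blank lines carry no structure in this sketch
        indent = len(raw) - len(raw.lstrip())
        # Pop back out to the nearest enclosing (less indented) level.
        while stack[-1][0] >= indent:
            stack.pop()
        node = (raw.strip(), [])
        stack[-1][1].append(node)
        stack.append((indent, node[1]))
    return root


sample = """\
This is an initial list:
    o first item
    o third item is a sublist
        - which contains
        - two things
    o last item
"""
tree = parse_indented(sample)
```

No "start a list" or "end a list" marker is needed here: a list ends when the
indentation drops back. Multiple paragraphs inside one item would need the
blank-line handling that this sketch deliberately skips.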

> |>
> |> You don't say how all those funny characters get inserted into the text when you
> |> want them (and not their tag meaning). And this <em>will</em> lead to
> |> problems with people forgetting whatever escape sequence/mechanism you choose.
> The typotags have fairly rigid regular expressions and I feel it's unlikely
> that they get confused. I've been using them for several weeks and it hasn't
> been a difficulty at all. Time will help in refining the regex patterns.
> |>
> |> OK - that was the first point - that the system you propose is not complex enough
> |> (and that's a response on almost no thought).
> Something complex is exactly what we didn't want. I want to encourage
> people to use these tags. The best way of doing that is make it very easy and
> logical, otherwise people won't bother to use it.
> |>
> |> My second point is actually to argue against this exact approach - this is
> |> unfortunately presumably repeating an argument in the SIG (which I was sure I
> |> had signed up to ages ago - obviously not, which explains why it has been so
> |> silent). I would strongly suggest that you are <em>not</em> going to come up
> |> with solutions for all the formatting that users want, and sometimes need.
> So. That's not the intent. What's the harm in having a simple, workable system
> now?

I think we can have our cake and eat it too.  Fairly complex formatting
can be achieved with simple human readable ascii source.
> |> It is also very likely, with an ad hoc design like this, that one will "paint
> |> oneself into a corner" in some way - the obvious example of this is if it is
> |> decided later that a new typotag is needed, but, oh dear, choosing any new
> |> character will break existing doc strings. (This actually ties in with the
> |> comments on the main list recently, saying essentially "I'm not terribly keen
> |> on X, Y or Z in Python, but when I try to work round it, I get trouble - gosh,
> |> language design is difficult, and Guido is clever".) Designing a new markup
> |> language (which is what you are doing) <em>is</em> difficult. Designing one
> |> that can be future proof is even harder.
> |>
> As the preamble states this is not an ad hoc design.  And extending the typotags
> in the future shouldn't be an issue. I see no reason why adding a new format
> would break anything. Doc strings don't break, they just may not pick up
> correctly in the autogeneration tool(s), that's all. The worst thing I've
> seen so far is that setext wasn't marked as expected, but that's hardly a
> crisis and is easily fixed.
> |> When the company that pays me was designing its documentation comments, we were
> |> using TeX for documentation. I strongly argued that we should use TeX commands
> |> as the markup, rather than trying to grow our own - after all, there was a lot
> |> more experience in there. I lost then.
> |>
> |> Nowadays, I would recommend HTML - it is simpler than TeX (and how!), easier to
> |> use, easier to `read' when looking at source code, and browsers are easier to
> |> maintain than TeX. The "everywhereness" of HTML means that most programmers
> |> will have written it (especially with so many Python apps seeming to be net
> |> related these days!). The only change I might make would be to allow blank
> |> lines to signify "<p>" - ie, new paragraph - since this seems reasonably
> |> intuitive - but I can argue against that.
> Setext grew out of the need to avoid the ugliness of HTML. I maintain that
> HTML markup is too verbose for doc strings, and clutters the reading of them
> in code. Especially for hyperlinks. I found the reference approach of setext a
> far sight better than placing junk like this in
> <A HREF="http://www.python.org/sigs/doc-sig/">doc strings</A>. Instead, I can
> use doc_strings_ which flow uninterrupted by the http reference.
> .. _doc_strings http://www.python.org/sigs/doc-sig/
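
As an illustration of the reference convention quoted above, here is a rough
sketch of how a tool might turn a trailing-underscore word plus its
".. _name URL" line into an HTML anchor. The two regular expressions and the
function name resolve_references are assumptions modelled only on the example
in this message; they are not taken from the setext specification or from any
existing tool.

```python
import re

# Hypothetical patterns based on the quoted example, not on the full
# setext rules.
TARGET_RE = re.compile(r"^\.\. _(\w+) (\S+)\s*$", re.MULTILINE)
REF_RE = re.compile(r"\b([A-Za-z]\w*)_(?!\w)")


def resolve_references(text):
    """Replace name_ references with anchors built from '.. _name URL' lines."""
    targets = dict(TARGET_RE.findall(text))
    # Drop the definition lines so only the readable prose remains.
    body = TARGET_RE.sub("", text)

    def link(match):
        name = match.group(1)
        url = targets.get(name)
        if url is None:
            return match.group(0)  # no definition found: leave text alone
        label = name.replace("_", " ")
        return '<A HREF="%s">%s</A>' % (url, label)

    return REF_RE.sub(link, body).strip()


doc = (
    "See the doc_strings_ for details.\n"
    ".. _doc_strings http://www.python.org/sigs/doc-sig/\n"
)
html = resolve_references(doc)
```

A word ending in an underscore that has no matching definition is left
untouched, which matches the point made earlier in the thread that unmarked
text simply fails to pick up formatting rather than breaking the doc string.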

It is critical IMHO that doc strings *not* contain explicit markup.
They should be easily human readable, while supporting clever auto-markup.

> |>
> |> And HTML seems to support the right sort of subset of markup - it shouldn't be
> |> hard to translate into n/troff, [La]TeX, Digital Runoff, whatever else one
> |> wants. Oh, and it copes with URLs just fine.
> |>
> |> (it also gives you more than one style of list, especially if you use the
> |> Netscape "type" extension).
> |>
> |> Is it harder to read than the setext proposal? Obviously a matter of opinion
> |> (ref the arguments about indentation in Python!) - but many of the setext
> |> things are arbitrary anyway (why not use "*" for `emphasis', not "**"? that
> |> would fit with the normal email/ASCII convention? - oh, one can use "@" or
> |> something for lists if one must). And you only give bold or italic - why not
> |> give us "emphasis" which is presented in a device independent manner?

I'm not wild about setext myself.

> OK I will :-)
> In summary I remain unconvinced that any of the above mentioned issues are
> problems with using setext in doc strings.

I agree


Jim Fulton         Digital Creations
jim@digicool.com   540.371.6909
## Python is my favorite language ##
##     http://www.python.org/     ##

DOC-SIG  - SIG for the Python Documentation Project

send messages to: doc-sig@python.org
administrivia to: doc-sig-request@python.org