[PYTHON DOC-SIG] Re: Setext followup (Re: Automatic Documentation)
Before we get started let me comment that setext isn't our idea but rather an adoption of an existing protocol to accomplish what we thought was a common goal: namely, to come up with a lightweight, unobtrusive markup standard which could be used in Python doc strings. The markup would be sufficient to allow automatic documentation tools to generate nice output in several common documentation formats. The setext protocol originated with Tony Sanders several years ago. See http://www.BSDI.COM/setext/

|> From tibbs@geog.gla.ac.uk Tue Jul 30 06:38 CDT 1996
|> X-Mailer: exmh version 1.6.7 5/3/96
|> To: Robin Friedrich <friedric@rose.rsoc.rockwell.com>
|> From: Tony J Ibbs (Tibs) <gaga50@udcf.gla.ac.uk>
|> Reply-To: gaga50@udcf.gla.ac.uk
|> Subject: Re: Setext followup (Re: Automatic Documentation)
|> Mime-Version: 1.0
|> Date: Tue, 30 Jul 96 12:39:07 +0100
|> Sender: tibbs@geog.gla.ac.uk
|> X-Mts: smtp
|>
|> Well, whilst I'm glad to see this is being worked on, I do have a couple of
|> comments.
|>
|> Doing this backwards, firstly, comments on the proposed typotags.
|>
|> Don't forget that people will want to do more complex lists than you
|> seem to be coping with at the moment.
|>
|> For instance:
|>
|> This is an initial list:
|>
|> o first item
|> o second item
|> o third item is a sublist
|>   - which contains
|>   - two things
|> o last item
|>
|> and if that doesn't work, we need another list
|>
|> o of which this is the first item
|> o and this is the second item
|>
|>   and this is its second paragraph
|>
|> This is a comment from experience of doing something similar within the company
|> that pays me - we don't support the sublist or the multiple paragraphs in a
|> list, or the two lists in one documentation, and all of these are fairly
|> horrible restrictions.

This can be added in the future without difficulty. Although I don't expect doc strings to get that elaborate. Doc strings aren't intended to replace a formal user's guide in the case of complicated applications.

|>
|> So - the motto is, one needs to (a) say "start a list" and (b) say "end a list"
|> to have proper support.

OK please propose something specific.

|>
|> You don't say how all those funny characters get inserted into the text when you
|> want them (and not their tag meaning). And this <em>will</em> lead to
|> problems with people forgetting whatever escape sequence/mechanism you choose.

The typotags have fairly rigid regular expressions and I feel it's unlikely that they get confused. I've been using them for several weeks and it hasn't been a difficulty at all. Time will help in refining the regex patterns.

|>
|> OK - that was the first point - that the system you propose is not complex enough
|> (and that's a response on almost no thought).

Something complex is exactly what we didn't want. I want to encourage people to use these tags. The best way of doing that is to make it very easy and logical, otherwise people won't bother to use it.

|>
|> My second point is actually to argue against this exact approach - this is
|> unfortunately presumably repeating an argument in the SIG (which I was sure I
|> had signed up to ages ago - obviously not, which explains why it has been so
|> silent). I would strongly suggest that you are <em>not</em> going to come up
|> with solutions for all the formatting that users want, and sometimes need.

So. That's not the intent. What's the harm in having a simple, workable system now?

|> It is also very likely, with an ad hoc design like this, that one will "paint
|> oneself into a corner" in some way - the obvious example of this is if it is
|> decided later that a new typotag is needed, but, oh dear, choosing any new
|> character will break existing doc strings. (this actually ties in with the
|> comments on the main list recently, saying essentially "I'm not terribly keen
|> on X, Y or Z in Python, but when I try to work round it, I get trouble - gosh,
|> language design is difficult, and Guido is clever"). Designing a new markup
|> language (which is what you are doing) <em>is</em> difficult. Designing one
|> that can be future proof is even harder.
|>

As the preamble states this is not an ad hoc design. And extending the typotags in the future shouldn't be an issue. I see no reason why adding a new format would break anything. Doc strings don't break, they just may not be picked up correctly by the autogeneration tool(s), that's all. The worst thing I've seen so far is that setext wasn't marked as expected, but that's hardly a crisis and is easily fixed.

|> When the company that pays me was designing its documentation comments, we were
|> using TeX for documentation. I strongly argued that we should use TeX commands
|> as the markup, rather than trying to grow our own - after all, there was a lot
|> more experience in there. I lost then.
|>
|> Nowadays, I would recommend HTML - it is simpler than TeX (and how!), easier to
|> use, easier to `read' when looking at source code, and browsers are easier to
|> maintain than TeX. The "everywhereness" of HTML means that most programmers
|> will have written it (especially with so many Python apps seeming to be net
|> related these days!). The only change I might make would be to allow blank
|> lines to signify "<p>" - ie, new paragraph - since this seems reasonably
|> intuitive - but I can argue against that almost equally well.

Setext grew out of the need to avoid the ugliness of HTML. I maintain that HTML markup is too verbose for doc strings, and clutters the reading of them in code. Especially for hyperlinks. I found the reference approach of setext a far sight better than placing junk like this in <A HREF="http://www.python.org/sigs/doc-sig/">doc strings</A>. Instead, I can use doc_strings_ which flow uninterrupted by the http reference.

.. _doc_strings http://www.python.org/sigs/doc-sig/

|>
|> And HTML seems to support the right sort of subset of markup - it shouldn't be
|> hard to translate into n/troff, [La]TeX, Digital Runoff, whatever else one
|> wants. Oh, and it copes with URLs just fine.
|>
|> (it also gives you more than one style of list, especially if you use the
|> Netscape "type" extension).
|>
|> Is it harder to read than the setext proposal? Obviously a matter of opinion
|> (ref the arguments about indentation in Python!) - but many of the setext
|> things are arbitrary anyway (why not use "*" for `emphasis', not "**"? that
|> would fit with the normal email/ASCII convention? - oh, one can use "@" or
|> something for lists if one must). And you only give bold or italic - why not
|> give us "emphasis" which is presented in a device independent manner?
|>
|> Please feel free to quote any of this - I will look into trying to join the
|> SIG again...
|>
|> Tibs
|> -------------------------------------------------------------------------------
|> Tony J Ibbs (Tibs)    [Funded by Laser-Scan]
|> Dept. of Geog. & Topo. Science, Uni. of Glasgow, GLASGOW G12 8QQ Scotland
|> Tel: (+44)141-330-6649  Fax: (+44)141-330-4894
|> Email: gaga50@udcf.gla.ac.uk - goes direct to here, but I may not be here
|>        tony@lsl.co.uk - forwards to where I am, normally works
|>        T.Ibbs@geog.gla.ac.uk - forwards via gaga50, but is prettier
|>
|>

OK I will :-)

In summary I remain unconvinced that any of the above mentioned issues are problems with using setext in doc strings.

-Robin Friedrich

=================
DOC-SIG - SIG for the Python Documentation Project

send messages to: doc-sig@python.org
administrivia to: doc-sig-request@python.org
=================
Robin Friedrich wrote:
Before we get started let me comment that setext isn't our idea but rather an adoption of an existing protocol to accomplish what we thought was a common goal: namely, to come up with a lightweight, unobtrusive markup standard which could be used in Python doc strings. The markup would be sufficient to allow automatic documentation tools to generate nice output in several common documentation formats. The setext protocol originated with Tony Sanders several years ago. See http://www.BSDI.COM/setext/
|> From tibbs@geog.gla.ac.uk Tue Jul 30 06:38 CDT 1996
|> X-Mailer: exmh version 1.6.7 5/3/96
|> To: Robin Friedrich <friedric@rose.rsoc.rockwell.com>
|> From: Tony J Ibbs (Tibs) <gaga50@udcf.gla.ac.uk>
|> Reply-To: gaga50@udcf.gla.ac.uk
|> Subject: Re: Setext followup (Re: Automatic Documentation)
|> Mime-Version: 1.0
|> Date: Tue, 30 Jul 96 12:39:07 +0100
|> Sender: tibbs@geog.gla.ac.uk
|> X-Mts: smtp
|>
|> Well, whilst I'm glad to see this is being worked on, I do have a couple of
|> comments.
|>
|> Doing this backwards, firstly, comments on the proposed typotags.
|>
|> Don't forget that people will want to do more complex lists than you
|> seem to be coping with at the moment.
|>
|> For instance:
|>
|> This is an initial list:
|>
|> o first item
|> o second item
|> o third item is a sublist
|>   - which contains
|>   - two things
|> o last item
|>
|> and if that doesn't work, we need another list
|>
|> o of which this is the first item
|> o and this is the second item
|>
|>   and this is its second paragraph
|>
|> This is a comment from experience of doing something similar within the company
|> that pays me - we don't support the sublist or the multiple paragraphs in a
|> list, or the two lists in one documentation, and all of these are fairly
|> horrible restrictions.
This can be added in the future without difficulty.
Hm. With setext?
Although I don't expect doc strings to get that elaborate. Doc strings aren't intended to replace a formal user's guide in the case of complicated applications.
I disagree. I'd rather write my documentation in my module, so that I only have *one* source.
|>
|> So - the motto is, one needs to (a) say "start a list" and (b) say "end a list"
|> to have proper support.
OK please propose something specific.
See my StructuredText module. Note that with StructuredText, no extra markup is needed. Rather, indentation indicates structure, as is most natural for humans and computers alike.
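As a rough illustration of how indentation alone can carry the list structure discussed above, the sketch below parses indented lines into a nested tree. It is not the StructuredText module itself; the function name parse_indented and the (text, children) node shape are assumptions made up for this example, and the sample is the list from the quoted message with its sublist indented by two spaces.

```python
def parse_indented(text):
    """Parse indented lines into a nested list of (text, children) tuples."""
    root = []
    # Stack of (indent, children) pairs; the sentinel indent -1 is never popped.
    stack = [(-1, root)]
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines carry no structure in this sketch
        indent = len(line) - len(line.lstrip(" "))
        node = (line.strip(), [])
        # Close every open level that is not shallower than this line.
        while stack[-1][0] >= indent:
            stack.pop()
        stack[-1][1].append(node)
        stack.append((indent, node[1]))
    return root


sample = """\
o first item
o second item
o third item is a sublist
  - which contains
  - two things
o last item
"""

tree = parse_indented(sample)
for text, children in tree:
    print(text, [child[0] for child in children])
```

No explicit "start a list" or "end a list" marker is needed here: a level ends as soon as a line appears at the same or a shallower indentation.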
|>
|> You don't say how all those funny characters get inserted into the text when you
|> want them (and not their tag meaning). And this <em>will</em> lead to
|> problems with people forgetting whatever escape sequence/mechanism you choose.
The typotags have fairly rigid regular expressions and I feel it's unlikely that they get confused. I've been using them for several weeks and it hasn't been a difficulty at all. Time will help in refining the regex patterns.
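To make the "fairly rigid regular expressions" point concrete, here is a minimal sketch of one strict pattern for a double-asterisk bold typotag. The pattern, the name mark_bold, and the <B> output are assumptions for illustration only, not the expressions actually used by the setext tools. The idea is that the tag must hug its text on both sides, so Python's own ** operator in a doc string is left alone.

```python
import re

# Hypothetical strict pattern: "**" must not touch a word character or another
# asterisk on the outside, and must touch non-space text on the inside.
BOLD = re.compile(r"(?<![\w*])\*\*(\S(?:[^*\n]*\S)?)\*\*(?![\w*])")


def mark_bold(text):
    """Replace **text** typotags with <B>text</B>, leaving other uses of ** alone."""
    return BOLD.sub(r"<B>\1</B>", text)


print(mark_bold("this is **very** important"))
print(mark_bold("returns x ** y, or 2**3 when called without arguments"))
```

The second call prints its input unchanged, because neither use of ** there satisfies both the inner and the outer boundary conditions.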
|>
|> OK - that was the first point - that the system you propose is not complex enough
|> (and that's a response on almost no thought).
Something complex is exactly what we didn't want. I want to encourage people to use these tags. The best way of doing that is to make it very easy and logical, otherwise people won't bother to use it.
|>
|> My second point is actually to argue against this exact approach - this is
|> unfortunately presumably repeating an argument in the SIG (which I was sure I
|> had signed up to ages ago - obviously not, which explains why it has been so
|> silent). I would strongly suggest that you are <em>not</em> going to come up
|> with solutions for all the formatting that users want, and sometimes need.
So. That's not the intent. What's the harm in having a simple, workable system now?
I think we can have our cake and eat it too. Fairly complex formatting *can* be achieved with simple human readable ascii source.
|> It is also very likely, with an ad hoc design like this, that one will "paint
|> oneself into a corner" in some way - the obvious example of this is if it is
|> decided later that a new typotag is needed, but, oh dear, choosing any new
|> character will break existing doc strings. (this actually ties in with the
|> comments on the main list recently, saying essentially "I'm not terribly keen
|> on X, Y or Z in Python, but when I try to work round it, I get trouble - gosh,
|> language design is difficult, and Guido is clever"). Designing a new markup
|> language (which is what you are doing) <em>is</em> difficult. Designing one
|> that can be future proof is even harder.
|>
As the preamble states this is not an ad hoc design. And extending the typotags in the future shouldn't be an issue. I see no reason why adding a new format would break anything. Doc strings don't break, they just may not be picked up correctly by the autogeneration tool(s), that's all. The worst thing I've seen so far is that setext wasn't marked as expected, but that's hardly a crisis and is easily fixed.
|> When the company that pays me was designing its documentation comments, we were
|> using TeX for documentation. I strongly argued that we should use TeX commands
|> as the markup, rather than trying to grow our own - after all, there was a lot
|> more experience in there. I lost then.
|>
|> Nowadays, I would recommend HTML - it is simpler than TeX (and how!), easier to
|> use, easier to `read' when looking at source code, and browsers are easier to
|> maintain than TeX. The "everywhereness" of HTML means that most programmers
|> will have written it (especially with so many Python apps seeming to be net
|> related these days!). The only change I might make would be to allow blank
|> lines to signify "<p>" - ie, new paragraph - since this seems reasonably
|> intuitive - but I can argue against that almost equally well.
Setext grew out of the need to avoid the ugliness of HTML. I maintain that HTML markup is too verbose for doc strings, and clutters the reading of them in code. Especially for hyperlinks. I found the reference approach of setext a far sight better than placing junk like this in <A HREF="http://www.python.org/sigs/doc-sig/">doc strings</A>. Instead, I can use doc_strings_ which flow uninterrupted by the http reference.
.. _doc_strings http://www.python.org/sigs/doc-sig/
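The reference style shown above can be resolved mechanically. The following is a minimal sketch, assuming only the two forms that appear in this message: a definition line of the shape ".. _name URL" and a trailing-underscore hot word such as doc_strings_ in running text. The function name resolve_links, both regular expressions, and the choice to turn underscores in the name into spaces for the link label are assumptions for illustration, not the behaviour of any existing setext tool.

```python
import re

# Definition lines of the form ".. _name URL" (as in the example above).
TARGET = re.compile(r"^\.\. _(\S+) (\S+)[ \t]*$", re.MULTILINE)
# Hot words: a name ending in a single trailing underscore, e.g. doc_strings_
HOTWORD = re.compile(r"\b([A-Za-z0-9_]*[A-Za-z0-9])_(?![A-Za-z0-9_])")


def resolve_links(text):
    """Replace name_ hot words with HTML anchors built from '.. _name URL' lines."""
    targets = dict(TARGET.findall(text))
    body = TARGET.sub("", text)  # drop the definition lines from the output

    def link(match):
        name = match.group(1)
        url = targets.get(name)
        if url is None:
            return match.group(0)  # no definition found: leave the text as written
        label = name.replace("_", " ")  # assumed convention for the visible label
        return '<A HREF="%s">%s</A>' % (url, label)

    return HOTWORD.sub(link, body).strip()


sample = (
    "Instead, I can use doc_strings_ in running text.\n"
    ".. _doc_strings http://www.python.org/sigs/doc-sig/\n"
)
print(resolve_links(sample))
```

The doc string itself stays readable in source form, and the verbose anchor only appears in the generated output.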
It is critical IMHO that doc strings *not* contain explicit markup. They should be easily human readable, while supporting clever auto-markup.
|>
|> And HTML seems to support the right sort of subset of markup - it shouldn't be
|> hard to translate into n/troff, [La]TeX, Digital Runoff, whatever else one
|> wants. Oh, and it copes with URLs just fine.
|>
|> (it also gives you more than one style of list, especially if you use the
|> Netscape "type" extension).
|>
|> Is it harder to read than the setext proposal? Obviously a matter of opinion
|> (ref the arguments about indentation in Python!) - but many of the setext
|> things are arbitrary anyway (why not use "*" for `emphasis', not "**"? that
|> would fit with the normal email/ASCII convention? - oh, one can use "@" or
|> something for lists if one must). And you only give bold or italic - why not
|> give us "emphasis" which is presented in a device independent manner?
I'm not wild about setext myself.
OK I will :-)
In summary I remain unconvinced that any of the above mentioned issues are problems with using setext in doc strings.
I agree

Jim

--
Jim Fulton        Digital Creations
jim@digicool.com  540.371.6909
## Python is my favorite language ##
##     http://www.python.org/     ##