[DOC-SIG] Comparing SGML DTDs

Guido van Rossum guido@CNRI.Reston.Va.US
Wed, 12 Nov 1997 12:22:11 -0500


[me]
> > I'm sorry, but I'm the only one who can decide to move the tutorial
> > anywhere.  I have already expressed my sentiments towards having to
> > enter SGML manually.  Please concentrate on the task at hand,
> > e.g. the library reference manual.

[paul prescod]
> I must admit to being totally confused on several issues:
> 
>  * The title of the thread is "notes on the tutorial's markup" -- isn't
> the tutorial then the task at hand? I'm having trouble reconstructing
> the thread. I thought that was proposed as a test case by Andrew
> Kuchling. I also thought that Andrew was the tutorial maintainer (from
> some old messages in the thread). You can see how I came to the
> conclusion that he had the (moral) authority to move it to SGML.

Andrew is not the tutorial maintainer -- he has helped me tremendously
with reorganizing, but I still consider myself the author, and I think
that Andrew agrees with this view.

The thread started out in the PSA list with the subject "Python
Documentation Idea"; that thread definitely started off with doc
strings in mind (for example, there's a whole discussion on whether
gendoc is or isn't broken in that thread).

You were the first to bring up XML/SGML in that thread:

| Subject: Re: [PSA MEMBERS] Python Documentation Idea
| From: Paul Prescod <papresco@technologist.com>
| To: "psa-members@python.org" <psa-members@python.org>
| Date: Sat, 08 Nov 1997 07:06:24 -0500
| 
| Jeff Rush wrote:
| > 
| > It sounds like people want very different things.  Some want Tex output to
| > print (John S.  Zhao), some want lots of HTML files and some want an
| > interactive, command-line browser system (S. Hoon Yoon).  Actually, I'm
| > closer to Yoon's idea, but more of a hypertext reference manual and less
| > like a traditional class browser.
| 
| It is important to note that there are two standards designed
| specifically to address these three problems (and more). They are SGML,
| the international standard, and XML, the simple subset designed
| specifically for the World Wide Web by W3C, the same people who design
| new versions of HTML and other Web specifications. As an SGML/XML
| consultant and advocate, I naturally think that XML should be adopted,
| but I also think that XML makes perfect sense to someone who looks at
| the system in an unbiased way.
[...]

This still clearly refers to the library manual if you ask me; Andrew
Kuchling replied the same day with

| 	Writing new library reference documentation is something of a
| pain, because there's no special formatting for optional arguments (in
| general), default values, or keyword arguments.  Usually the
| information about default arguments and the like is simply placed in
| the text, but it's not very interesting to write or to read, and isn't
| understandable at a glance.  An XML-based scheme would make this
| information available for special formatting, and to programming
| environments.

Later (on Tue Nov 11) Andrew changed direction and changed the subject
to "[XML] Notes on the Tutorial's markup".  This was the first
crosspost to the doc sig.  I must have
missed the change of subject -- in my mind it was still about the
library and doc strings.

>  * Don't you (Guido) also have to edit the library reference manual? If
> SGML is not appropriate for the tutorial, wouldn't it be similarly
> inappropriate for the reference manual? I would have thought that the
> library reference manual is edited a lot more often than the tutorial,
> in fact. Or are you proposing that we use SGML as a step in a change
> from some TeX variant?

The difference is that the library *really* needs more structure than
latex can provide, so I'm open to suggestions.  It also has many
contributed chapters -- ideally, anybody contributing source code
should also contribute a corresponding library section.

I think that SGML is not fit to be typed by humans -- especially since
it has so many special characters that conflict with characters that
are significant in Python code.  (SGML was designed to be typed by
humans in the age of punched cards.)  Latex has the same problem
(especially the underscore is painful).  I think something else should
be used that can be converted to SGML (or XML for all I care).  TIM,
which has only one magic character (@, which isn't used in Python)
fits the bill -- it did one or two years ago when I looked into it, and
it's only because of inertia (and a lot of other things that needed to
happen sooner) that I haven't started using it.
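
To make the special-character conflict concrete, here is a minimal
sketch.  It assumes the html.escape function from the present-day
Python standard library, which is used purely for illustration; the
helper name escape_for_markup is hypothetical.  It shows how an
ordinary line of Python has to be rewritten before it can sit inside
SGML- or XML-style markup:

```python
import html

def escape_for_markup(python_source: str) -> str:
    """Replace the markup-significant characters &, < and > with entities.

    Illustrative helper only; quote=False leaves quotation marks alone.
    """
    return html.escape(python_source, quote=False)

line = "if a < b and flags & MASK:"
escaped = escape_for_markup(line)
print(escaped)
```

Someone typing the markup by hand would have to make these
substitutions in every code sample.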

>  * Does it make sense to think about the library reference in
> isolation? Do we really plan to keep the two docs in different formats
> indefinitely? I'm not sure I would promote SGML if it means that forever
> after people will have to install two different tool chains to process
> the various Python docs. Or do the same people usually not work on the
> different documents?

Yes, I think the library reference is a separate project from the
tutorial.  I am planning to do the tutorial in FrameMaker because it
gives me as an author the best user interface for editing and the most
freedom to create nice layout, and because it is essentially a
one-author document it's no problem that not everybody can afford
FrameMaker (as long as I can generate HTML and PostScript, which I can
-- and there's even a version of Frame that can generate SGML although
I don't have it).  (Now that I've got a PC at home I may switch to MS
Word too -- that's surely democratic :-)

>  * What exactly is the concern about SGML? From what I have seen, SGML
> markup can be fully as <emph/minimal/ as @prod{TeX} variants. I'm afraid
> that XML and HTML give people the wrong impression that SGML must be
> verbose and use redundant end tags.

I just don't like the fact that SGML makes characters that occur
frequently in Python source code like "<" and "/" special.  I also
dislike that SGML parsers supporting the full syntax are costly,
either in money or in resources (few sites that I know have an SGML
parser installed already; sgmllib.py doesn't cut it).  TIM, on the other
hand, was *designed* to be trivial to parse, so you can quickly write
a small Python script that converts it to any format you like.
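
As a rough sketch of what such a script might look like: the code
below assumes a simplified, hypothetical syntax of the form
@name{text} with "@" as the only magic character.  This is not
claimed to be TIM's actual grammar, and the tag names and the convert
function are invented for illustration.  It escapes the
markup-significant characters first, then rewrites each command as a
pair of tags, innermost first:

```python
import html
import re

# Hypothetical command form: @name{body}, where body contains no braces.
COMMAND = re.compile(r"@(\w+)\{([^{}]*)\}")

def convert(text: str) -> str:
    """Turn @name{body} into <name>body</name>, handling nesting.

    Characters such as < and & are escaped before any tags are produced,
    so authors can type them literally in the source.
    """
    text = html.escape(text, quote=False)
    previous = None
    # Repeat until a pass makes no change; each pass resolves the
    # innermost commands, exposing the next level of nesting.
    while previous != text:
        previous = text
        text = COMMAND.sub(r"<\1>\2</\1>", text)
    return text

sample = "Use @code{a < b} inside @emph{any @code{module}}."
result = convert(sample)
print(result)
```

The whole converter is one regular expression and a loop, and the
author never has to escape "<" by hand.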

--Guido van Rossum (home page: http://www.python.org/~guido/)

_______________
DOC-SIG  - SIG for the Python Documentation Project

send messages to: doc-sig@python.org
administrivia to: doc-sig-request@python.org
_______________