[Doc-SIG] XML Conversion Update

Scott Cotton scott@chronis.pobox.com
Thu, 26 Aug 1999 17:55:23 -0400

On Thu, Aug 26, 1999 at 05:14:29PM -0400, Fred L. Drake, Jr. wrote:
>   Last week I promised on the Python list to describe the current
> status of the conversion to SGML/XML.  Here it is!
>   I'm currently able to parse all the LaTeX markup and generate either 
> XML or SGML.  The structure of the output is very similar to the input 
> structure, but a number of minor improvements are made.  The
> improvements are mostly very localized and have more to do with
> keeping the markup from getting too complex and disjointed, and
> eliminating some bogosities.


>   I am not at all decided on a DTD to use.  I see three options:
>   1.  DocBook -- this has been developed and heavily use-tested by a
>       number of corporate users, and is known to be good for technical 
>       documentation.  There are tools and stylesheets available to
>       convert from DocBook to HTML and printed formats.  We'd probably 
>       need to specialize it, but it's designed for that.  Konrad
>       Hinsen has already developed one customization that he's using
>       to document Python modules, and there's an initiative to create
>       a common extension for documenting OO constructs.  I've asked
>       Konrad for some sample documentation so I can see how it
>       actually works out.  My concern with DocBook is that the markup
>       may be a bit on the "heavy" side; I don't want the document
>       source to be so markup-heavy that I'm the only one to work on
>       them.

I'm not a fan of this, since it could limit the contributors to those
willing to learn DocBook, which, at a glance, looks much more complicated
than learning a standard way to produce Python docs.

>   2.  Create something similar to what we had in LaTeX, but with fewer 
>       warts.  This is appealing because the conversion would be done
>       sooner.  However, new stylesheets would be needed, slowing down
>       the usefulness of the result.  It would also be the easiest to
>       adopt for people already familiar with the current markup.

This sounds appealing.


>   I'd like to see some discussion on what should be done and what
> needs to be done.  From where I sit, the most important thing is to
> make sure we can maintain a high level of semantic markup (hopefully
> making further improvements over what we already have), with
> generation of hypertext (HTML, info, whatever) being the next most
> important thing.  Typeset documents are a requirement, but aren't as
> high up the list.

From my perspective, what's most important is a *simple*, well-documented
and authoritative documentation markup.  The more people who can easily
produce docs for new code, the more documentation there will be, and a
standard would facilitate sharing more documentation in everyone's favorite

With some kind of flexible-but-not-too-complex DTD, I'd probably work on
producing Python docs in all the formats that I'd like to see, such as vim
tags and man pages.  (Not that I liked the recent rant about the latter on
c.l.p, but I would use these formats, and produce or help produce them, if
the DTD structure is simple and the authoritative text is easy to parse.)
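
To make the vim tags idea concrete, here is a minimal sketch of what I have
in mind.  The element and attribute names (module, funcdesc, name) are made
up for illustration; no DTD has been chosen, so treat them as assumptions
rather than a proposal.  The parsing uses xml.etree.ElementTree from the
Python standard library.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: <module> and <funcdesc> are assumed names,
# not part of any agreed-upon DTD.
SAMPLE = """\
<module name="spam">
  <funcdesc name="eggs">Return some eggs.</funcdesc>
  <funcdesc name="bacon">Return some bacon.</funcdesc>
</module>
"""


def to_tags(xml_text, filename):
    """Return sorted vim-style tags lines for each documented function.

    Each line has the form: tagname <TAB> filename <TAB> /pattern/
    """
    root = ET.fromstring(xml_text)
    module = root.get("name")
    lines = []
    for func in root.iter("funcdesc"):
        name = func.get("name")
        # The search pattern is simply the function name; a real tool
        # would likely want something more specific.
        lines.append("%s.%s\t%s\t/%s/" % (module, name, filename, name))
    # vim expects the tags file to be sorted by tag name.
    return sorted(lines)


if __name__ == "__main__":
    for line in to_tags(SAMPLE, "spam.xml"):
        print(line)
```

The point is only that with a flat, predictable structure the whole
converter stays this small; a heavier DTD would need considerably more
code to walk.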