[XML-SIG] Re: PyDoc/XML?

Wed, 29 Sep 1999 12:43:04 -0400

[hi doc-sig folks.  This is a conversation that has been brewing in xml-sig, 
but I think you may find it relevant]

> > I am trying to learn XML by developing my own tools. This has introduced
> > me to some of the more subtle aspects of XML and has caused me to revise
> > my opinion of what XML is. (Actually, I think my XML 'evangelists' don't
> > really know exactly what XML is and are abusing it by proposing it for
> > object databases and other somewhat mis-appropriate uses). I see XML
> > strictly as a way of marking up a 'document' to expose its structure and
> > semantics. Sure, documents are trees, like databases, but that doesn't necessarily
> > imply XML is a good way to implement a database. (Basic necessities such
> > as query languages don't exist, yet).
> 
> In my experience it's really awkward to author material using
> XML/SGML-based notations.  And even with the best XML-authoring tool
> that I can conceive, there's still the problem of reading and revising
> the content later.  To me, even the best of XML documents are just
> plain *ugly*.  XML makes sense as a data interchange format, but *not*
> as an authoring format or as a database format.

It isn't ideal, but I've got used to typing in XML.  Maybe that's why I don't 
have so many of the typical first-glance reactions anymore, and I don't even 
remember having had them.  When I'm hacking at XML, I'm usually thinking in 
that mode, and I make liberal use of Emacs short-cuts to get things done 
swiftly.

Of course this will never do for most normal people, and that is why no-one 
here has been advocating that Python programmers type their docs in straight 
XML.  All I was pursuing was an XML format for the eventual product, and 
user-friendly tools such as you describe for RSS would come along easily 
enough.

> In the case of
> software documentation, XML could play a role in providing a basis for
> a family of DTDs used in sharing useful information about programs,
> such as consolidating reference material from multiple sources (and
> source languages) into a common catalog or repository.  The actual
> documentation source would be designed to be convenient to the authors
> and maintainers of the programs, and translated to the common XML DTDs
> as necessary.
> 
> One related example is the "RSS" format being promulgated by
> UserLand.com and Netscape [1].  It's an XML DTD for publishing short
> news items and weblogs with links to extended material.  Getting to my
> point... one of the first things RSS authors did is write scripts to
> generate the RSS/XML code from source written in a simple
> structured-text format [2].  This is likely to be typical:  frequent
> users will create their own mini-language tailored to their tasks, and
> will use scripts to generate the XML from that source.
> 
> [1] <http://my.netscape.com/publish/help/mnn20/quickstart.html>
> [2] <http://my.userland.com/stories/storyReader$14>
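
To make the "script that generates the XML" idea concrete, here is a minimal 
sketch in Python.  The input format (blank-line-separated records: a title 
line, a link line, then optional description lines) and the function name 
items_to_rss are made up for illustration; this is not the format or script 
described at [2], and the output is only RSS-like, not checked against the DTD.

```python
import re
import xml.etree.ElementTree as ET


def items_to_rss(text, channel_title="Example channel"):
    """Turn a hypothetical structured-text format into RSS-like XML.

    Assumed input: records separated by blank lines, where line 1 is the
    title, line 2 is the link, and any remaining lines are the description.
    """
    rss = ET.Element("rss", version="0.91")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = channel_title

    # Split on blank lines (which may contain stray whitespace).
    for record in re.split(r"\n\s*\n", text.strip()):
        lines = [ln.strip() for ln in record.splitlines() if ln.strip()]
        if len(lines) < 2:
            continue  # skip records without both a title and a link
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = lines[0]
        ET.SubElement(item, "link").text = lines[1]
        if len(lines) > 2:
            ET.SubElement(item, "description").text = " ".join(lines[2:])

    return ET.tostring(rss, encoding="unicode")
```

The point is only that the author types the plain-text records and never 
touches the angle brackets.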

I should note that the explosion of mini-languages is probably not a good 
thing.  I don't think it's a matter for language conversion; I think it's a 
case for intelligent tools.  I just read through the recent doc-sig archives 
and a relevant discussion has been taking place there.  What I conclude is 
that there will be no solution to meet all tastes, so this is the best we can 
hope for:

1) Perhaps a single structured, "normal" format for Python doc-strings that 
can be exported to XML.  Perhaps this would be some variant of John Day's RML 
markup.  Some people would ignore this facility, and I would be one of them.

2) A nice tool (perhaps Web-based) for generating [X|SG]ML docs from source 
code.  My idea is that it could read Python source and put up a form or 
interactively ask the programmer to fill in the fields corresponding to the 
parts that need to be documented.  This is the approach I would choose to use.

3) A DTD or schema for storing *ML docs for Python modules regardless of 
whether they were generated from method (1) or (2) above.  It would also be 
used by the direct-to-*ML folks using their favorite intelligent editor.

4) A tool that could map the *ML docs back to doc-strings and comments 
embedded within the Python code for amateurs of method (1).

5) A tool that could generate links from code doc-strings and comments to 
relevant parts of the separate *ML docs for people who prefer their docs 
separate but accessible (probably amateurs of method (2), myself included).
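
As a rough illustration of the first step behind (2) and (3), the sketch below 
reads Python source and emits an XML skeleton with one entry per function or 
class, marking the ones that still need documentation.  The element and 
attribute names (module, function, class, doc, undocumented) and the function 
name module_doc_skeleton are assumptions for this example only; they are not 
the FourThought DTD or any agreed doc-sig schema.

```python
import ast
import xml.etree.ElementTree as ET


def _add_doc(elem, node):
    """Attach the node's docstring, or flag the element as undocumented."""
    doc = ast.get_docstring(node)
    if doc:
        ET.SubElement(elem, "doc").text = doc
    else:
        # Hypothetical marker a form-based tool could use to prompt the author.
        elem.set("undocumented", "yes")


def module_doc_skeleton(source, module_name="example"):
    """Build a flat XML skeleton describing the definitions in `source`."""
    tree = ast.parse(source)
    root = ET.Element("module", name=module_name)
    _add_doc(root, tree)

    # ast.walk visits every node; methods are listed flat alongside
    # top-level functions to keep the sketch short.
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            kind = "class" if isinstance(node, ast.ClassDef) else "function"
            elem = ET.SubElement(root, kind, name=node.name, line=str(node.lineno))
            _add_doc(elem, node)
    return root


if __name__ == "__main__":
    sample = 'def f():\n    "Docs for f."\n    return 1\n\nclass C:\n    pass\n'
    print(ET.tostring(module_doc_skeleton(sample, "demo"), encoding="unicode"))
```

A form-based front end as in (2) would then only need to walk the entries 
flagged as undocumented and ask the programmer to fill them in.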

I don't think the above tools would be too hard to write.  FourThought has 
already made a small start on (2), (3) and (5).  And these tools would appear 
to satisfy all the preferences I've observed so far.

-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org