[Hi doc-sig folks. This conversation has been brewing in xml-sig, but I think you may find it relevant.]
I am trying to learn XML by developing my own tools. This has introduced me to some of the more subtle aspects of XML and has caused me to revise my opinion of what XML is. (Actually, I think my XML 'evangelists' don't really know exactly what XML is and are abusing it by proposing it for object databases and other somewhat inappropriate uses.) I see XML strictly as a way of marking up a 'document' to expose its structure and semantics. Sure, documents are trees, like databases, but that doesn't necessarily imply XML is a good way to implement a database. (Basic necessities such as query languages don't exist yet.)
In my experience it's really awkward to author material using XML/SGML-based notations. And even with the best XML-authoring tool that I can conceive, there's still the problem of reading and revising the content later. To me, even the best of XML documents are just plain *ugly*. XML makes sense as a data interchange format, but *not* as an authoring format or as a database format.
It isn't ideal, but I've got used to typing in XML. Maybe that's why I don't have so many of the typical first-glance reactions anymore, and I don't even remember having had them. When I'm hacking at XML, I'm usually thinking in that mode, and I make liberal use of Emacs shortcuts to get things done swiftly. Of course this will never do for most normal people, and that is why no-one here has been advocating that Python programmers type their docs in straight XML. All I was pursuing was an XML format for the eventual product; user-friendly tools such as you describe for RSS would come along easily enough.
In the case of software documentation, XML could play a role in providing a basis for a family of DTDs used in sharing useful information about programs, such as consolidating reference material from multiple sources (and source languages) into a common catalog or repository. The actual documentation source would be designed to be convenient to the authors and maintainers of the programs, and translated to the common XML DTDs as necessary.
One related example is the "RSS" format being promulgated by UserLand.com and Netscape [1]. It's an XML DTD for publishing short news items and weblogs with links to extended material. Getting to my point... one of the first things RSS authors did was write scripts to generate the RSS/XML code from source written in a simple structured-text format [2]. This is likely to be typical: frequent users will create their own mini-language tailored to their tasks, and will use scripts to generate the XML from that source.
[1] <http://my.netscape.com/publish/help/mnn20/quickstart.html>
[2] <http://my.userland.com/stories/storyReader$14>
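To make the "script that generates XML from a mini-language" idea concrete, here is a minimal sketch in Python. The input format is purely an assumption for illustration (blank-line-separated blocks: title on the first line, link on the second, any remaining lines as the description); it is not the format described in [2]. The function names parse_items and to_rss are likewise hypothetical, and the output is only RSS-like: it has not been checked against the actual RSS DTD.

```python
import re
import xml.etree.ElementTree as ET


def parse_items(text):
    """Parse the assumed mini-language into (title, link, description) tuples.

    Assumed format (hypothetical): items are separated by blank lines;
    line 1 is the title, line 2 the link, further lines the description.
    """
    items = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if len(lines) < 2:
            continue  # skip blocks that lack a title or a link
        title, link = lines[0], lines[1]
        description = " ".join(lines[2:])
        items.append((title, link, description))
    return items


def to_rss(channel_title, channel_link, items):
    """Build an RSS-like XML string from the parsed items."""
    rss = ET.Element("rss", version="0.91")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    for title, link, description in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = link
        if description:
            ET.SubElement(item, "description").text = description
    # ElementTree escapes characters such as & and < in element text
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    source = """\
First story
http://example.com/1
A short summary of the first story.

Second story
http://example.com/2
"""
    print(to_rss("Example channel", "http://example.com/", parse_items(source)))
```

The point of the sketch is the division of labour: the author writes and revises the plain-text source, and the XML exists only as generated output.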
I should note that the explosion of mini-languages is probably not a good thing. I don't think it's a matter for language conversion; I think it's a case for intelligent tools. I just read through the recent doc-sig archives, and a relevant discussion has been taking place there. What I conclude is that there will be no solution to meet all tastes, so this is the best we can hope for:

1) Perhaps a single structured, "normal" format for Python doc-strings that can be exported to XML. Perhaps this would be some variant of John Day's RML markup. Some people would ignore this facility, and I would be one of them.

2) A nice tool (perhaps Web-based) for generating [X|SG]ML docs from source code. My idea is that it could read Python source and put up a form, or interactively ask the programmer to fill in the fields corresponding to the parts that need to be documented. This is the approach I would choose to use.

3) A DTD or schema for storing *ML docs for Python modules, regardless of whether they were generated by method (1) or (2) above. It would also be used by the direct-to-*ML folks using their favorite intelligent editor.

4) A tool that could map the *ML docs back to doc-strings and comments embedded within the Python code, for amateurs of method (1).

5) A tool that could generate links from code doc-strings and comments to relevant parts of the separate *ML docs, for people who prefer their docs separate but accessible (probably amateurs of method (2), myself included).

I don't think the above tools would be too hard to write. FourThought has already made a small start on (2), (3) and (5). And these tools would appear to satisfy all the preferences I've observed so far.

--
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com    (970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com    http://OpenTechnology.org