[Doc-SIG] Approaches to structuring module documentation
Fred L. Drake, Jr.
fdrake@acm.org
Fri, 12 Nov 1999 16:01:25 -0500 (EST)
Manuel Gutierrez Algaba writes:
> Is this the LaTeX one ? or the "traditional" XML ?
I would describe the current approach as document-centric.
"Document-oriented" is how I was referring to content which was
naturally organized in documents, as opposed to data-structure-like
constructions such as my sample module reference.
The actual syntax wasn't specific to any of the three definitions.
> > DOCUMENT-CENTRIC APPROACH: The human-read document is the primary
>
> Is this TeEncontreX'es ? Are "module reference material" the
> "\indexpython" things ?
No, by this I meant the entire section documenting the module.
> > MICRODOCUMENT APPROACH: Multiple DTDs are used to encode
> > document-level information and module reference material. Let's only
>
> What's this ?
I'm not sure what "this" refers to; the term "microdocument
approach"? I'll be more specific:
Using a microdocument approach would involve using at least 2 DTDs,
one for module references, and another for "everything else." Each
module reference would be a document instance all by itself (in the
SGML/XML sense), not just a file that's part of something larger (like
the current module sections, which there's no meaningful way to process
individually). To get something like the current Library Reference,
another document (with another DTD) would specify how to put it
together: put this module, then this one, and now that section of
prose; in the next chapter, put .... We could define separate DTDs to
document Python modules, C APIs, and more book- or article-like
sections. Another would be the "glue" that defines a "manual" or
"howto" document.
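To make the "glue" idea concrete, here is a minimal Python sketch. The
element names manual, chapter, prose, and include-module are
hypothetical (no such DTD exists yet), and the module documents are
stubbed as strings rather than read from separate files.

```python
import xml.etree.ElementTree as ET

# Hypothetical microdocuments: each module reference is a complete
# XML document instance on its own.
MODULE_DOCS = {
    "mailbox": "<module-reference><module-info><module>mailbox</module>"
               "</module-info></module-reference>",
    "rfc822": "<module-reference><module-info><module>rfc822</module>"
              "</module-info></module-reference>",
}

# Hypothetical "glue" document: it only says what goes where.
MANUAL = """
<manual title="Library Reference">
  <chapter title="Internet Data Handling">
    <prose>Modules for handling mail and related formats.</prose>
    <include-module name="rfc822"/>
    <include-module name="mailbox"/>
  </chapter>
</manual>
"""

def assemble(manual_source, module_docs):
    """Replace each <include-module> with the parsed module document."""
    manual = ET.fromstring(manual_source)
    # Snapshot the elements first so the tree can be edited safely.
    for parent in list(manual.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag == "include-module":
                module = ET.fromstring(module_docs[child.get("name")])
                parent.remove(child)
                parent.insert(index, module)
    return manual

manual = assemble(MANUAL, MODULE_DOCS)
print([m.text for m in manual.iter("module")])
```

Each module document stays processable by itself; only the glue
document knows about chapters and ordering.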
> <description of things very related to TeEncontreX, I think>
From your explanations and looking at TeEncontreX, I'd describe what
you're doing as "indexing": you're assigning terminology from a
controlled vocabulary to each entry in your document base, and using
that as a retrieval mechanism. I think this is orthogonal to what I'm
talking about. Regardless of a move toward a microdocument approach
or document-centric approach, good indexing is critical to make the
information accessible.
The way you're using it (with lots of small articles) makes it very
microdocument-flavored, aside from lumping all the documents in one
file.
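As a rough illustration of indexing against a controlled vocabulary,
here is a small Python sketch. The vocabulary, the entry names, and the
build_index() helper are all made up for the example; this is not how
TeEncontreX itself is implemented.

```python
from collections import defaultdict

# Hypothetical controlled vocabulary; terms outside it are rejected.
VOCABULARY = {"mail", "file", "parsing", "network"}

def build_index(entries):
    """Map each vocabulary term to the sorted names of entries tagged with it."""
    index = defaultdict(set)
    for name, terms in entries.items():
        unknown = set(terms) - VOCABULARY
        if unknown:
            raise ValueError(f"{name}: terms not in vocabulary: {sorted(unknown)}")
        for term in terms:
            index[term].add(name)
    return {term: sorted(names) for term, names in index.items()}

# Hypothetical document base: entry name -> assigned index terms.
entries = {
    "mailbox": ["mail", "file"],
    "rfc822": ["mail", "parsing"],
    "StringIO": ["file"],
}
index = build_index(entries)
print(index["mail"])
```

The index is independent of how the entries themselves are marked up,
which is why it works with either documentation approach.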
> To put it short: "Lot of work coding _details_". Just a comment:
> Python is **much** better than C++, for example, because you
> have no need to declare every type, every detail; you can even
> have large parts of a Python program broken, parts that a C++
> compiler would mark as erroneous.
I agree. I think things like type annotations should be completely
optional in the documentation. However, I think there's a lot of
value in supporting annotations that say things like "this returns a
file-like object" that can be interpreted by programmer's tools (help
system in an IDE, pylint-style analyzers, etc.). So it should be
possible to add interesting annotations, so a programmer can ask a
tool, "What are all the ways I can get a file object?"
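A tool could answer that question with very little code if the
annotation is there. The sketch below is only an assumption about how
it might look: the function element and the protocol attribute on
return-value are hypothetical markup, and the fragment is invented for
the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical reference fragment; the protocol attribute on
# <return-value> is an assumed, optional annotation.
SOURCE = """
<module-reference>
  <function name="open">
    <return-value protocol="file"/>
  </function>
  <function name="popen">
    <return-value protocol="file"/>
  </function>
  <function name="getcwd">
    <return-value type="string"/>
  </function>
  <function name="listdir"/>
</module-reference>
"""

def providers(root, protocol):
    """Names of functions whose return value is annotated with protocol."""
    found = []
    for function in root.iter("function"):
        result = function.find("return-value")
        if result is not None and result.get("protocol") == protocol:
            found.append(function.get("name"))
    return found

root = ET.fromstring(SOURCE)
print(providers(root, "file"))
```

Functions with no annotation (like listdir here) are simply skipped,
so the markup can stay optional.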
> > To really make it work, a lot of attention
> > would have to be applied to the result of the first-stage conversion
> > to check the accuracy of the results, make the various bits of text
> > actually land in the right place (since everything is pretty much
> > thrown together now), and encode a lot of additional information about
> > types, parameters, exceptions thrown, etc.
>
> More heavy work !
But, as you point out for TeEncontreX, it's linear in the volume of
information you have plus what you want to get out of it.
> The biggest problem I see here is that you get either very good documentation
> (due to the huge amount of work) or you get nothing (the author
> doesn't document).
We get the latter one now! ;(
> It'd be wise to provide several levels of markup, so people
> can mark up little by little, some important things first, and so on...
This is another good reason to make a lot of the markup optional; my
example probably didn't use "maximal" markup, but went a long way toward
it. Let's try adjusting the assumed DTD a little, and cut out a fair
bit of the markup (even if it's useful). The file is attached; here's
the word count:
weyr(.../Doc/lib); wc libmailbox.tex mailbox.xml mailbox-min.xml
      53     251    1938 libmailbox.tex
     159     504    5364 mailbox.xml
     118     370    3936 mailbox-min.xml
Still large, but definitely better. Good enough? I don't know.
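For a sense of scale, the byte counts from the wc output can be
compared directly. This is just arithmetic on the numbers reported
above; the variable names are mine.

```python
# Byte counts taken from the wc output above.
sizes = {"libmailbox.tex": 1938, "mailbox.xml": 5364, "mailbox-min.xml": 3936}

baseline = sizes["libmailbox.tex"]
# Size of each file relative to the LaTeX source.
ratios = {name: round(size / baseline, 2) for name, size in sizes.items()}
# Percentage of bytes saved by the reduced markup.
saving = round(100 * (1 - sizes["mailbox-min.xml"] / sizes["mailbox.xml"]), 1)

print(ratios)
print(saving)
```

So the reduced markup is roughly twice the size of the LaTeX source,
down from a bit under three times, a saving of about a quarter.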
I do expect that at least one tool will emerge that will take a
Python source file and spit out a skeleton documentation file that can
be filled in.
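Such a tool doesn't exist yet, but a minimal sketch is easy to imagine.
The one below uses Python's standard ast module and reuses the element
names from the attached sample; the skeleton() function, the "FILL IN"
placeholder, and the choice to cover only classes and their
constructors are assumptions made for the example.

```python
import ast
import xml.etree.ElementTree as ET

def skeleton(source, module_name):
    """Build a <module-reference> skeleton from Python source text."""
    tree = ast.parse(source)
    root = ET.Element("module-reference")
    info = ET.SubElement(root, "module-info")
    ET.SubElement(info, "module").text = module_name
    ET.SubElement(info, "synopsis").text = "FILL IN"
    for node in tree.body:
        if not isinstance(node, ast.ClassDef):
            continue  # this sketch only handles top-level classes
        classdesc = ET.SubElement(root, "classdesc")
        ET.SubElement(classdesc, "class").text = node.name
        # Use the docstring as a starting point when there is one.
        ET.SubElement(classdesc, "description").text = (
            ast.get_docstring(node) or "FILL IN")
        for item in node.body:
            if isinstance(item, ast.FunctionDef) and item.name == "__init__":
                constructor = ET.SubElement(classdesc, "constructor")
                for arg in item.args.args[1:]:  # skip self
                    ET.SubElement(constructor, "parameter", name=arg.arg)
    return root

SAMPLE = '''
class UnixMailbox:
    """Access a classic Unix-style mailbox."""
    def __init__(self, fp):
        self.fp = fp
'''
print(ET.tostring(skeleton(SAMPLE, "mailbox"), encoding="unicode"))
```

Type and protocol annotations are left for the author to add by hand,
since the source alone doesn't say that fp must be file-like.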
> This is the "TeEncontreX" version of Mailbox, this should
> work if you have AnalizaToo.py:
Cool; I'll run this through as soon as your package downloads again!
;-)
Aha! You didn't test this! ;-)
> Just some comments:
> - Thinking about it, I mentioned the need for an apropos utility
> one year ago. If you realise, this IS the apropos utility!!
Library science types would call this kind of data marking
"indexing".
Saludos, amigo! (Hey, I'm learning Spanish! Cool! ;)
-Fred
--
Fred L. Drake, Jr. <fdrake@acm.org>
Corporation for National Research Initiatives
--FBmNV/Tzqn
Content-Type: text/xml; charset=iso-8859-1
Content-Description: More minimal sample module reference.
Content-Disposition: inline;
filename="mailbox-min.xml"
Content-Transfer-Encoding: 7bit
<?xml version="1.0" encoding="iso-8859-1"?>
<module-reference>
<module-info>
<module>mailbox</module>
<synopsis>Read various mailbox formats.</synopsis>
</module-info>
<overview>
<para>This module defines a number of classes that allow easy and
uniform access to mail messages in a mailbox. Most of the
supported mailbox formats come from the Unix world.</para>
<para>None of the classes defined in this module lock the
mailboxes that are accessed; this needs to be handled by
application code.</para>
</overview>
<protocoldesc>
<protocol>Mailbox</protocol>
<method name="next">
<return-value>
A message object, or <constant>None</constant> if there
aren't any more messages in the mailbox.
</return-value>
</method>
</protocoldesc>
<classdesc>
<class>UnixMailbox</class>
<protocol>Mailbox</protocol>
<description>
<para>Access a classic Unix-style mailbox, where all messages are
contained in a single file and separated by <quote>From name
time</quote> lines.</para>
</description>
<constructor>
<parameter name="fp" protocol="file"/>
<description>
<para>Initialize the mailbox object and point to the first
message in the mailbox.</para>
</description>
</constructor>
</classdesc>
<classdesc>
<class>MmdfMailbox</class>
<protocol>Mailbox</protocol>
<description>
<para>Access an <acronym>MMDF</acronym>-style mailbox, where all
messages are contained in a single file and separated by lines
consisting of four control-A characters.</para>
</description>
<constructor>
<parameter name="fp" protocol="file"/>
<description>
<para>Initialize the mailbox object and point to the first
message in the mailbox.</para>
</description>
</constructor>
</classdesc>
<classdesc>
<class>MHMailbox</class>
<protocol>Mailbox</protocol>
<description>
<para>Access an <acronym>MH</acronym> mailbox, a directory with
each message in a separate file with a numeric name. Messages
that are added to the mailbox after the instance is created
are not accessible; a new instance is needed to access newly
added messages.</para>
</description>
<constructor>
<parameter name="dirname" type="string"/>
<description>
<para>Initialize the list of messages that can be loaded from
the mailbox.</para>
</description>
</constructor>
</classdesc>
<classdesc>
<class>Maildir</class>
<protocol>Mailbox</protocol>
<description>
<para>Access a Qmail mail directory. All new and current mail
for the mailbox is made available. Messages that are added to
the mailbox after the instance is created are not accessible;
a new instance is needed to access newly added messages.
</para>
</description>
<constructor>
<parameter name="dirname" type="string"/>
<description>
<para>The <param>dirname</param> parameter points to the
mailbox directory.</para>
</description>
</constructor>
</classdesc>
<classdesc>
<class>BabylMailbox</class>
<protocol>Mailbox</protocol>
<description>
<para>Access a Babyl mailbox, which is similar to an
<acronym>MMDF</acronym> mailbox. Mail messages start with a
line containing only <literal>'*** EOOH ***'</literal> and end
with a line containing only <literal>'\037\014'</literal>.
</para>
</description>
<constructor>
<parameter name="fp" protocol="file"/>
<description>
<para>Initialize the mailbox object and point to the first
message in the mailbox.</para>
</description>
</constructor>
</classdesc>
</module-reference>
--FBmNV/Tzqn--