[Doc-SIG] Approaches to structuring module documentation

Fred L. Drake, Jr. fdrake@acm.org
Mon, 22 Nov 1999 13:37:08 -0500 (EST)


Paul Prescod writes:
 >  * content
 >  * structure of the content
 >  * presentation

  A reminder of what the axes really are is always nice!

 > Okay, they are all related but they are still different. If somebody
 > can't find something, I would tend to try to fix that in the
 > presentation first, and then in the content and finally in the structure
 > if all else fails. Let's not jump right to the structure. Consider this

 > Also, consider our choices as a graph with two axes:
 > 
 >  * specificity of markup
 >  * granularity ("library", "package", "module", "class") of file
 > 
 > If you think of it that way, then you realize that you could have a very
 > generic microdocument architecture (one HTML class per symbol) or a very
 > specific (PyBook) but ungranular (the WHOLE book) DTD. And of course the
 > other two options are also available.

  I'd describe the current markup as being highly specific, and I
think that makes authoring much easier in many ways (there's a limit
to what needs to be typed to mark something in an interesting way).
However, there are a bunch of marks that can be reasonably made, and
there are a few people out there who think documentation isn't
intrinsically interesting(!); this means they don't read the
documentation for the markup (which is incomplete anyway), and there's 
some resistance to having to type much to "mark" something.  This
leads me to think that less marking would be nice.

I said:
 >   This approach has the advantage of matching the current structure of
 > the documentation.  The conversion isn't terribly difficult or even
 > time consuming given the state of the things in Doc/tools/sgmlconv/ in
[...]

Paul Prescod writes:
 > I believe that this advantage strongly overwhelms any benefits of going
 > to a more theoretically pure markup. It's taken roughly a year to get
 > our documents clean enough that they can even move to XML or something

  I'm not convinced.  If what we end up with is little different from
what we have, I don't see why we need to convert at all.  There are
plenty of people who don't *like* LaTeX syntax, but those people won't 
be any happier with XML; I'd expect them to be less happy because
there are more characters in the syntax.  (On the other hand, the syntax
is more clearly defined and involves fewer special characters, which
is one of the advantages Guido sees with XML or even a carefully
chosen SGML declaration.)

 > similar. How long would it take to completely reorganize them? You,
 > Fred, have a job that only partially includes documentation maintenance
 > and the rest of us are not nearly so interested in re-writing DOCS as we
 > are in re-writing CODE. I fear that a move to Microdocuments would never
 > happen.

 > That's a killer argument.

  That's been my biggest concern about it all.  When working with
this, I'm often in a quandary over how to get more detail out of the
documentation without ending up being the author of the whole ball of
wax.

 > I propose an incremental approach. Let's get to PyBook XML and THEN
 > re-evaluate PyBook in terms of microdocument. 

  Does an incremental approach really make sense?  I suspect we want
to avoid giving module authors a new set of tools to do (essentially)
the same thing too often, regardless of the merits of the new tools.
(Here "tools" can include things like markup vocabularies and syntax.)
Each such change increases resistance from potential authors.

 > Here's an important issue: Perl and Java have achieved a relatively high
 > level of module documentation conformity by putting the microdocument
 > *in the code*. This appeals strongly to basic human nature. One file

  After talking with Guido about these issues last week, I've been
looking into this more.  I've been discussing the benefits & failings
of POD with Greg Ward (of distutils fame), who was a Perl programmer
well before he learned Python.  Needless to say, he's a *huge* fan of
inline documentation (and lots of it).
  So I've been playing with a little tool to create documentation from 
a Python parse tree.  As with many things, it's been done before, but
with limited success (docco, gendoc, pythondoc).  I suspect the
success rate is probably tightly tied to it being declared "the right
way" by Guido.
  The script isn't nearly ready, but I'm aiming for being able to
generate documentation one module or one package at a time with at
least reasonable levels of internal linking among HTML files (other
formats can wait; I want a hypertext format first to make sure I get
the linking right).  Once I have this, I should be able to construct a 
system that allows the docs to be created using either some XML/SGML
language in a separate file or this POD-like/structured text inside
the source file.  Building a reference manual from those inputs would
be very similar to what we have now, and is more a matter of gluing
pieces together.
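
  To make the idea concrete, here is a minimal sketch of pulling a
documentation outline out of a parse tree.  It is not the script
described above; it assumes today's standard "ast" module, and the
names outline(), _summary() and _walk() are hypothetical, chosen only
for illustration.  It collects the first docstring line of a module
and of each class, method and function in it, which is roughly the
raw material a linked HTML page per module would need.

```python
import ast


def _summary(node):
    """Return the first line of a node's docstring, or '' if it has none."""
    doc = ast.get_docstring(node)
    return doc.splitlines()[0] if doc else ""


def _walk(node, prefix, entries):
    """Append entries for classes and functions defined directly in node."""
    for child in node.body:
        if isinstance(child, ast.ClassDef):
            qualified = prefix + "." + child.name
            entries.append((qualified, "class", _summary(child)))
            _walk(child, qualified, entries)  # descend to pick up methods
        elif isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
            entries.append((prefix + "." + child.name, "function",
                            _summary(child)))


def outline(source, name="<module>"):
    """Return (qualified name, kind, summary) tuples for module source text."""
    tree = ast.parse(source)
    entries = [(name, "module", _summary(tree))]
    _walk(tree, name, entries)
    return entries
```

  Each tuple carries a dotted name, so cross-references between
generated pages could be keyed on that name; the actual linking
scheme is left open here.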
  Another advantage of using inline documentation in the sources is
that the source can be used as part of the markup; a lot of
information is already in the parse tree.  Using that information to
augment the explicit documentation may prove to be very valuable,
especially for people interested in including lots of specific details 
in the documentation.  The programmer should be able to declare that
this not be done, preferably at both global and local levels within a
package or module, since there are many situations in which the
specific structure of the code is downright misleading in terms of
describing the public interface.
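
  As a rough illustration of both halves of that idea, the sketch
below reads argument lists straight from the parse tree and lets the
programmer turn that off.  Everything here is an assumption made for
the example: it uses today's standard "ast" module, and the marker
names __autodoc__ (global switch) and __autodoc_skip__ (local, per
name) are hypothetical conventions, not anything that exists or has
been agreed on.

```python
import ast


def _literal_assignment(tree, name):
    """Return the literal assigned to `name` at module level, or None."""
    for node in tree.body:
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name) and target.id == name:
                    # Raises ValueError if the value is not a plain literal.
                    return ast.literal_eval(node.value)
    return None


def signatures(source):
    """Map each top-level function name to its positional argument names.

    Hypothetical opt-out markers read from the module itself:
      __autodoc__ = False          disables extraction for the whole module
      __autodoc_skip__ = ["name"]  disables it for the listed functions
    """
    tree = ast.parse(source)
    if _literal_assignment(tree, "__autodoc__") is False:
        return {}
    skip = set(_literal_assignment(tree, "__autodoc_skip__") or [])
    result = {}
    for node in tree.body:
        if isinstance(node, ast.FunctionDef) and node.name not in skip:
            result[node.name] = [arg.arg for arg in node.args.args]
    return result
```

  Keeping the switches in the source file means the author of the
module, who knows where the code structure is misleading, is the one
who decides what gets extracted.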


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives