[Doc-SIG] Approaches to structuring module documentation

Fred L. Drake, Jr. fdrake@acm.org
Tue, 23 Nov 1999 16:03:33 -0500 (EST)

Manuel Gutierrez Algaba writes:
 > Well, if that's what we want, why don't we just do it? We have 
 > only to MARK those small pieces of info. Fred calls this indexing.

  Please don't think I'm the only one!  I just applied the name to the 
situation.  Indexing is a good thing.

 > Many people, me included, won't accept a method of doc that requires
 > a syntax or the loss of freedom when doing things ( this includes
 > docstrings ). But it's acceptable to write python code, and then 
 > put:
 > "\indexexamplelambda" or 
 > "\indexsocket"
 > because that marking would be useful even for the programmer himself,
 > even he eventually got accustomed to doc things just writing indexes and
 > ... but that'd be the second and third step. Anyway, most people
 > won't be angry against such a simple measure. 

  The "simple" part of this isn't the problem, though it does turn
into one.  This approach hinges on what's called a "controlled
vocabulary" (another one of those good Information Science words!).
Without some agreement on the terms that enter the index, many related 
things are not similarly indexed.  Achieving consistency in the case
of many author/indexers is very difficult without either a
well-defined controlled vocabulary or strong editorial oversight.  The
latter is (by far) easier to implement, and is what I've tried to
provide for the standard documentation.  (Compare this to a controlled 
vocabulary approach; think "Library of Congress Subject Headings," or
other large cataloging systems used in libraries.  Ever wonder why
programming language books appear in at least a couple of different
places in the computer science section of a good university library?)
Editorial control is tedious and can become difficult; but controlled
vocabularies are the child of committees!  (Which doesn't mean they're 
not useful, just that there's an *enormous* overhead to using them.)
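
  As a rough illustration of what a controlled vocabulary buys you,
here is a minimal sketch that checks "\index..." markers of the kind
proposed above against an agreed list of terms.  The vocabulary table,
the marker pattern, and the function name are all hypothetical; nothing
like this exists in the documentation tools.

```python
import re

# Hypothetical controlled vocabulary: each accepted spelling maps to the
# single preferred heading that should appear in the index.
CONTROLLED_VOCABULARY = {
    "socket": "sockets",
    "sockets": "sockets",
    "lambda": "lambda expressions",
    "examplelambda": "lambda expressions",
}

# Assumed marker syntax: a backslash, the word "index", then the raw term.
MARKER = re.compile(r"\\index([A-Za-z]+)")


def normalize_markers(text):
    """Split the markers found in text into accepted and rejected terms.

    Accepted markers are returned as their preferred heading; rejected
    ones are returned verbatim so an editor can review them.
    """
    accepted, rejected = [], []
    for raw in MARKER.findall(text):
        preferred = CONTROLLED_VOCABULARY.get(raw.lower())
        if preferred is None:
            rejected.append(raw)
        else:
            accepted.append(preferred)
    return accepted, rejected
```

  The rejected list is where the overhead shows up: somebody (an editor
or a committee) has to decide what to do with every term that isn't in
the table.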

 > If Fred thinks that providing a "smart" way of showing how many
 > modules, functions and exceptions are in a piece of python code is 
 > enough,
 > I think that with that we don't go much further.

  Actually, I don't think that's enough, or that it solves that
particular problem.
  The purpose of extracting information from the Python sources is not 
so much to provide new information (though it may) as to ease the
burden on those authoring documentation.  (Which does *not* mean me!)
I'd like to see newly released modules from independent developers be
documented in a consistent way; making this easy is a necessity for it 
to happen.
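
  To make the extraction idea concrete, here is a minimal sketch that
pulls a skeleton outline out of a module's source.  It assumes the
`ast` module from the modern standard library, and the `outline` name
and the shape of its result are made up for illustration; it is not a
description of any existing tool.

```python
import ast


def outline(source):
    """Return (kind, name, summary) for each top-level def or class.

    The summary is the first line of the docstring, or an empty string
    when there is no docstring for an author to start from.
    """
    tree = ast.parse(source)
    entries = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            kind = "class" if isinstance(node, ast.ClassDef) else "function"
            doc = ast.get_docstring(node) or ""
            summary = doc.splitlines()[0] if doc else ""
            entries.append((kind, node.name, summary))
    return entries
```

  An outline like this gives the author a consistent starting point to
fill in; it does not write the documentation or build the index.
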
  There are still several things which have to be done, including
index building.  One of the catches of index building is that building 
a really useful index (not just a comprehensive one) is fundamentally
a hard thing to do.  I recently spoke to someone who once managed half 
of the indexing team at the Encyclopedia Britannica about this, and
found that it's not at all clear what actually needs to be done to
improve the situation.  A *large* index, especially when presented
"book style," is not particularly desirable.

 > No traditional parser-driven or XML-driven or javadoc-driven approach
 > will bear the richness, complexity, diversity of origin/source,
 >  state,... of so many info. 

  I agree:  No automatic method will replace good human indexing.

 > BTW: Documenting the python library would be the minor  and less
 > interesting thing by far. 

  I think the current library documentation is actually pretty good;
I'm interested in improving both the content and accessibility (via
indexing or any other approach).  The Doc-SIG has long had the mandate 
of moving Python out of the LaTeX prehistoric period into the 21st
century, however, which is one of the motivations for the work done to
move from LaTeX to SGML/XML/whatever-comes-next.  I know I've been
beating on Guido about this for 4 1/2 years!

 > We have a Golden Chance here, we can have the best info system
 > of the Internet. Don't try to be The Big Hero, the Big guy who finds 
 > the smart solution. Here the Only One Hero is Information, the 
 > Information that should be at last available !

  Perhaps we need another defined task for the SIG:  locate all the
resources that should be part of this all-encompassing Python
Documentation Web?  That's no small task!  Perhaps you'd like to start 
a list of the documents you think should be included in the indexing
effort, including current links to them.  A Web page that simply lists 
them would be a good start.

 > Death or victory ! 

  Don't do that!  While alive you can work to improve things, once
dead... well, *I've* never met anyone who came back.


Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives