[Python-Dev] Pydoc Improvements / Rewrite

Talin talin at acm.org
Fri Jan 5 08:49:05 CET 2007


Larry Hastings wrote:
> Ron Adam wrote:
>> Thanks for the link. PEP 287 looks to be fairly general in that it 
>> expresses a general desire rather than a specification.
> I thought it was pretty specific.  I'd summarize PEP 287 by quoting 
> entry #1 from its "goals of this PEP" section:
> 
>    * To establish reStructuredText as a standard structured plaintext
>      format for docstrings (inline documentation of Python modules and
>      packages), PEPs, README-type files and other standalone documents.
> 
> 
> Talin wrote:
>> Rather than fixing on a standard markup, I would like to see support 
>> for a __markup__ module variable which specifies the markup 
>> language used in that module. Doc processors could inspect 
>> that variable and then load the appropriate markup translator.
> I guess I'll go for the whole-hog +1.0 here.  I was going to say +0.8, 
> citing "There should be one---and preferably only one---obvious way to 
> do it.".  But I can see organizations desiring something besides ReST, 
> like if they had already invested in their own internal 
> standardized markup language and wanted to use that.
> 
> This makes the future clear; the default __markup__ in 2.6 would be 
> "plain", so that all the existing docstrings work unmodified. At which 
> point PEP 287 becomes "write a ReST driver for the new pydoc".  
> Continuing my dreaming here, Python 3000 flips the switch so that the 
> default __markup__ is "ReST", and the docstrings that ship with Python 
> are touched up to match---or set explicitly to "plain" if some strange 
> necessity required it.
> 
> (And when do you unveil DocLobster?)

Well, I'd be more interested in working on it once there's something to 
plug it into - I didn't really want to write a whole pydoc replacement, 
just a markup transformer.

One issue that needs to be worked out, however, is the division of 
responsibility between markup processor and output formatter. Does a 
__markup__ plugin do both jobs, or does it just do parsing, and leave 
the formatting of output to the appropriate HTML / text output module? 
How does the HTML output module know how to handle non-standard metadata?
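
To make the first option concrete, here is a minimal sketch of a 
"parsing only" plugin lookup. Everything in it is an assumption made for 
illustration: the PARSERS registry, the register() decorator and 
parse_module_doc() are hypothetical names, not part of pydoc or of any 
existing proposal. The only idea taken from the thread is that a doc 
processor reads __markup__ and falls back to "plain" when it is absent.

```python
import types

# Hypothetical registry: maps a __markup__ name to a parser callable.
PARSERS = {}


def register(name):
    """Decorator that registers a parser under a markup name."""
    def decorator(func):
        PARSERS[name] = func
        return func
    return decorator


@register("plain")
def parse_plain(docstring):
    # Plain text has no structure: return one paragraph node.
    return [("paragraph", docstring.strip())]


def parse_module_doc(module):
    """Pick a parser from the module's __markup__ and parse its docstring."""
    markup = getattr(module, "__markup__", "plain")
    # Unknown markup names fall back to plain text rather than failing.
    parser = PARSERS.get(markup, PARSERS["plain"])
    return parser(module.__doc__ or "")


# Usage: a module without __markup__ is treated as plain text.
demo = types.ModuleType("demo", "Just some text.")
nodes = parse_module_doc(demo)
```

In this sketch the plugin only parses; it says nothing about output, 
which leaves the formatting question open.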

Let me give an example: Suppose you have a simple markup language that 
has various section tags, such as "Author", "See Also", etc.:

    """
    Description:
       A long description of this thing whatever it is.

    Parameters:
       fparam - the first parameter
       sparam - the second parameter

    Raises:
       ArgumentError - when invalid arguments are passed.

    Author: Someone

    See Also:
       PyDoc
       ReST
    """

So the parser understands these various section headings - how does it 
tell the HTML output module that 'Author' is a section heading? 
Moreover, in the case of "Parameters" and "Raises", the content of 
the section is parsed as a table (parameter, description) which is 
stored as a list of tuples, whereas the content of the "Description" 
section is just a long string.
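
A rough sketch of such a parser for the example above, under some 
assumptions of mine: a heading is any unindented line containing a 
colon, table rows are split on " - ", and only "Parameters" and "Raises" 
are treated as tables. The names parse_sections and TABLE_SECTIONS are 
hypothetical.

```python
import inspect

# Assumption: these headings hold (name, description) rows; all other
# sections are treated as free text.
TABLE_SECTIONS = {"Parameters", "Raises"}

EXAMPLE = """
Description:
   A long description of this thing whatever it is.

Parameters:
   fparam - the first parameter
   sparam - the second parameter

Raises:
   ArgumentError - when invalid arguments are passed.

Author: Someone

See Also:
   PyDoc
   ReST
"""


def parse_sections(docstring):
    """Return a list of (heading, content) pairs in source order."""
    raw = []  # (heading, [content lines])
    for line in inspect.cleandoc(docstring).splitlines():
        if not line.strip():
            continue  # blank lines only separate sections
        if not line[0].isspace() and ":" in line:
            # Unindented "Heading: optional text" starts a new section.
            heading, _, rest = line.partition(":")
            raw.append((heading.strip(), [rest.strip()] if rest.strip() else []))
        elif raw:
            # Indented line: belongs to the most recent section.
            raw[-1][1].append(line.strip())

    sections = []
    for heading, lines in raw:
        if heading in TABLE_SECTIONS:
            # Table content becomes a list of (name, description) tuples.
            rows = []
            for entry in lines:
                key, _, desc = entry.partition(" - ")
                rows.append((key.strip(), desc.strip()))
            sections.append((heading, rows))
        else:
            # Free-text content becomes a single string.
            sections.append((heading, " ".join(lines)))
    return sections


sections = parse_sections(EXAMPLE)
```

The result mixes two content types (strings and lists of tuples), which 
is exactly what an output module would have to be told about.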

I guess the markup processor has to deliver some kind of DOM tree, which 
can be rendered either into text or into HTML. CSS can take over from 
that point on.
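
As a sketch of that split, the two renderers below walk the same tree 
and know nothing about the markup it came from. The tree shape (a list 
of (heading, content) pairs, where content is either a string or a list 
of tuples) and the function names are assumptions for illustration, not 
a proposed API. The HTML renderer derives a CSS class from each heading, 
so non-standard sections can be styled without the renderer knowing 
about them in advance.

```python
from html import escape


def render_text(tree):
    """Render the tree as indented plain text."""
    out = []
    for heading, content in tree:
        out.append(heading + ":")
        if isinstance(content, list):
            # Table section: one "name - description" row per line.
            for key, desc in content:
                out.append("   %s - %s" % (key, desc))
        else:
            out.append("   " + content)
    return "\n".join(out)


def render_html(tree):
    """Render the tree as HTML, leaving presentation to CSS classes."""
    out = []
    for heading, content in tree:
        # Hypothetical convention: class name derived from the heading.
        css = escape(heading.lower().replace(" ", "-"))
        out.append('<div class="section %s">' % css)
        out.append("<h3>%s</h3>" % escape(heading))
        if isinstance(content, list):
            out.append("<dl>")
            for key, desc in content:
                out.append("<dt>%s</dt><dd>%s</dd>" % (escape(key), escape(desc)))
            out.append("</dl>")
        else:
            out.append("<p>%s</p>" % escape(content))
        out.append("</div>")
    return "\n".join(out)


tree = [
    ("Author", "Someone"),
    ("Raises", [("ArgumentError", "when <bad> arguments are passed.")]),
]
text_output = render_text(tree)
html_output = render_html(tree)
```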

-- Talin

