[Python-Dev] Pydoc Improvements / Rewrite

Fri Jan 5 04:54:23 CET 2007

Ka-Ping Yee wrote:
> Hi Ron and Laurent,
> 
> I welcome attempts to improve pydoc (especially since I don't have
> much time to work on improving it myself).  I definitely agree that
> moving to CSS is long overdue, though I would like some input on
> the style of the produced pages.

Additional input would be good.  The HTML output I used is nearly pure nested 
definition lists, with CSS styling to set the fonts, borders, and indents.  It 
was a bit tricky in some places to keep it looking like the current pydoc pages. 
My mental target was something that would both look good printed and fit in 
with Python's current web site design, without changing the look of the pydoc 
pages too much.

Changing the CSS file to produce differently styled pages should not be that 
difficult.  A little experimenting would help to find where additional style 
tags may be needed in the HTML code.
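
To make the idea concrete, here is a minimal sketch of nested definition lists 
styled by a separate stylesheet.  It is not the code from the actual patch: the 
names doc_list and summary, the "pydoc" class name, and the CSS rules are all 
made up for illustration, and it uses the html module from current Python.

```python
import inspect
from html import escape

# Hypothetical stylesheet: all presentation lives here, not in the markup.
CSS = """\
dl.pydoc { font-family: sans-serif; border: 1px solid #999; padding: 0.5em; }
dl.pydoc dd { margin-left: 2em; }
"""

def summary(obj):
    """Return the first line of obj's docstring, or an empty string."""
    doc = inspect.getdoc(obj) or ""
    return doc.split("\n", 1)[0]

def doc_list(name, obj, depth=0):
    """Return a <dl> fragment for obj; a class gets one nested <dl> per public method."""
    parts = ['<dl class="pydoc">',
             '<dt>%s</dt>' % escape(name),
             '<dd>%s' % escape(summary(obj))]
    if inspect.isclass(obj) and depth == 0:
        for attr, member in inspect.getmembers(obj, inspect.isfunction):
            if not attr.startswith("_"):
                parts.append(doc_list(attr, member, depth + 1))
    parts.append('</dd>')
    parts.append('</dl>')
    return "\n".join(parts)
```

Restyling the pages then means editing CSS only; the generator changes only 
when a new class attribute is needed as a hook for the stylesheet.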

> It's probably a good idea to explain how pydoc got to be the way
> that it is.  The module boundary between inspect and pydoc is a
> pretty clear one, intended to isolate pydoc from future changes
> to Python's introspection features (such as attributes on internal
> types like frames and functions).
> 
> On the other hand, I've often seen the question of why pydoc does
> both text and HTML generation instead of generating some intermediate
> data structure from which both kinds of output are produced.  The
> answer is: I tried it.  The result turned out to be longer than
> I expected and needlessly more complicated than what we have now.
> It may be that a better job could have been done, but I think there
> is a rational basis for why it turned out that way.

Yes, I found it was a trade-off of one type of complexity for another.  And I 
didn't like importing something like xmltree that will probably go through more 
changes.

> The Python objects themselves already are a data structure containing
> all of the information we need.  I discovered that translating this
> data structure into another data structure and then producing text
> or HTML was more work than simply producing text or HTML.  With CSS,
> the last step gets even easier and so the intermediate stage becomes
> even less necessary.  Also, the intermediate step required me to
> essentially invent an API, and I decided that I trusted the stability
> of Python's API more than that of some API I invented just for this.
>
> This is not to say that text and HTML generation can't be separated;
> it's just a caution against attempting to overgeneralize by creating
> an intermediate format.  I'm glad you backed away from XML (or I'd
> have warned you that processing the XML would be a lot of extra work).
>
> The inspect module was intended to pull out as much as possible of
> the extraction functionality that's shared by the text and HTML
> documentation generators.  But pydoc is still big.  At the time I was
> proposing pydoc for addition to the standard library, I didn't want
> to pollute the top-level module namespace with too many names, so I
> tried hard to minimize the number of modules.  And of course it has
> grown since then with bits of new functionality and support for new
> language features in Python.

And it will continue to grow as Python does.  Hopefully we can make the process 
of supporting new language features easier.
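
The division of labour described above, with inspect doing the extraction and 
each formatter producing its output straight from the Python object, can be 
sketched roughly as follows.  This is only an illustration under assumptions: 
render_text and render_html are hypothetical names, not pydoc functions, and 
inspect.signature is from current Python rather than the inspect module of 
2007.

```python
import inspect
from html import escape

def render_text(func):
    """Format a function for the console, directly from the object."""
    sig = inspect.signature(func)
    doc = inspect.getdoc(func) or "(no documentation)"
    # Indent the docstring under the signature line.
    return "%s%s\n    %s" % (func.__name__, sig, doc.replace("\n", "\n    "))

def render_html(func):
    """Format the same function as a definition list, with no intermediate format."""
    sig = inspect.signature(func)
    doc = inspect.getdoc(func) or "(no documentation)"
    return "<dl><dt>%s%s</dt><dd>%s</dd></dl>" % (
        escape(func.__name__), escape(str(sig)), escape(doc))
```

Both formatters share only the inspect calls, so supporting a new language 
feature mostly means teaching inspect about it once and then adjusting each 
formatter's output.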

> But now if a package is being considered, it makes sense to split
> out some of the pieces (as you have done), such as the web server,
> the search function, and the interactive interpreter help prompt.
> It may even enable pydoc to provide search from the interactive help
> prompt, which would be a nice feature!  

I think that could be done without too much trouble.  It only takes adding a 
new all-caps command, such as "FIND <something>" or "SEARCH <something>", 
alongside KEYWORDS and TOPICS.
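
As a rough sketch of how such a command could be recognised at the help 
prompt: the dispatch and search_synopses functions below are hypothetical, as 
is the small dictionary of synopses standing in for a real module index.  In 
pydoc itself the search could presumably be delegated to the existing 
pydoc.apropos function instead.

```python
def search_synopses(term, synopses):
    """Return sorted names whose name or one-line synopsis contains term."""
    term = term.lower()
    return sorted(name for name, text in synopses.items()
                  if term in name.lower() or term in text.lower())

def dispatch(request, synopses):
    """Classify one line typed at the help prompt.

    Returns a (kind, payload) pair instead of printing, so it is easy to test.
    """
    request = request.strip()
    words = request.split(None, 1)
    if len(words) == 2 and words[0] in ("FIND", "SEARCH"):
        return ("search", search_synopses(words[1], synopses))
    if request in ("KEYWORDS", "TOPICS"):
        return ("listing", request)
    # Anything else is treated as the name of an object to document.
    return ("help", request)
```

For example, with synopses such as {"json": "JSON encoder and decoder", 
"pickle": "Create portable serialized representations"}, the request 
"FIND json" would return the matching module names, while "TOPICS" and plain 
names fall through to the existing behaviour.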

> The package could contain
> several modules for ease of maintenance, while still providing a
> single, convenient command for running pydoc from the Unix prompt.

I was thinking of two convenient entry points: one for text and the interactive 
console, and one for HTML and the web browser interface (pyhelp and pydoc, 
respectively).

There is also the possibility of splitting it into two much smaller packages. 
One would be for the command line and interactive help console, with no HTML or 
server code.  Since it is what Python's console uses, keeping it small would 
make it easier to control and maintain.  The other package (or module) would be 
an example of how to extend the console help package or build an application on 
top of it, in this case an HTML formatter and help browser.

Cheers,
    Ron