[Python-Dev] pydoc improvements
Noam Raphael
noamr at myrealbox.com
Sat Jun 12 20:48:11 EDT 2004
Hello,
I've made a few improvements to the pydoc module. I've tried to deliver
them to Ka-Ping Yee, but it seems that the e-mail wasn't transferred, so
here I present them to python-dev, including Ping. I hope they will be
reviewed and, if found to be appropriate, accepted.
Pretty-Printed Source Code
==========================
Many times, in order to understand exactly what a function does, I'm
interested in its source code. This is especially true for Python,
because of its readability. So I made pydoc produce, along with the HTML
documentation of a module, a highlighted version of the module's source code,
interlinked with its documentation.
I added to the HTMLDoc class a new method, sourcemodule (you may think
of a better name). It gets a filename and a few more parameters, and
returns the pretty-printed source code, in HTML format. Class and method
definitions are links to the description of them in the generated
module documentation. I also made class and function definitions in the
module documentation link to their definition in the source code.
I've changed the second link on the right-top of a module
documentation to point to the HTML source instead of the original file,
because I think usually people are interested in seeing the code in
their browser, not in downloading it. In case they are interested in
downloading it, I added a link called "download <filename>" to the HTML
source, so people can click it and hopefully get the pure source code.
The file name of the HTML source code is "<module name>-source.html".
It is now generated automatically along with the "<module name>.html",
and is served by the HTTP server.
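
As a rough illustration of the idea (this is not the code from the attached
patch), the sketch below turns source text into an HTML <pre> block in which
each class or function name is both an anchor and a link back to the module
documentation. The helper name source_to_html, the anchor scheme and the use
of modern Python 3 are assumptions made for the example; a real version would
also need qualified anchors for methods and proper token-based highlighting.

```python
import html
import re

# Matches the start of a class or function definition line.
DEF_RE = re.compile(r"^(\s*)(def|class)(\s+)(\w+)")


def source_to_html(source, doc_page):
    """Return an HTML <pre> block for `source` (hypothetical helper).

    Each definition name becomes an anchor and links to `doc_page#name`.
    """
    rows = []
    for line in source.splitlines():
        m = DEF_RE.match(line)
        if m:
            indent, kw, gap, name = m.groups()
            rest = html.escape(line[m.end():])
            rows.append(
                f'{indent}<b>{kw}</b>{gap}'
                f'<a name="{name}" href="{doc_page}#{name}">{name}</a>{rest}'
            )
        else:
            # Ordinary lines are only escaped so they display verbatim.
            rows.append(html.escape(line))
    return "<pre>\n" + "\n".join(rows) + "\n</pre>"
```

The output for a module foo would be written to "foo-source.html", next to
"foo.html".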
Public and Private
==================
Public attributes are defined by the current pydoc version as
attributes whose name doesn't start with an underscore, or, for module
attributes where __all__ is defined, attributes which appear in __all__.
Currently pydoc doesn't show private objects at all. This is usually
what people want, but sometimes, when they are interested in what the
module really does (for example when they want to define a subclass),
they may want to view the documentation for private objects too. So I
added a link to the module documentation header, which alternates
between "Show Private" and "Hide Private", and shows or hides private
attributes - both module attributes and class methods. To accomplish
this, I defined two alternative style sheets: one which shows private
attributes and one which doesn't. The "Show/Hide Private" link uses
javascript to switch between these two styles. (The style can also be
selected by a menu in the Mozilla browser.) I used a cookie to save the
user's preferred style - if I hadn't, the style would've returned to the
default every time a new documentation page is opened. The produced HTML
is standards-compliant, to my knowledge, and works with both Mozilla and
IE. The style-switching link doesn't work in Konqueror, so anyone using it
won't be able to see private attributes - not too bad.
Many modules contain objects imported from other modules. Usually
these are for the module's own use, and the module's user isn't
interested in them. These are not even private - I don't think anyone
will want to see their documentation. In the current version of pydoc,
these are filtered by calling inspect.getmodule. This raises two
problems: the first is that inspect.getmodule can't deduce the module of
built-in objects, so the module documentation may be filled with such
"garbage" (it actually happened to me, before I started using my version of
pydoc). The second problem is that sometimes a module wants attributes
of other modules to be considered as its own attributes too (for
example, the tokenize module exports the constants of the token module).
My solution is this: attributes which are not public attributes of a
module, and are public attributes of another imported module, are never
displayed. They are probably the result of an internal "from module
import something" statement.
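
The rule can be sketched roughly as follows. This is an illustration rather
than the patch itself: the function names are hypothetical, "another imported
module" is taken to mean any module in sys.modules, and the attribute is
required to be the very same object in both modules. All of these are
assumptions, and the code is written for modern Python 3.

```python
import sys
import types


def public_names(module):
    """Names a module exposes: __all__ if defined, else names that do not
    start with an underscore."""
    if hasattr(module, "__all__"):
        return set(module.__all__)
    return {n for n in vars(module) if not n.startswith("_")}


def is_foreign(module, name):
    """True if `name` is not public in `module` but is a public attribute
    (the same object) of some other imported module."""
    if name in public_names(module):
        return False
    value = getattr(module, name)
    for other in list(sys.modules.values()):
        if other is module or not isinstance(other, types.ModuleType):
            continue
        try:
            if name in public_names(other) and getattr(other, name, None) is value:
                return True
        except Exception:
            # Some modules have unusual __all__ values; skip them.
            continue
    return False
```

Attributes for which is_foreign() returns True would never be displayed,
even when private attributes are shown.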
The attached patch is against version 1.91, because I handle __all__
a bit differently from version 1.92.
More flexible HTML documentation of a complete directory
========================================================
Here's my actual situation: There's a package maintenance system,
where you first create files in your home directory and then use the
system to install them in a common place. For example, if a package is
called foo, and includes the modules ya.py and chsia.py, the modules
will be named /home/noam/foo/lib/python/{ya,chsia}.py. I want to
automatically document them, so that the HTML files will sit in
/home/noam/foo/doc. After installing the package, the python modules
will be copied to /usr/local/lib/python/, and the documentation will be
copied to /usr/local/doc/foo.
I want to be able to type one command to generate all the
documentation, but I don't want to waste time if I only need the
documentation re-generated for one module. So I made "pydoc -w <dir>"
produce documentation only if the target file was modified before the
module was. (pydoc -w <filename> still writes the documentation even if
it seems up to date.)
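
A minimal sketch of such an up-to-date check, assuming it is based on file
modification times (the helper name needs_update is hypothetical, and a
missing HTML file is treated as out of date):

```python
import os


def needs_update(module_path, html_path):
    """True if html_path is missing or was modified before module_path."""
    try:
        html_mtime = os.path.getmtime(html_path)
    except OSError:
        # No HTML file yet, so it has to be generated.
        return True
    return html_mtime < os.path.getmtime(module_path)
```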
It is perfectly legitimate to write HTML documentation for a module
on your own. So I made pydoc never overwrite HTML files which weren't
created by pydoc. To accomplish this, I added the tag <meta
name="generator" content="pydoc"> to the head of all HTML files produced
by pydoc. A warning is printed if a module was modified after a manually
created documentation was modified.
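
Detecting the marker could look something like the sketch below. The regular
expression and the function name are assumptions for illustration; only the
tag itself comes from the description above.

```python
import re

# The generator tag described above; whitespace (including line breaks)
# between attributes is tolerated.
GENERATOR_RE = re.compile(
    r'<meta\s+name="generator"\s+content="pydoc"\s*/?>', re.IGNORECASE
)


def written_by_pydoc(html_text):
    """True if the HTML text carries the pydoc generator tag."""
    return GENERATOR_RE.search(html_text) is not None
```

An existing file for which written_by_pydoc() returns False would be left
alone, with only a warning if the module is newer than the file.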
I wanted to be able to run "pydoc -w" from any directory, not just
from /home/noam/foo/doc, so I added an option "-o <dir>", to specify the
output directory. (If it isn't specified, the current directory is used.)
I wanted the documentation to refer to the installed file
(/usr/local/lib/python/foo.py), not to the file in my home directory. So
I added an option "-d <destdir>", to specify the directory in which the
files will reside after being installed.
To summarise, the command which I should now run to produce the
documentation is
pydoc -w -o doc/ -d /usr/local/ lib/python/
(running it from /home/noam/foo).
The -d and -o options are already available in compileall.py, for
exactly the same reasons, I think.
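
One way to read the example command is sketched below: the HTML file goes to
the -o directory, and the path shown in the documentation is the -d directory
joined with the relative source path. This reading is inferred from the
example only; the function name plan_paths is hypothetical, and the sketch
handles just top-level modules given by relative POSIX-style paths.

```python
import posixpath


def plan_paths(source_file, outdir, destdir):
    """Return (html_path, displayed_source_path) for one top-level module.

    `source_file` is assumed to be a relative path such as
    "lib/python/ya.py".
    """
    module_name = posixpath.splitext(posixpath.basename(source_file))[0]
    html_path = posixpath.join(outdir, module_name + ".html")
    displayed = posixpath.join(destdir, source_file)
    return html_path, displayed
```

For instance, plan_paths("lib/python/ya.py", "doc/", "/usr/local/") gives
("doc/ya.html", "/usr/local/lib/python/ya.py").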
None of this should cause backward compatibility issues.
Module loading
==============
Currently, safeimport(), used by writedocs(), gets only a module name
and imports it using __import__(). This means that when you document a
complete directory, you can only produce HTML documentation for modules
which are already installed, and that documentation for the wrong
version of a module may be produced. I changed safeimport() to get an
optional list of directories in which to look for the module/package, in
which case it uses imp.find_module and imp.load_module instead of
__import__. This means that when producing documentation for a complete
directory, the actual files in the directory are used, not installed
modules of the same name.
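
For readers on current Python versions, where the imp module is no longer
available, the same idea can be sketched with importlib. This is not the
patch's code: import_from_dirs is a hypothetical name, and the sketch simply
replaces any module of the same name already in sys.modules.

```python
import importlib.machinery
import importlib.util
import sys


def import_from_dirs(name, dirs):
    """Import module `name`, searching only the directories in `dirs`."""
    spec = importlib.machinery.PathFinder.find_spec(name, dirs)
    if spec is None:
        raise ImportError(f"no module named {name!r} in {dirs!r}")
    module = importlib.util.module_from_spec(spec)
    # Register before executing so imports inside the module can find it.
    sys.modules[name] = module
    try:
        spec.loader.exec_module(module)
    except BaseException:
        sys.modules.pop(name, None)
        raise
    return module
```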
Misc.
=====
I changed the docmodule methods, to use cStringIO.StringIO.write
instead of s = s + something.
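
The pattern looks roughly like this. It is shown with io.StringIO, the
Python 3 counterpart of cStringIO.StringIO; render_items is a made-up example
function, not one of the docmodule methods.

```python
import io


def render_items(items):
    """Build an HTML fragment by writing to a buffer instead of repeatedly
    concatenating strings."""
    buf = io.StringIO()
    for item in items:
        buf.write("<li>")
        buf.write(item)
        buf.write("</li>\n")
    return buf.getvalue()
```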
In do_GET, forceload was set to True for some reason. It caused
problems when documenting modules which pydoc itself uses, and I don't
see why modules should be forced to load, so I changed it to False.
The link to a file was "file:"+path. I changed it to "file://"+path,
because my browser (Mozilla) treated the path as a relative path.
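
On current Python versions a file URL for an absolute path can also be built
with pathlib, which additionally escapes characters such as spaces. This is
an alternative sketch, not what the patch does; file_url is a hypothetical
helper and assumes an absolute POSIX-style path.

```python
from pathlib import PurePosixPath


def file_url(path):
    """Return a file:// URL for an absolute POSIX path."""
    return PurePosixPath(path).as_uri()
```

For example, file_url("/usr/local/lib/python/foo.py") returns
"file:///usr/local/lib/python/foo.py".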
Attached is a diff against version 1.91 of pydoc.py (see the section on
public and private).
I would be glad if, after your review, at least some of these changes
were accepted.
Have a happy day,
Noam Raphael
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pydoc.py.diff
Type: text/x-patch
Size: 58687 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20040613/8e42d480/pydoc.py-0001.bin