I just released version 1.1 of epydoc. Epydoc is a tool for generating API documentation for Python modules, based on their docstrings.

<http://epydoc.sourceforge.net/>

A lightweight markup language called epytext can be used to format docstrings, and to add information about specific fields, such as parameters and instance variables.

For some examples of the documentation generated by epydoc, see:

- The API documentation for epydoc.
  <http://epydoc.sourceforge.net/api/>
- The API documentation for the Python 2.2 standard library.
  <http://epydoc.sourceforge.net/stdlib/>
- The API documentation for NLTK, the natural language toolkit.
  <http://nltk.sourceforge.net/ref/>

New features added since 1.0 include:

- A frames-based table of contents
- Documentation for builtin objects
- Documentation for types
- Improved navigation bars
- Improved warning messages
- Better documentation for variables
- An identifier index
- An improved graphical interface
- Man pages

A complete list of new features and the change log are available at:

<http://sourceforge.net/project/shownotes.php?release_id=119576>

-Edward
I just released version 1.1 of epydoc. Epydoc is a tool for generating API documentation for Python modules, based on their docstrings.
Would you mind comparing epydoc to the standard pydoc.py? --Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum wrote:
Would you mind comparing epydoc to the standard pydoc.py?
I compared epydoc to a number of other projects (pydoc is #1) at:

<http://epydoc.sourceforge.net/relatedprojects.html>

But the short answer is that I see 3 main differences between epydoc and pydoc:

1. Epydoc produces html output that looks more professional and is easier to read and navigate.
2. Epydoc supports the epytext markup language, which can be used to format docstrings, and to add information about specific fields, such as parameters and instance variables. (But note that use of epytext is not required.)
3. Pydoc provides excellent command-line (man-page style) and interpreter (pydoc.help) interfaces.

Pydoc is a great tool, and I would *not* advocate replacing it with epydoc. (Although I certainly wouldn't object to epydoc getting added to the standard library, if enough people like it. :) Alternatively, I wouldn't object to pydoc and epydoc getting merged, but it would be quite an undertaking, because they're fairly different under the hood.)

To be more specific, these are some of the important differences I see between epydoc and pydoc:

- The output produced by epydoc is easier to read and navigate (in my opinion, anyway).
  - Epydoc produces a frames-based table of contents.
  - Epydoc provides a "show/hide private" toggle button.
  - Epydoc creates a "trees" page with class & module hierarchies.
  - Epydoc creates an "index" page with term & identifier indices.
  - Epydoc includes a "help" page.
  - Epydoc includes a "breadcrumbs list" in the navigation bar, with pointers to the containing classes/modules/packages.
  - The navigation bar includes links to the "top" page and to the project's homepage.
  - Epydoc documents each class on its own page.
  - Epydoc uses external css stylesheets to allow for more customizable output.
  - Functions, methods, and variables are described with a shorter summary table and a longer details list.
  - Epydoc parses builtin function signatures.
  - Variable details include variable type, optional description, and colorized value.
  - Lists of known subclasses, base class trees, etc.
  - Classes are divided into normal classes and exceptions.
  - Pydoc's layout wastes a lot of horizontal space.
- Epydoc supports use of the epytext markup language.
  - Epytext can be used to document parameters, variables, etc.
  - Inline markup can be used to mark italics, bold, monospace, documentation links, URLs, index terms, etc.
  - Epytext can be used to create lists, sections, and literal blocks.
  - Epytext colorizes doctest blocks.
- Epydoc can be used to check documentation completeness.
- Epydoc has a graphical interface (e.g., for Windows users).
- Epydoc is fairly robust (e.g., it can document Zope 3). I haven't actually tested the robustness of pydoc, though.
- Epydoc inherits documentation for undocumented methods whose signatures match the base class method. (This can be disabled by adding a blank docstring to the undocumented method.)
- Some advantages of pydoc are:
  - It provides links to the source code for each module.
  - It can be used from the command-line to view manpage-like docs.
  - It can be used from within python (pydoc.help).
  - It automatically creates intra-documentation links (you might see this as a positive or a negative, since it sometimes creates links where there shouldn't be links; epydoc is more conservative, and will only create links if you tell it to, with epytext markup).
  - It currently has better support for python 2.2-style types (with wrapper_descriptors, etc.).
  - It does some processing of comments. Epydoc just uses docstrings.

I think that the best way to see the differences between their output is to navigate around the docs produced by each tool for the same code. The docs for the Python 2.2 standard library, from each tool, are at:

pydoc: <http://web.pydoc.org/2.2/>
epydoc: <http://epydoc.sourceforge.net/stdlib/>

Also, you might take a look at some of the docstrings written using epytext (e.g., see the source code for epydoc itself), and the documentation produced by those docstrings; or you could just look at "A Brief Introduction to Epytext" for a quick example:

<http://epydoc.sourceforge.net/epytextintro.html>

-Edward
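For readers who have not seen epytext, here is a small sketch of what a docstring with fields and inline markup can look like. The function `scale` and its parameters are made up for illustration; the field names (`@param`, `@type`, `@return`, `@rtype`) and the `C{...}` monospace markup are the ones described in the epytext introduction linked above, but check that page for the authoritative list.

```python
def scale(vector, factor=1.0):
    """
    Return a copy of C{vector} with every element multiplied by C{factor}.

    @param vector: The values to scale.
    @type vector: C{list} of C{float}
    @param factor: The multiplier applied to each element.
    @type factor: C{float}
    @return: A new list; C{vector} itself is not modified.
    @rtype: C{list} of C{float}
    """
    # Hypothetical example function; only the docstring layout matters here.
    return [element * factor for element in vector]
```

The docstring stays readable as plain text in the source, which is the property Edward emphasizes when comparing epytext with other markup styles.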
Edward Loper wrote:
I think that the best way to see the differences between their output is to navigate around the docs produced by each tool for the same code. The docs for the Python 2.2 standard library, from each tool, are at:
pydoc: <http://web.pydoc.org/2.2/> epydoc: <http://epydoc.sourceforge.net/stdlib/>
Nice work, Edward! The output from epydoc is very beautiful.

The pages produced by pydoc could be improved, though i think the major differences between them come from a difference in design intent. Put another way, epydoc and pydoc try to satisfy different constraints.

Here are pydoc's constraints so you can see what it was trying to achieve:

(a) It tries to stick to "one module -> one file".
(b) It tries not to present the same information twice.

It tries to minimize dependencies...

(c) on auxiliary files
(d) on browsers
(e) on code formatting

You may or may not agree that these constraints were good choices; perhaps they seem extreme to you. (There's a general philosophy of minimalism at work here: i wanted the viewer to be able to see a lot as quickly as possible. It's also nice to be able to update the doc file for a single module when you edit it, without having to regenerate everything.)

epydoc relaxes some of these constraints, and capitalizes on them to provide more functionality. So, to revisit Edward's list of differences:
1. Epydoc produces a frames-based table of contents.
2. Epydoc provides a "show/hide private" toggle button.
3. Epydoc creates a "trees" page with class & module hierarchies.
4. Epydoc creates an "index" page with term & identifier indices.
5. Epydoc includes a "help" page.
6. Epydoc includes a "breadcrumbs list" in the navigation bar, with pointers to the containing classes/modules/packages.
7. The navigation bar includes links to the "top" page and to the project's homepage.
8. Epydoc documents each class on its own page.
9. Epydoc uses external css stylesheets to allow for more customizable output.
10. Functions, methods, and variables are described with a shorter summary table and a longer details list.
11. Epydoc parses builtin function signatures.
12. Variable details include variable type, optional description, and colorized value.
13. Lists of known subclasses, base class trees, etc.
14. Classes are divided into normal classes and exceptions.
15. Pydoc's layout wastes a lot of horizontal space.
[...]
16. Epydoc supports use of the epytext markup language.
Some of these can be explained in terms of the differing constraints; others are just deficiencies (missing features in pydoc).

Why doesn't pydoc do...

1? Because of (d).
2? Because of (a).
3? Instead of one big page of trees, pydoc has little class trees on each module's page.
4? Missing feature.
5? Missing feature / didn't think it would be necessary.
6? pydoc does do this (breadcrumb links are in the header bar).
7? pydoc does do this (index link is in the header bar).
8? Because of (a).
9? Because of (c).
10? Because of (b).
11? Missing feature / didn't know there was an established convention.
12? (e): didn't want to impose a standard for describing variables. In most cases the type is redundant (the type is evident from the repr) and pydoc tends to be minimal about the use of space.
13? Missing feature.
14? Missing feature.
15? The bars on the left were intentionally placed there to provide context (as you scroll down a long page, it may not be visible what section you're in). You could say they're too fat though.
16? Because of (e).

-- ?!ng
Ka-Ping Yee wrote:
Here are pydoc's constraints so you can see what it was trying to achieve:
(a) It tries to stick to "one module -> one file". (b) It tries not to present the same information twice. It tries to minimize dependencies... (c) on auxiliary files (d) on browsers (e) on code formatting
Just for comparison, some of epydoc's constraints were:

- pretty/easy-to-navigate html output.
- support for documenting "fields" (parameters, variables, etc).
- a markup language that is very simple and clean, and has no hidden "gotcha" cases.
- a markup language that is powerful enough for most people's needs when writing API docs. (Well, at least for *my* needs :) )
- robustness.
- minimized dependencies on browsers (note that epydoc output looks quite good under text browsers like links, and old versions of netscape/ie).
- maximized information density (though perhaps not as strongly maximized as it is for pydoc).
You may or may not agree that these constraints were good choices; perhaps they seem extreme to you. (There's a general philosophy of minimalism at work here: i wanted the viewer to be able to see a lot as quickly as possible.
I can appreciate minimalism, and I stand by my statement that pydoc is a great tool that fills a very useful niche. I use it for its manpage-style output all the time.
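As a point of reference for the manpage-style output mentioned here, the following sketch asks pydoc for its plain-text rendering of an object from within Python. It assumes the pydoc module as shipped with current Python releases; `render_doc` and `plaintext` may not have existed in the Python 2.2 version discussed in this thread, and `manpage_text` is a hypothetical helper name.

```python
import pydoc


def manpage_text(obj):
    """Return pydoc's plain-text documentation for obj as a string."""
    # pydoc.plaintext renders without terminal bold/overstrike sequences.
    return pydoc.render_doc(obj, renderer=pydoc.plaintext)


text = manpage_text(len)
print(text)
```

Interactively, `pydoc.help(len)` (or the builtin `help`) pages the same kind of text to the terminal.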
1. Epydoc produces a frames-based table of contents.
1? Because of (d). [d=no dependency on browsers]
But note that the use of frames is totally optional for the viewer.
3. Epydoc creates a "trees" page with class & module hierarchies.
3? Instead of one big page of trees, pydoc has little class trees on each module's page.
This makes it harder to see how classes defined in different modules relate to each other.
2. Epydoc provides a "show/hide private" toggle button.
2? Because of (a). [a=one module/file]
8. Epydoc documents each class on its own page.
8? Because of (a). [a=one module/file]
What's the reasoning behind the one module/file criterion? I decided to put each class and method on its own page, because they seemed to be about the right sized conceptual "chunk." Also, this means that the "nesting" of objects on any given page is just 1-deep (modules->vars, modules->classes, classes->methods, classes->vars, etc.), whereas one module/file gives 2-deep nesting (modules->classes->methods, etc).
9. Epydoc uses external css stylesheets to allow for more customizable output.
9? Because of (c). [c=no dependence on auxiliary files]
The stylesheet can be safely ignored, and the pages still come out looking pretty nice. Is the reasoning behind this that you want to be able to grab a single html file by itself, and copy it somewhere? This suggests that one difference between pydoc and epydoc is that I think of the set of docs created by epydoc as a single coherent whole (that shouldn't ever really be split up), whereas it seems like you think of the docs created by pydoc as a set of related but independent files.
10. Functions, methods, and variables are described with a shorter summary table and a longer details list.
10? Because of (b). [b=no repetition of information]
That seems pretty reasonable, but if the docstrings get long, it can make it hard to scan through and quickly see what a module/class provides.
11. Epydoc parses builtin function signatures.
11? Missing feature / didn't know there was an established convention.
I seem to remember seeing a convention written in the python style guide somewhere that builtin functions should start with a 1-line signature (since the signature can't be divined via inspection). This convention is certainly followed by __builtin__, sys, os, os.path, etc. Feel free to rip out my algorithm and adapt it to your own code. It's in epydoc.objdoc.FuncDoc._init_builtin_signature, on line 1313 of epydoc/objdoc.py. It currently handles just about everything except for "zip" which I argue doesn't quite follow the normal conventions: zip(seq1 [, seq2 [...]]) -> [(seq1[0], seq2[0] ...), (...)] (I think my algorithm would recognize it with another comma either before or after the inner "[".) On the subject of ripping code from epydoc, there's other code that you might want to rip for inspect.py. E.g., see epydoc.uid._find_builtin_obj_module and epydoc.uid._find_function_module, which are more robust than the corresponding functionality provided by inspect.py.
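To make the convention concrete, here is a minimal sketch of pulling a signature out of the first docstring line of a builtin. This is not the algorithm in epydoc.objdoc.FuncDoc._init_builtin_signature; the regular expression and the name `parse_builtin_signature` are assumptions made for illustration, and the argument list is returned as an unparsed string.

```python
import re

# Hypothetical pattern for a first line of the form "name(args) -> result".
# The "-> result" part is optional.
_SIGNATURE = re.compile(
    r"^\s*(?P<name>[\w.]+)\((?P<args>[^)]*)\)\s*(?:->\s*(?P<ret>.+?))?\s*$"
)


def parse_builtin_signature(docstring):
    """Return (name, args, return_description), or None if no signature is found."""
    if not docstring:
        return None
    lines = docstring.strip().splitlines()
    if not lines:
        return None
    match = _SIGNATURE.match(lines[0])
    if match is None:
        return None
    return match.group("name"), match.group("args").strip(), match.group("ret")


print(parse_builtin_signature("len(object) -> integer\n\nReturn the number of items."))
```

A real implementation would go on to split the argument string into required and optional parameters, which is where bracket conventions like the one used by "zip" start to matter.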
12. Variable details includes variable type, optional description, and colorized value.
12? (e): didn't want to impose a standard for describing variables. [e=minimize dependencies on code formatting]
I think this falls under the category of things you can do with fields, if you want to allow fields (which pydoc doesn't, for reasonable reasons).
In most cases the type is redundant (the type is evident from the repr)
I think that type info can be pretty useful in some cases -- it's not always apparent from the repr. Also, this lets me provide a link to the type, when it's a class.
and pydoc tends to be minimal about the use of space.
I find epydoc's representation of variables much easier to read (multiline strings, colorized regexps, etc), but there's certainly no question that pydoc's representation is more compact. :)
15. Pydoc's layout wastes a lot of horizontal space.
15? The bars on the left were intentionally placed there to provide context (as you scroll down a long page, it may not be visible what section you're in). You could say they're too fat though.
Yeah, I think they're too fat. And when viewing the docs in text browsers, they're just dead space.
16. Epydoc supports use of the epytext markup language.
16? Because of (e). [e=no dependency on code formatting]
I think that this is a significant difference in goals for the two projects. But as I said on my related projects page, I think this may be one of the reasons that pydoc was able to become widely accepted. Of course, epydoc will treat all docstrings as plaintext if you tell it to. (Well, to be precise, if you use "--docformat plaintext", then the format for docstrings will default to plaintext, unless overridden on a per-module basis by the __docformat__ variable.) -Edward
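The per-module override described above can be pictured with a small sketch. The helper `docformat_for` is hypothetical and is not epydoc's code; it only shows the lookup order (module-level `__docformat__` first, then the command-line default), and it assumes that only the first word of the variable matters and that "epytext" is the default when nothing else is specified.

```python
import types


def docformat_for(module, default="epytext"):
    """Pick the docstring format for a module: __docformat__ wins over the default."""
    value = getattr(module, "__docformat__", None)
    words = str(value).split() if value else []
    if not words:
        return default
    # Assumption: the first word names the format, compared case-insensitively.
    return words[0].lower()


# Two stand-in modules: one declares its format, the other does not.
plain = types.ModuleType("plain_module")
plain.__docformat__ = "plaintext"
unmarked = types.ModuleType("unmarked_module")

print(docformat_for(plain))                          # module-level setting
print(docformat_for(unmarked, default="plaintext"))  # falls back to the default
```

In a real module the override is just a top-level assignment such as `__docformat__ = 'plaintext'`.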
On Thu, 31 Oct 2002, Edward Loper wrote:
- a markup language that is very simple and clean, and has no hidden "gotcha" cases.
I like the simplicity of epytext.
1. Epydoc produces a frames-based table of contents.
1? Because of (d). [d=no dependency on browsers]
But note that the use of frames is totally optional for the viewer.
Oh, i didn't realize that. Well done.
What's the reasoning behind the one module/file criterion? I decided to put each class and method on its own page, because they seemed to be about the right sized conceptual "chunk."
I guess it just made sense to me at the time not to have too many files. Navigating with the scroll bar is faster than loading a new page. It seemed convenient to have module-level functions and small utility classes kept together with the classes that use them. But i see good arguments both ways; in the end it's just a judgement call.
The stylesheet can be safely ignored, and the pages still come out looking pretty nice. Is the reasoning behind this that you want to be able to grab a single html file by itself, and copy it somewhere? This suggests that one difference between pydoc and epydoc is that I think of the set of docs created by epydoc as a single coherent whole (that shouldn't ever really be split up), whereas it seems like you think of the docs created by pydoc as a set of related but independent files.
Yeah, exactly. I didn't want to deal with tracking dependencies among the files to figure out what to update when a module was changed, and it seemed wasteful to redo everything. If i were to write pydoc today, i'd probably use a stylesheet, though. CSS support has improved a lot.
10. Functions, methods, and variables are described with a shorter summary table and a longer details list.
10? Because of (b). [b=no repetition of information]
That seems pretty reasonable, but if the docstrings get long, it can make it hard to scan through and quickly see what a module/class provides.
Yes, the summary tables are quite nice. -- ?!ng
Edward Loper wrote:
- Some advantages of pydoc are:
  - It provides links to the source code for each module.
  - It can be used from the command-line to view manpage-like docs.
  - It can be used from within python (pydoc.help).
  - It automatically creates intra-documentation links (you might see this as a positive or a negative, since it sometimes creates links where there shouldn't be links; epydoc is more conservative, and will only create links if you tell it to, with epytext markup).
  - It currently has better support for python 2.2-style types (with wrapper_descriptors, etc.).
  - It does some processing of comments. Epydoc just uses docstrings.
I like the output of epydoc a lot (except maybe for the dim colors ;-). Wouldn't it be possible to add most of the above in form of options to epydoc ? What I don't understand about epydoc is why it uses a syntax that's almost JavaDoc-style, but not all the way ? Think of it this way: Java programmers are usually very aware of JavaDoc style comments, so switching to epydoc for Python programming would probably cause them more trouble due to the subtle differences than someone who has never worked in this context before. Anyway, just a suggestion. Is the doc-string parser pluggable ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/
M.-A. Lemburg wrote:
Edward Loper wrote:
- Some advantages of pydoc are: [...]
I like the output of epydoc a lot (except maybe for the dim colors ;-). Wouldn't it be possible to add most of the above in form of options to epydoc ?
For some of these, they would be added (as defaults, not options) if I had time to code them. And there's plenty more on the epydoc todo list (see the comment at the bottom of epydoc.py/__init__.py). Others don't really go with epydoc's design philosophy. In particular, I doubt epydoc will ever automatically (implicitly) create intra-doc links. This can sometimes make mistakes, and puts links all over the place. I would rather have the user explicitly create links. I'm also unlikely to add support for processing python comments. And I doubt I'll add manpage-style and interpreter (pydoc.help) usage, because pydoc already does such a good job at it, and is already part of the standard library.
What I don't understand about epydoc is why it uses a syntax that's almost JavaDoc-style, but not all the way ?
Actually, the only real similarity between epytext and javadoc comments is that the @fields look roughly similar. E.g., note that you have to use explicit <p>'s in javadoc to mark paragraph boundaries; and you have to explicitly use <ul><li></ul> for lists, etc.

I find javadoc's markup conceptually ugly. The idea of allowing unrestricted html code in your docstring really bothers me. And it makes the docstrings very difficult to read when you're looking at the source code.

That said, it might be good to add support for javadoc-style docstrings, just because it would reduce the learning curve for java programmers. It wouldn't be that technically difficult to do; javadoc docstrings are basically just raw html plus @fields. And epydoc's docstring processing is pretty compartmentalized. But I only have limited time to spend on epydoc, and that's not a feature that I feel very motivated to add.

If someone else wants to add it, I'd certainly accept a patch. What would probably be involved is:

- Write epydoc/javadoc.py to parse javadoc-style comments. It would probably produce an xml document with a <javadoc> node that contains a <rawhtml> node followed by a <fieldlist> node similar to epytext's. Of course, if you wanted to handle javadoc's syntax for intradocumentation links, etc, you would need to do a little more work.
- Patch ObjDoc.__parse_docstring in epydoc/objdoc.py to recognize 'javadoc' as a value for __docformat__.
- Patch HTML_Formatter._dom_to_html_helper in epydoc/html.py to handle <rawhtml> elements.
- (Optionally) add all of the fields that javadoc implements that epydoc does not (e.g., @since and @deprecated).

Then you could just use "--docformat javadoc" to set the default docstring format to javadoc, or add "__docformat__='javadoc'" to each module that uses javadoc-style docstrings.
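The first step in that list (raw html followed by a field list) could start out roughly like the sketch below. Nothing here exists in epydoc: `split_javadoc` and `TAGS_WITH_ARGUMENT` are hypothetical names, the result is a plain tuple rather than an xml document, and javadoc's inline tags such as links are ignored.

```python
import re

# Hypothetical table: javadoc block tags whose first word is an argument.
TAGS_WITH_ARGUMENT = {"param", "throws", "exception"}

# A block tag is an "@word" at the start of a line.
_BLOCK_TAG = re.compile(r"^\s*@(\w+)[ \t]*", flags=re.MULTILINE)


def split_javadoc(docstring):
    """Split a javadoc-style docstring into (raw_html, fields)."""
    # re.split with one capture group yields [body, tag1, text1, tag2, text2, ...].
    parts = _BLOCK_TAG.split(docstring)
    raw_html = parts[0].strip()
    fields = []
    for tag, text in zip(parts[1::2], parts[2::2]):
        text = " ".join(text.split())  # collapse line breaks inside a field
        arg = None
        if tag in TAGS_WITH_ARGUMENT and text:
            arg, _, text = text.partition(" ")
        fields.append((tag, arg, text))
    return raw_html, fields


doc = "<p>Adds two numbers.</p>\n\n@param x the first value\n@return the sum\n"
print(split_javadoc(doc))
```

Note that the parser needs a table of which tags take an argument, because javadoc's syntax does not mark the end of the argument itself.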
Think of it this way: Java programmers are usually very aware of JavaDoc style comments, so switching to epydoc for Python programming would probably cause them more trouble due to the subtle differences than someone who has never worked in this context before.
I agree that this would reduce the learning curve for java programmers. And it might help make things more consistent for API docs of jython programs. But as I said, I think that javadoc comments are ugly. :) -Edward
Edward Loper wrote:
M.-A. Lemburg wrote:
Edward Loper wrote:
- Some advantages of pydoc are:
[...]
I like the output of epydoc a lot (except maybe for the dim colors ;-). Wouldn't it be possible to add most of the above in form of options to epydoc ?
For some of these, they would be added (as defaults, not options) if I had time to code them. And there's plenty more on the epydoc todo list (see the comment at the bottom of epydoc.py/__init__.py).
Others don't really go with epydoc's design philosophy. In particular, I doubt epydoc will ever automatically (implicitly) create intra-doc links. This can sometimes make mistakes, and puts links all over the place. I would rather have the user explicitly create links. I'm also unlikely to add support for processing python comments. And I doubt I'll add manpage-style and interpreter (pydoc.help) usage, because pydoc already does such a good job at it, and is already part of the standard library.
Hmm, that doesn't leave much ;-)
What I don't understand about epydoc is why it uses a syntax that's almost JavaDoc-style, but not all the way ?
Actually, the only real similarity between epytext and javadoc comments is that the @field's look roughly similar.
That's what I was looking at. Your @field defs look very similar, but aren't compatible, e.g. was there a reason to add colons ? (this is really what I'm interested in; not the HTML formatting used in JavaDoc)
E.g., note that you have to use explicit <p>'s in javadoc to mark paragraph boundaries; and you have to explicitly use <ul><li></ul> for lists, etc.
I find javadoc's markup conceptually ugly. The idea of allowing unrestricted html code in your docstring really bothers me. And it makes the docstrings very difficult to read when you're looking at the source code. That said, it might be good to add support for javadoc-style docstrings, just because it would reduce the learning curve for java programmers. It wouldn't be that technically difficult to do; javadoc docstrings are basically just raw html plus @field's. And epydoc's docstring processing is pretty compartmentalized. But I only have limited time to spend on epydoc, and that's not a feature that I feel very motivated to add.
If someone else wants to add it, I'd certainly accept a patch. What would probably be involved is:
- Write epydoc/javadoc.py to parse javadoc-style comments. It would probably produce an xml document with a <javadoc> node that contains a <rawhtml> node followed by a <fieldlist> node similar to epytext's. Of course, if you wanted to handle javadoc's syntax for intradocumentation links, etc, you would need to do a little more work.
- Patch ObjDoc.__parse_docstring in epydoc/objdoc.py to recognize 'javadoc' as a value for __docformat__.
- Patch HTML_Formatter._dom_to_html_helper in epydoc/html.py to handle <rawhtml> elements.
- (Optionally) add all of the fields that javadoc implements that epydoc does not (e.g., @since and @deprecated).
Then you could just use "--docformat javadoc" to set the default docstring format to javadoc, or add "__docformat__='javadoc'" to each module that uses javadoc-style docstrings.
Thanks for the instructions. I think I'll have a go once we're sure that we need this. -- Marc-Andre Lemburg
On Fri, 1 Nov 2002, M.-A. Lemburg wrote:
Actually, the only real similarity between epytext and javadoc comments is that the @field's look roughly similar.
That's what I was looking at. Your @field defs look very similar, but aren't compatible, e.g. was there a reason to add colons ? (this is really what I'm interested in; not the HTML formatting used in JavaDoc)
For those who are not familiar with javadoc, their field syntax is:

    @param x description...
    @author description...

Whereas my field syntax is:

    @param x: description...
    @author: description...

The problem I have with javadoc's syntax is that the markup language itself has to specify which fields take arguments and which don't. That goes against epytext's philosophy, where everything should be as simple and consistent as possible. By adding a colon, epytext doesn't need to know anything about what the set of fields is; and the later systems that actually use the fields can complain if they're invalid in some way.

Another reason for using the colon is that strings of the form "@...:" are much less likely to occur "naturally" than strings of the form "@...". (Another difference is that using a colon would let you provide optional arguments, but none of the fields I have defined use optional arguments, and I doubt that any will.)
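The point about the colon can be shown with a short sketch: one pattern recognizes any field, with or without an argument, and no list of field names is needed. This is an illustration only, not epytext's parser; `parse_field` and the exact regular expression are assumptions.

```python
import re

# "@tag:" or "@tag argument:" at the start of a line; the colon ends the header.
_FIELD = re.compile(
    r"^\s*@(?P<tag>\w+)(?:[ \t]+(?P<arg>[^:\n]+?))?:[ \t]*(?P<body>.*)$"
)


def parse_field(line):
    """Return (tag, argument, body) for a colon-terminated field line, else None."""
    match = _FIELD.match(line)
    if match is None:
        return None
    return match.group("tag"), match.group("arg"), match.group("body")


print(parse_field("@param x: description..."))
print(parse_field("@author: description..."))
print(parse_field("@param x description..."))  # javadoc style, no colon
```

Whether a given tag is allowed to have an argument is left to whatever consumes the parsed fields, which matches the division of labor described above.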
Thanks for the instructions. I think I'll have a go once we're sure that we need this.
Ok, thanks. -Edward
On Thu, 31 Oct 2002, M.-A. Lemburg wrote:
Think of it this way: Java programmers are usually very aware of JavaDoc style comments, so switching to epydoc for Python programming would probably cause them more trouble due to the subtle differences than someone who has never worked in this context before.
From the trenches, I've been doing Java development professionally since late 98. I've also been doing Python development since early 99 or so.
It took me 2 days to overhaul my Python project's API documentation to use Epytext with the @param things and I don't have any problem in keeping Javadoc and Epydoc tags separate in my mind. There are only a handful I use in each group, so it's pretty easy to keep straight. They could easily fit on a post-it note that could be stuck to your monitor. This isn't to say that there doesn't exist a group of folks who will get confused between the two, but I didn't. /will
Thanks for clarifying the differences between pydoc and epydoc. I think I'd like to see some of epydoc's features in pydoc, but I'm not sure how to do this given that the code bases are different, so I'll accept that these are separate tools. I do find epydoc's default color scheme hard to read (not enough contrast), but I'm sure that's customizable too. :-) --Guido van Rossum (home page: http://www.python.org/~guido/)