Re: [Python-Dev] Pydoc Improvements / Rewrite
Ron, I agree that pydoc could benefit a bit from some cleanup. As you point out, the ability to write quick viewers would be very helpful. I came across that when wanting to develop a script on a remote web server for which I only had FTP access: I ended up having to study pydoc more than I wanted in order to be able to build a display-the-doc cgi. However having two different modules might not be needed. Introspection is probably already available in the separate module 'inspect', and what a core pydoc would have to do is model the documentation (as a tree) and offer convenience functions to navigate the data. Besides that, there would be sub-modules for the different viewers for the documentation data - the interactive console being just one of the viewers. Finally, I would suspect that an API-breaking modification of the module would need time to be accepted. Maybe the original author of pydoc is considering changes as well, and joining efforts would be possible ? L. PS: I would also not go for a module name deliberately prefixed with "_" (as some people might associate that with protected or private objects).
Laurent Gautier wrote:
Ron,
I agree that pydoc could benefit a bit from some cleanup. As you point it out, the ability to write quick viewers would be very helpful. I came across that when wanting to develop script on a remote web server for which I only had FTP access: I ended up having to study pydoc more than I wanted in order to be able to build a display-the-doc cgi.
However having two different modules might not be needed.
I think whether it is two modules or more, or a package, is still an open issue. Others have suggested it may be better for it to be a package, so I'm continuing in that direction.
Introspection is probably already available in the separate module 'inspect', and what a code pydoc would have to do is model the documentation (as a tree) and offer convenience function to navigate the data. Beside that, there would be sub-modules for the different viewers for the documentation data - the interactive console being just one of the viewers.
Pydoc currently uses functions from the inspect module along with directly accessing attributes and other information where it is available. It's not a replacement for the inspect module. My first attempt used an xml tree to store the information, but to make that work requires also storing a fair amount of meta information in the tree as well. I found parsing the tree and the meta information to be more complex than using an objective approach which is (to me) more readable and easier to extend. But if you want to try it again, please do. You may come up with something far better than I did.
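As an illustration of the point about inspect, here is a minimal sketch of collecting documentation with the inspect module alone. The helper name collect_docs is hypothetical and is not part of pydoc or of the _pydoc.zip code; json is used only as a convenient standard-library module to document.

```python
import inspect
import json  # used here only as a convenient, documented stdlib module


def collect_docs(module):
    """Return {name: first docstring line} for a module's public functions and classes."""
    docs = {}
    for name, obj in inspect.getmembers(module):
        if name.startswith("_"):
            continue  # skip private and dunder names
        if inspect.isfunction(obj) or inspect.isclass(obj):
            doc = inspect.getdoc(obj) or ""
            docs[name] = doc.splitlines()[0] if doc else ""
    return docs


summary = collect_docs(json)
for name in sorted(summary):
    print(f"{name}: {summary[name]}")
```

Anything beyond this (signatures, class hierarchies, data attributes) needs the kind of direct attribute access described above, which is where most of pydoc's size comes from.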
Finally, I would suspect that an API-breaking modification of the module would need time to be accepted.
I was considering this for python 3.0, but one of the developers suggested it would be nice to have in python 2.6 and to move the discussion here. I think any API issues could be worked out. Are there any programs you know of, (yours?), that import pydoc besides the python console?
May be the original author of pydoc is considering changes as well, and joining effort would be possible ?
AUTHOR
Ka-Ping Yee <ping@lfw.org>
CREDITS
Guido van Rossum, for an excellent programming language.
Tommy Burnette, the original creator of manpy.
Paul Prescod, for all his work on onlinehelp.
Richard Chamberlain, for the first implementation of textdoc.
Ka-Ping reads python-dev but I'm cc'ing this to him just in case. (I used his python-dev email address since I don't know if the above one is current.)
Pydoc is a fairly complex program and it would definitely help if others took a look at various parts and made contributions and/or suggestions for making it better.
I may have also gotten a bit over my head, but I'm willing to stick it out and try to get it finished with any suggestions (and help) that anyone is willing to give me. There are also too many important issues for me to decide on my own, so this isn't something that can be done in isolation.
The download link again is: http://ronadam.com/dl/_pydoc.zip
It's not fully functional yet, but does run. Some parts like the command line file output options still need to be reimplemented. Some output formatting still needs to be cleaned up, and the MRO tree parsing section still needs to be put back in.
One question that I have at the moment is:
* Would it be good to have the "KEYWORD" and "TOPICS" info as included data objects or files, and possibly use that to generate the python html (and other) documentation for these? (Instead of the other way around like it is now.)
This would eliminate the requirement to install something extra in order for help on these items to work.
L.
PS: I would also not go for a module name deliberately prefixed with "_" (as some people might associate that with protected or private objects).
The underscore was just a "temporary" convenience to avoid the name conflict with the existing module. Whether to reuse the old name or use a new name is still one of the many open issues I think.
Ron
Hi Ron and Laurent, I welcome attempts to improve pydoc (especially since I don't have much time to work on improving it myself). I definitely agree that moving to CSS is long overdue, though I would like some input on the style of the produced pages. It's probably a good idea to explain how pydoc got to be the way that it is. The module boundary between inspect and pydoc is a pretty clear one, intended to isolate pydoc from future changes to Python's introspection features (such as attributes on internal types like frames and functions). On the other hand, I've often seen the question of why pydoc does both text and HTML generation instead of generating some intermediate data structure from which both kinds of output are produced. The answer is: I tried it. The result turned out to be longer than I expected and needlessly more complicated than what we have now. It may be that a better job could have been done, but I think there is a rational basis for why it turned out that way. The Python objects themselves already are a data structure containing all of the information we need. I discovered that translating this data structure into another data structure and then producing text or HTML was more work than simply producing text or HTML. With CSS, the last step gets even easier and so the intermediate stage becomes even less necessary. Also, the intermediate step required me to essentially invent an API, and I decided that I trusted the stability of Python's API more than that of some API I invented just for this. This is not to say that text and HTML generation can't be separated; it's just a caution against attempting to overgeneralize by creating an intermediate format. I'm glad you backed away from XML (or I'd have warned you that processing the XML would be a lot of extra work). The inspect module was intended to pull out as much as possible of the extraction functionality that's shared by the text and HTML documentation generators. But pydoc is still big. 
At the time I was proposing pydoc for addition to the standard library, I didn't want to pollute the top-level module namespace with too many names, so I tried hard to minimize the number of modules. And of course it has grown since then with bits of new functionality and support for new language features in Python. But now if a package is being considered, it makes sense to split out some of the pieces (as you have done), such as the web server, the search function, and the interactive interpreter help prompt. It may even enable pydoc to provide search from the interactive help prompt, which would be a nice feature! The package could contain several modules for ease of maintenance, while still providing a single, convenient command for running pydoc from the Unix prompt. -- ?!ng
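To make the "no intermediate data structure" argument concrete, here is a small sketch that produces both text and HTML straight from a live object. It assumes a current Python 3 pydoc, where render_doc accepts a renderer argument and pydoc.plaintext is available; the function greet is invented purely for the demonstration.

```python
import pydoc


def greet(name):
    """Return a friendly greeting for name."""
    return "Hello, " + name


# Plain-text rendering; pydoc.plaintext disables the overstrike "bold" markup.
text_page = pydoc.render_doc(greet, renderer=pydoc.plaintext)

# HTML rendering of the same live object; no intermediate tree is built.
html_page = pydoc.HTMLDoc().document(greet, "greet")

print(text_page)
print(html_page[:200])
```

Both renderers walk the same Python object, so the only shared layer is the object itself plus the helpers in inspect.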
Ka-Ping Yee wrote:
Hi Ron and Laurent,
I welcome attempts to improve pydoc (especially since I don't have much time to work on improving it myself). I definitely agree that moving to CSS is long overdue, though I would like some input on the style of the produced pages.
Additional input would be good. The html output I used is nearly pure nested definition lists with CSS styling to set the fonts, borders, and indents. It was a bit tricky in some places to keep it looking like the current pydoc pages. My mental target was something that would both look good printed and also fit in with Python's current web site design while not changing it too much. Changing the CSS file to produce differently styled pages should not be that difficult. A little experimenting would be good in order to find where additional style tags in the html code may be needed.
It's probably a good idea to explain how pydoc got to be the way that it is. The module boundary between inspect and pydoc is a pretty clear one, intended to isolate pydoc from future changes to Python's introspection features (such as attributes on internal types like frames and functions).
On the other hand, I've often seen the question of why pydoc does both text and HTML generation instead of generating some intermediate data structure from which both kinds of output are produced. The answer is: I tried it. The result turned out to be longer than I expected and needlessly more complicated than what we have now. It may be that a better job could have been done, but I think there is a rational basis for why it turned out that way.
Yes, I found it was a trade off from one type of complexity to another. And I didn't like importing something that will probably go through more changes like xmltree.
The Python objects themselves already are a data structure containing all of the information we need. I discovered that translating this data structure into another data structure and then producing text or HTML was more work than simply producing text or HTML. With CSS, the last step gets even easier and so the intermediate stage becomes even less necessary. Also, the intermediate step required me to essentially invent an API, and I decided that I trusted the stability of Python's API more than that of some API I invented just for this.
This is not to say that text and HTML generation can't be separated; it's just a caution against attempting to overgeneralize by creating an intermediate format. I'm glad you backed away from XML (or I'd have warned you that processing the XML would be a lot of extra work).
The inspect module was intended to pull out as much as possible of the extraction functionality that's shared by the text and HTML documentation generators. But pydoc is still big. At the time I was proposing pydoc for addition to the standard library, I didn't want to pollute the top-level module namespace with too many names, so I tried hard to minimize the number of modules. And of course it has grown since then with bits of new functionality and support for new language features in Python.
And it will continue to grow as python does. Hopefully we can make the process of supporting new language features easier.
But now if a package is being considered, it makes sense to split out some of the pieces (as you have done), such as the web server, the search function, and the interactive interpreter help prompt. It may even enable pydoc to provide search from the interactive help prompt, which would be a nice feature!
I think that could be done without too much trouble. It only takes adding a new allcaps word "FIND <something>" or "SEARCH <something>", in addition to KEYWORDS and TOPICS.
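A rough sketch of how such a "SEARCH <something>" request might be dispatched is shown below. The names SEARCH_SCOPE, search and handle_request are hypothetical and do not come from pydoc or from the code under discussion; the scope is limited to three standard-library modules so the example stays fast, whereas a real prompt would more likely build on pydoc's existing keyword search (pydoc -k, pydoc.apropos).

```python
import importlib
import inspect

# Hypothetical search scope; a real help prompt would scan far more modules.
SEARCH_SCOPE = ("json", "textwrap", "string")


def search(term, module_names=SEARCH_SCOPE):
    """Return sorted 'module.name' entries whose name or docstring mentions term."""
    term = term.lower()
    hits = []
    for module_name in module_names:
        module = importlib.import_module(module_name)
        for name, obj in inspect.getmembers(module):
            if name.startswith("_") or not callable(obj):
                continue
            doc = inspect.getdoc(obj) or ""
            if term in name.lower() or term in doc.lower():
                hits.append(f"{module_name}.{name}")
    return sorted(hits)


def handle_request(line):
    """Dispatch one help-prompt line; only the hypothetical SEARCH/FIND word is handled."""
    command, _, argument = line.strip().partition(" ")
    if command in ("SEARCH", "FIND") and argument:
        return search(argument)
    return None  # fall through to the existing help behaviour


print(handle_request("SEARCH wrap"))
```

Keeping the search function separate from the prompt dispatch means the same function could also serve an HTML viewer or an editor plugin.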
The package could contain several modules for ease of maintenance, while still providing a single, convenient command for running pydoc from the Unix prompt.
I was thinking of two convenient entry points: one for text and the interactive console, and one for html and the web browser interface (pyhelp and pydoc respectively). There is also the possibility of splitting it into two much smaller packages. One would be for the command line and interactive help console, with no html stuff or server stuff in it. This could be better controlled as it's used in python's console, and it would be easier to maintain as well. The other package (or module) would be an example of how to extend or build an application, an html formatter and help browser in this case, from the console help package. Cheers, Ron
2007/1/5, Ka-Ping Yee <python-dev@zesty.ca>: [cut]
On the other hand, I've often seen the question of why pydoc does both text and HTML generation instead of generating some intermediate data structure from which both kinds of output are produced. The answer is: I tried it. The result turned out to be longer than I expected and needlessly more complicated than what we have now. It may be that a better job could have been done, but I think there is a rational basis for why it turned out that way.
The Python objects themselves already are a data structure containing all of the information we need. I discovered that translating this data structure into another data structure and then producing text or HTML was more work than simply producing text or HTML. With CSS, the last step gets even easier and so the intermediate stage becomes even less necessary. Also, the intermediate step required me to essentially invent an API, and I decided that I trusted the stability of Python's API more than that of some API I invented just for this.
Point well taken. This is very sensible. I would still try to keep a common, presenter-independent component. Rather than a sole distinction console/HTML, I would think of a distinction between stateless and interactive presenters, and still have the console and static HTML as specific presenters. The search functions Ron suggested would be part of that presenter-independent part (and for the refinement, stateless vs. interactive would make sense for performance). The distinction may look like an unnecessary complication... but I would think that it does not have to be complicated, and that the number of practical things it would allow would make it almost necessary (ah ! delusions ;-) ). There are a number of python editors/consoles/IDEs around, some of which are implemented in python, and having the necessary infrastructure to easily implement documentation presenters would be very nice.
This is not to say that text and HTML generation can't be separated; it's just a caution against attempting to overgeneralize by creating an intermediate format. I'm glad you backed away from XML (or I'd have warned you that processing the XML would be a lot of extra work).
Your warning regarding the creation of a n-th data structure is completely agreed upon. I also understand your point about the dangers of overgeneralizing.
The inspect module was intended to pull out as much as possible of the extraction functionality that's shared by the text and HTML documentation generators. But pydoc is still big. At the time I was proposing pydoc for addition to the standard library, I didn't want to pollute the top-level module namespace with too many names, so I tried hard to minimize the number of modules. And of course it has grown since then with bits of new functionality and support for new language features in Python.
But now if a package is being considered, it makes sense to split out some of the pieces (as you have done), such as the web server, the search function, and the interactive interpreter help prompt. It may even enable pydoc to provide search from the interactive help prompt, which would be a nice feature! The package could contain several modules for ease of maintenance, while still providing a single, convenient command for running pydoc from the Unix prompt.
Having things already split by Ron is probably a good starting base (with generalization introduced progressively, if agreed upon). I see that there is debate on the format for documentation strings; maybe there is room for flexibility as well regarding how the strings are utilized. The search would be useful not only to the python console, but also to editors and other python programs, as well as to stateless presenters (the case I had was working on a web-hosting server to which I only had FTP access and on which I did not know the python version or the packages installed. Hey, what about an Ajax-capable HTML viewer for the documentation ?) - Laurent
-- ?!ng
Ron, Thanks for your detailed answer. I inserted comments below. 2007/1/5, Ron Adam <rrr@ronadam.com>:
Laurent Gautier wrote: [cut]
Introspection is probably already available in the separate module 'inspect', and what a code pydoc would have to do is model the documentation (as a tree) and offer convenience function to navigate the data. Beside that, there would be sub-modules for the different viewers for the documentation data - the interactive console being just one of the viewers.
Pydoc currently uses functions from the inspect module along with directly accessing attributes and other information where it is available. It's not a replacement for the inspect module.
My first attempt used an xml tree to store the information, but to make that work requires also storing a fair amount of meta information in the tree as well. I found parsing the tree and the meta information to be more complex than using an objective approach which is (to me) more readable and easier to extend. But if you want to try it again, please do. You may come up with something far better than I did.
Well, I was coining the idea from the understanding that the main split was console viewer vs other viewer. I was thinking of a design along the lines of the Model-View-Presenter pattern.... and I guess that I will have to read your code if I want to debate on that ;-).
Finally, I would suspect that an API-breaking modification of the module would need time to be accepted.
I was considering this for python 3.0, but one of the developers suggested it would be nice to have in python 2.6 and to move the discussion here.
I think any API issues could be worked out. Are there any programs you know of, (yours?), that import pydoc besides the python console?
What I did barely qualifies as a hack for my own usage -it won't count-.
Off the top of my head, "ipython" (the excellent interactive console) is possibly using pydoc (in any case, I would say that the authors would be interested in developments with pydoc)
Otherwise a quick search led to:
- "cgitb" (!? - it seems that only the HTML formatting functions of pydoc are in use - wouldn't these functions belong more naturally to "cgi" ?)
- "DocXMLRPCServer" (hey ! it looks like kind-of what I was needing !!!).
- "happydoc" (reportedly having problems with python 2.4 - I am not sure that it is maintained)
"cgitb" and "DocXMLRPCServer" are both distributed bundled with Python.
"cgitb" seems to be mostly using HTML formatting helpers (and that would suggest the need for an HTML-rendering module; maybe for a future improvement, a first step would be to separate the rendering/viewing from extraction and modeling of documentation data).
"DocXMLRPCServer" looks (at first sight), like a viewer that would be bundled with pydoc as a sub-module (i.e., module in a package).
[cut: Ka-Ping Yee <ping@lfw.org> is now in the loop]
Pydoc is a fairly complex program and it would definitely help if others took a look at various parts and made contributions and or suggestions to making it better.
Well, I stumbled upon your recent posts in python-ideas (from which I tracked down the one on python-dev), and when I looked into it I thought that it would be a *lot* of work for one person. (more on that in the next inlined comment)
I may have also gotten a bit over my head, but I'm willing to stick it out and try to get it finished with any suggestions (and help) that any one is willing to give me. There are also too many important issues for me to be decided, so this isn't something that can be done in isolation.
The download link again is:
I would be willing to help out, as probably others will as well (I found blogs and posts of people discussing pydoc, it might be worthwhile dropping a line to the people - we can discuss that off-list if you wish), but maybe on one condition. I do not think it will work as a zip file shuttled around (in my experience). A versioning system would be extremely helpful (SVN or CVS would come to my mind). Well, if you are ok with having the source tree hosted in SVN/CVS/alike, I am in (opening an account on sourceforge or savannah -for example- would be the next step then, as it can take a few days for a project to be approved)
It's not fully functional yet, but does run. Some parts like the command line file output options still need to be reimplemented. Some output formatting still needs to be cleaned up, and the MRO tree parsing section still needs to be put back in.
I will have a look then.
One question that I have at the moment is:
* Would it be good to have the "KEYWORD" and "TOPICS" info as included data objects or files, and possibly use that to generate the python html (and other) documentation for these? (Instead of the other way around like it is now.)
This would eliminate the requirement to install something extra in order for help on these items to work.
I see that I was slow to write this email. I will read the following before commenting further. L.
L.
PS: I would also not go for a module name deliberately prefixed with "_" (as some people might associate that with protected or private objects).
The underscore was just a "temporary" convenience to avoid the name conflict with the existing module. Whether to reuse the old name or use a new name is still one of the many open issues I think.
Ron
On Fri, Jan 05, 2007 at 06:01:22PM +0800, Laurent Gautier wrote:
Well, if you are ok with having the source tree hosted in a SVN/CVS/alike I am on (opening an account on sourceforge or savannah -for example- would be the next step then, as it can take few days for a project to be approved)
The Python SVN repository has a sandbox/ directory that's intended for storing code in development; you could certainly use that. --amk
2007/1/5, A.M. Kuchling <amk@amk.ca>:
On Fri, Jan 05, 2007 at 06:01:22PM +0800, Laurent Gautier wrote:
Well, if you are ok with having the source tree hosted in a SVN/CVS/alike I am on (opening an account on sourceforge or savannah -for example- would be the next step then, as it can take few days for a project to be approved)
The Python SVN repository has a sandbox/ directory that's intended for storing code in development; you could certainly use that.
I suspect that this is aside from the rest of the python source tree. (or I would anticipate peppered emails if the module is broken during its early days -and it will- ). I also suspect that write access is not granted easily (as I know a number of python modules being on sourceforge before being ultimately included in the standard modules).
--amk
On Fri, Jan 05, 2007 at 09:12:28PM +0800, Laurent Gautier wrote:
I suspect that this is aside from the rest of the python source tree. (or I would anticipate peppered emails if the module is broken during its early days -and it will- ).
Correct; it can be browsed at <http://svn.python.org/view/sandbox/trunk/>
I also suspect that write access is not granted easily (as I know a number of python modules being on sourceforge before being ultimately included in the standard modules).
I think most of those projects weren't initially started with the goal of stdlib inclusion. The problems with a separate Sourceforge project are:
* fewer people will see the commit messages.
* if the project isn't completed, everyone will forget about the SF project in a few years.
In sandbox/, at least the code will still be somewhat visible; maybe someone would finish it.
--amk
Laurent Gautier wrote:
Ron,
Thanks for your detailed answer. I inserted comments below.
You're welcome.
I think any API issues could be worked out. Are there any programs you know of, (yours?), that import pydoc besides the python console?
What I did barely qualifies as a hack for my own usage -it won't count-.
It could be that these changes will give you a way to do the same thing in a less hackish way.
From the top of my head, there might be "ipython" (the excellent interactive console) is possibly using pydoc (in any case, I would say that the authors would be interested in developments with pydoc)
According to the web site, ipython is based on twisted, and is currently still limited to python 2.3 and 2.4. Also, the output of the help() function will not change much so I doubt it would be a problem for them.
Otherwise a quick search lead to: - "cgitb" (!? - it seems that the HTML formatting functions of pydoc are only in use - wouldn't these functions belong more naturally to "cgi" ?)
Thanks!, These html formatting functions still exist or are small enough to move into cgitb, so it will be just a matter of making sure they can be reached. I don't think they will be a problem.
- "DocXMLRPCServer" (hey ! it looks like kind-of what I was needing !!!).
Thanks again. This might be better to move into the pydoc package. Any opinions?
- "happydoc" (reportedly having problems with python 2.4 - I am not sure that it is maintained)
Happydoc does not import pydoc as far as I could tell, so this won't affect them in any direct way I think. They've pretty much implemented everything from scratch. At worst they would just need to copy the parts from the older version into their distribution. I think you got a false positive on this because pydoc is a substring of happydoc.
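The substring false positive can be avoided by looking at actual import statements instead of searching the text. The sketch below uses the standard ast module; the helper name imports_module is hypothetical. Note that ast.parse only accepts the syntax of the Python version running it, so older sources may need converting first.

```python
import ast


def imports_module(source, target="pydoc"):
    """Return True if source imports target (or one of its submodules)."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0:
            # level == 0 means an absolute "from X import ..." statement
            names = [node.module or ""]
        else:
            continue
        if any(n == target or n.startswith(target + ".") for n in names):
            return True
    return False


print(imports_module("import pydoc\n"))
print(imports_module("from pydoc import html\n"))
print(imports_module("import happydoc\n"))
```

With this check, a file that only mentions or imports happydoc is no longer counted as a pydoc user.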
"cgitb" and "DocXMLRPCServer" are both distributed bundled with Python.
"cgitb" seems to be mostly using HTML formatting helpers (and that would suggest the need for an HTML-rendering module - may be for a future improvement, a first step would be separate the rendering/viewing from extraction and modeling of documentation data).
Making sure these still work would be a good sub project for someone a little later. (I'll do it if no one else has time or wants to.) I'm trying not to change things too drastically. If the changes are too big, i.e. introducing and altering a lot of other modules, then this will need to move back to the python-3000 list.
"DocXMLRPCServer" looks (at first sight), like a viewer that would be bundled with pydoc as a sub-module (i.e., module in a package).
Yes, that was my thought too. :-) But moving it may need to wait until python-3000. (?)
Pydoc is a fairly complex program and it would definitely help if others took a look at various parts and made contributions and or suggestions to making it better.
Well, I stumbled upon your recent posts in python-ideas (that I tracked up the one in python-devel) because I looked into it I thought that it would be a *lot* of work for one person. (more on that in the next inlined comment)
It was tedious going through the module, but now that it's split up into smaller parts it shouldn't be too difficult.
I may have also gotten a bit over my head, but I'm willing to stick it out and try to get it finished with any suggestions (and help) that any one is willing to give me. There are also too many important issues for me to be decided, so this isn't something that can be done in isolation.
The download link again is:
I would be willing to help out, as probably others will as well (I found blogs and posts of people discussing pydoc, it might be worthwhile dropping a line to the people - we can discuss that off-list if you wish), but may be at one condition.
I do not think it will work as a zip file shuttled around (in my experience). A versioning system would be extremely helpful (SVN, or CVS. would come to my mind). Well, if you are ok with having the source tree hosted in a SVN/CVS/alike I am on (opening an account on sourceforge or savannah -for example- would be the next step then, as it can take few days for a project to be approved)
I don't have any experience with SVN, but it could be an opportunity to learn something new. I have a couple of other projects that could benefit from moving to SVN or a CVS-like system. What I intended was to get enough feedback that I could get it to a point where it's 90% done and then upload it as a patch where it could be further polished up. To do that, I first need to verify that my design goals are the correct approach. (see my reply to Ka-Ping.) If they are, then it's more a matter of cleaning up what I've already started. If not, then it's a bit (or bunch) more work and would benefit from having others work on it in a more formal way. If someone who has more experience with group projects would like to manage it, that would be good too. That may speed things up considerably.
I will have a look then.
Great. :-) Cheers, Ron
Ron Adam wrote:
Laurent Gautier wrote:
From the top of my head, there might be "ipython" (the excellent interactive console) is possibly using pydoc (in any case, I would say that the authors would be interested in developments with pydoc)
Certainly :) I'd like to ask whether this discussion considers any kind of merging of pydoc with epydoc. Many projects (ipython included) are moving towards epydoc as a more capable system than pydoc, though it would be nice to see its API-doc-generation capabilities be part of the stdlib. I don't know if that's considered either too large or too orthogonal to the current direction.
According to the web site, ipython is based on twisted, and is currently still limited to python 2.3 and 2.4. Also, the output of the help() function will not change much so I doubt it would be a problem for them.
A few corrections:
- IPython's trunk is NOT based on twisted at all, it's a self-contained Python package which depends only on the Python standard library (plus readline support under windows, which we also provide but requires ctypes).
- The ipython development branch does use twisted, but only for its distributed and parallel computing capabilities. Eventually when that branch becomes trunk, there will /always/ be a non-twisted component for local, terminal-based work.
- The last release (0.7.3) fully supports Python 2.5. In fact, one nasty bug in 2.5 with extremely slow traceback generation was kindly fixed by python-dev in the nick of time after my pestering (an ipython user found it first and brought it to my attention).
Otherwise a quick search lead to: - "cgitb" (!? - it seems that the HTML formatting functions of pydoc are only in use - wouldn't these functions belong more naturally to "cgi" ?)
Thanks!, These html formatting functions still exist or are small enough to move into cgitb, so it will be just a matter of making sure they can be reached. I don't think they will be a problem.
If anyone is interested in updating cgitb, you might want to look at ipython's ultratb (which was born as a cgitb port to ANSI terminals): http://projects.scipy.org/ipython/ipython/browser/ipython/trunk/IPython/ultr... It contains functionality for generating /extremely/ detailed tracebacks with gobs of local detail. These verbose tracebacks have let me fix many ipython bugs from crash dumps triggered by remote code and libraries I don't have access to, in cases where a normal traceback would have been insufficient. Here's a link to a slightly outdated simple example (the formatting has improved a bit): http://ipython.scipy.org/screenshots/snapshot1.png Obviously the right thing to do would be to separate the ANSI coloring from the structural formatting, so that the traceback could be formatted as HTML, ANSI colors or anything else. That is very much on my todo list, since the network-enabled ipython will have browser-based interfaces in the future. All of ipython is BSD licensed, so any code in there is for the taking. Best, f
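The separation described above, structure first and colouring second, could look roughly like the following sketch. It uses only the standard traceback and html modules and is not ultratb's code; extract_frames, render_ansi and render_html are hypothetical names chosen for the illustration.

```python
import html
import traceback


def extract_frames(exc):
    """Structural step: reduce an exception's traceback to plain dictionaries."""
    return [
        {"file": f.filename, "line": f.lineno, "func": f.name, "code": f.line or ""}
        for f in traceback.extract_tb(exc.__traceback__)
    ]


def render_ansi(frames):
    """Presentation for terminals: the function name is wrapped in ANSI bold."""
    bold, reset = "\033[1m", "\033[0m"
    return "\n".join(
        f"{fr['file']}:{fr['line']} in {bold}{fr['func']}{reset}\n    {fr['code']}"
        for fr in frames
    )


def render_html(frames):
    """Presentation for browsers: the same frames as an escaped ordered list."""
    items = "".join(
        "<li>{}:{} in <b>{}</b><pre>{}</pre></li>".format(
            html.escape(fr["file"]), fr["line"],
            html.escape(fr["func"]), html.escape(fr["code"]),
        )
        for fr in frames
    )
    return "<ol>" + items + "</ol>"


def failing():
    return 1 / 0


try:
    failing()
except ZeroDivisionError as exc:
    frames = extract_frames(exc)

print(render_ansi(frames))
print(render_html(frames))
```

Because both renderers consume the same list of frames, adding another output format only means writing one more small function.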
At 06:35 PM 1/5/2007 -0700, Fernando Perez wrote:
Ron Adam wrote:
Laurent Gautier wrote:
Off the top of my head, "ipython" (the excellent interactive console) is possibly using pydoc (in any case, I would say that the authors would be interested in developments with pydoc)
Certainly :) I'd like to ask whether this discussion considers any kind of merging of pydoc with epydoc.
Unless there's been a complete rewrite of epydoc since the last time I looked at it, I'd have to give a very strong -1 against epydoc; it has all the problems of pydoc, plus new ones.
On Jan 5, 2007, at 9:41 PM, Phillip J. Eby wrote:
Unless there's been a complete rewrite of epydoc since the last time I looked at it, I'd have to give a very strong -1 against epydoc; it has all the problems of pydoc, plus new ones.
I haven't read this entire thread, so I'll just chime in to say that I've /used/ epydoc and like it quite a bit. I've even hacked on it a little to fix a few things and it didn't seem that bad, though I didn't do any major work on its internals. I've used both the 2.x version and the 3.x version but I haven't used anything in the last, I dunno, 4 or 5 months. Note that I was using it on a heavily embedded/extended application and it did a pretty good job of pulling docs out of C coded docstrings. I had to patch Python a bit here and there (I think most of those fixes are in Python 2.5) and I know that the epydoc guys fixed a few things related to C types (e.g. such as that the tp_doc has to document both the type and the constructor). Probably the biggest issue that I remember was needing to invoke it programmatically, which was an absolute requirement for us, since none of the extension modules were importable unless epydoc was run inside the embedded environment. I got it to work, but it was a bit of a pain. If you've already explained it, that's fine, but if not, could you outline what you have against epydoc?
-Barry
At 12:16 AM 1/6/2007 -0500, Barry Warsaw wrote:
If you've already explained it, that's fine, but if not, could you outline what you have against epydoc?
The last I tried to work with it, it had even more hardcoded typechecking than pydoc does, spread out over more of the code base. Also, over at OSAF, I've been repeatedly called upon to sort out some peculiarity in how it discovers things or how it handles types it doesn't recognize. My net impression has been that it's very brittle, even more so than pydoc. On the plus side, there are some very good ideas and a *lot* of good execution in there, but its extensibility has historically struck me as non-existent. To be fair, the last time I had to deal with any problems with it at OSAF was almost a year ago, if memory serves. I don't know if anything has improved in it since then. The last time I seriously analyzed its internal architecture was several years ago (maybe 5?) when I was investigating it as an alternative to HappyDoc for doing PEAK's API documentation. I could never get it to work on anything but a small subset of PEAK without crashing in any of several ways, including segfaulting its GUI! It had built into it a variety of restrictive assumptions about how programs are structured that were not compatible with what I was doing. pydoc at least only crashed when dealing with metaclass instances, but I believe that was fixed in 2.3 or a late 2.2.x release. Anyway, I like the *idea* of epydoc and a lot of its execution, but IMO it needs just as much work as pydoc, if not more.
Phillip J. Eby wrote:
At 12:16 AM 1/6/2007 -0500, Barry Warsaw wrote:
If you've already explained it, that's fine, but if not, could you outline what you have against epydoc?
The last I tried to work with it, it had even more hardcoded typechecking than pydoc does, spread out over more of the code base. Also, over at OSAF, I've been repeatedly called upon to sort out some peculiarity in how it discovers things or how it handles types it doesn't recognize.
My net impression has been that it's very brittle, even more so than pydoc.
On the plus side, there are some very good ideas and a *lot* of good execution in there, but its extensibility has historically struck me as non-existent.
To be fair, the last time I had to deal with any problems with it at OSAF was almost a year ago, if memory serves. I don't know if anything has improved in it since then.
FWIW, a 3.0a3 was released in August 2006, and according to the History, "Significant portions of epydoc were written for version 3.0." It seems a lot of that was to add parsing as a complementary means to extract documentation. I'm not particularly familiar with the introspection code of either 2.1 or 3.0a3, but a cursory examination shows that 3.0a3 has an introspecter registry that 2.1 doesn't:

    # In epydoc.docintrospecter:

    def register_introspecter(applicability_test, introspecter, priority=10):
        """
        Register an introspecter function.  Introspecter functions take
        two arguments, a python value and a C{ValueDoc} object, and should
        add information about the given value to the the C{ValueDoc}.
        Usually, the first line of an inspecter function will specialize
        it to a sublass of C{ValueDoc}, using L{ValueDoc.specialize_to()}:

            >>> def typical_introspecter(value, value_doc):
            ...     value_doc.specialize_to(SomeSubclassOfValueDoc)
            ...     <add info to value_doc>

        @param priority: The priority of this introspecter, which determines
        the order in which introspecters are tried -- introspecters with lower
        numbers are tried first.  The standard introspecters have priorities
        ranging from 20 to 30.  The default priority (10) will place new
        introspecters before standard introspecters.
        """

--
Robert Kern
"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
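For readers who want to see how a registry of this kind replaces hardcoded type checks, here is a minimal sketch of the idea. It is not epydoc's implementation: only the function name and the "lower priority number is tried first" rule come from the docstring quoted above. The plain dict standing in for ValueDoc, the introspect dispatcher and the two describe_* functions are assumptions made for illustration.

```python
import inspect

# Hypothetical registry: (priority, applicability_test, introspecter) entries.
_registry = []


def register_introspecter(applicability_test, introspecter, priority=10):
    """Add an introspecter; lower priority numbers are tried first."""
    _registry.append((priority, applicability_test, introspecter))
    _registry.sort(key=lambda entry: entry[0])  # stable, keeps insertion order


def introspect(value):
    """Dispatch to the first registered introspecter whose test accepts value."""
    for _priority, test, introspecter in _registry:
        if test(value):
            doc = {}  # a plain dict stands in for epydoc's ValueDoc here
            introspecter(value, doc)
            return doc
    return {"kind": "unknown"}


def describe_function(value, doc):
    doc["kind"] = "function"
    doc["signature"] = str(inspect.signature(value))


def describe_class(value, doc):
    doc["kind"] = "class"
    doc["bases"] = [base.__name__ for base in value.__bases__]


# "Standard" introspecters, registered at priority 20 as in the quoted docstring.
register_introspecter(inspect.isfunction, describe_function, priority=20)
register_introspecter(inspect.isclass, describe_class, priority=20)
```

A caller that needs special handling for an unusual type registers its own test and introspecter at the default priority of 10 and is consulted before the standard ones, without editing the dispatcher.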
2007/1/6, Ron Adam <rrr@ronadam.com>:
Laurent Gautier wrote: [cut]
I think any API issues could be worked out. Are there any programs you know of, (yours?), that import pydoc besides the python console?
What I did barely qualifies as a hack for my own usage -it won't count-.
It could be these changes will give you a way to do the same thing in a less hackish way.
This is precisely why I got myself into the present trouble ;-)
Off the top of my head, "ipython" (the excellent interactive console) is possibly using pydoc (in any case, I would say that the authors would be interested in developments with pydoc)
According to the web site, ipython is based on twisted, and is currently still limited to python 2.3 and 2.4. Also, the output of the help() function will not change much so I doubt it would be a problem for them.
Sorry for answering a bit off-topic. My meaning was that they would be interested in knowing that "pydoc" is changing (and would surely have ideas).
Otherwise a quick search led to: - "cgitb" (!? - it seems that only the HTML formatting functions of pydoc are in use - wouldn't these functions belong more naturally to "cgi" ?)
Thanks! These HTML formatting functions still exist or are small enough to move into cgitb, so it will be just a matter of making sure they can be reached. I don't think they will be a problem.
I read your comment about not having too many things changed for 2.6 (or that will be bumped to 3000). A suggestion I would have would be to create an html/htmlrender module in the pydoc-package-to-be and start putting all the HTML formatting functions over there (as far as I can see, they are completely independent of pydoc). You can then create wrappers to the original functions including a deprecation warning. You can refer to Michael Chermside's recipe for an example of implementation with a deprecation decorator - http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/391367 The suggestion above would actually apply to *anything* that is changed in pydoc, providing the benefit of allowing the necessary changes while having a temporary API to provide back-compatibility.
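The wrapper-plus-deprecation-warning idea can be sketched as below. This is not the recipe linked above, only a small example in the same spirit using the standard warnings module. The function names render_heading and heading, and the module path htmlrender, are hypothetical placeholders.

```python
import functools
import warnings


def deprecated(replacement):
    """Decorator factory: warn on each call, then run the wrapped function."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                "%s is deprecated; use %s instead" % (func.__name__, replacement),
                DeprecationWarning,
                stacklevel=2,  # point the warning at the caller, not the wrapper
            )
            return func(*args, **kwargs)
        return wrapper
    return decorator


def render_heading(title):
    """Hypothetical function at its new location (e.g. an htmlrender module)."""
    return "<h1>%s</h1>" % title


@deprecated("htmlrender.render_heading")
def heading(title):
    """Old entry point kept only for backward compatibility."""
    return render_heading(title)
```

Existing callers of heading() keep working and are told where the function moved, which gives a release or two of overlap before the old name is removed.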
- "DocXMLRPCServer" (hey ! it looks like kind-of what I was needing !!!).
Thanks again. This might be better to move into the pydoc package. Any opinions?
We both pretty much agree, but now we've got to find the modules depending on DocXMLRPCServer (and hope the recursion won't go on for too long). We would also have to contact the author of DocXMLRPCServer (moving it would go with potential changes, and he would certainly have suggestions about that).
- "happydoc" (reportedly having problems with python 2.4 - I am not sure that it is maintained)
Happydoc does not import pydoc as far as I could tell, so this won't affect them in any direct way I think. They've pretty much implemented everything from scratch. At worst they would just need to copy the parts from the older version into their distribution.
I think you got a false positive on this because pydoc is a substring of happydoc.
Yes this was a false positive. I kept it in the list thinking there might be interesting ideas there as well (but I forgot to label it as such - sorry for the confusion).
"cgitb" and "DocXMLRPCServer" are both distributed bundled with Python.
"cgitb" seems to be mostly using HTML formatting helpers (and that would suggest the need for an HTML-rendering module - maybe for a future improvement; a first step would be to separate the rendering/viewing from extraction and modeling of documentation data).
Making sure these still work would be a good sub project for someone a little later. (I'll do it if no one else has time or wants to.) I'm trying not to change things too drastically. If the changes are too big, i.e. introducing and altering a lot of other modules, then this will need to move back to the python-3000 list.
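The "separate the rendering/viewing from extraction and modeling of documentation data" step discussed above might look roughly like this. It is a sketch under simplifying assumptions, not the design of the pydoc rewrite: it only models a class and its plain methods, and the names build_doc_tree, render_text and render_html are hypothetical.

```python
import html
import inspect


def build_doc_tree(obj, name=None):
    """Model step: collect names and docstrings as nested dicts, no formatting."""
    node = {
        "name": name or getattr(obj, "__name__", repr(obj)),
        "doc": inspect.getdoc(obj) or "",
        "children": [],
    }
    if inspect.isclass(obj):
        # Only public plain functions defined directly on the class are modeled.
        for child_name, child in sorted(vars(obj).items()):
            if not child_name.startswith("_") and inspect.isfunction(child):
                node["children"].append(build_doc_tree(child, child_name))
    return node


def render_text(node, indent=0):
    """Console viewer: indented plain text."""
    pad = "    " * indent
    lines = [pad + node["name"]]
    lines.extend(pad + "    " + line for line in node["doc"].splitlines())
    for child in node["children"]:
        lines.append(render_text(child, indent + 1))
    return "\n".join(lines)


def render_html(node):
    """HTML viewer: walks the same tree, emits nested div elements."""
    children = "".join(render_html(child) for child in node["children"])
    return "<div><h3>%s</h3><p>%s</p>%s</div>" % (
        html.escape(node["name"]), html.escape(node["doc"]), children)


class Greeter:
    """Say hello."""

    def greet(self, name):
        """Return a greeting for name."""
        return "Hello, " + name
```

Calling render_text(build_doc_tree(Greeter)) and render_html(build_doc_tree(Greeter)) produces two views of one model, which is the property that would let a cgi viewer or an XML-RPC viewer be added without touching the extraction code.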
I agree that altering a number of other modules would be a little tricky to manage for version 2.6, but I would think that the makeover of pydoc would benefit from being made early (back-compatibility being ensured by the deprecation decorator mentioned above, for example).
"DocXMLRPCServer" looks (at first sight), like a viewer that would be bundled with pydoc as a sub-module (i.e., module in a package).
Yes, that was my thought too. :-)
But moving it may need to wait until python-3000. (?)
Another way is to move it, while keeping a dummy module that imports the module from its new location (and issues deprecation warnings). What I fear is that the "let's wait for python-3000" corresponds to postponing changes to "some other time". [cut]
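The "dummy module" approach can be simulated in a single file as below. In practice the shim would more likely be a short file at the old location that issues a warning and re-exports the names from the new one; this sketch instead registers a forwarding module object in sys.modules so that it can be run as-is. The module names old_docserver and docviewers_xmlrpc and the helper install_shim are hypothetical.

```python
import sys
import types
import warnings


def install_shim(old_name, new_module):
    """Make `import old_name` yield a module that forwards to new_module."""

    class _Shim(types.ModuleType):
        def __getattr__(self, attr):
            # Called only for names not found on the shim itself.
            if attr.startswith("__"):
                raise AttributeError(attr)  # leave import machinery lookups alone
            warnings.warn(
                "%s has moved to %s" % (old_name, new_module.__name__),
                DeprecationWarning,
                stacklevel=2,
            )
            return getattr(new_module, attr)

    shim = _Shim(old_name)
    sys.modules[old_name] = shim
    return shim


# Stand-in for the module at its new location.
new_home = types.ModuleType("docviewers_xmlrpc")
new_home.serve = lambda: "serving"

install_shim("old_docserver", new_home)

import old_docserver  # resolved from sys.modules, so no file is needed
```

After this, old_docserver.serve() still works but emits a DeprecationWarning naming the new location, so dependents can migrate at their own pace.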
I do not think it will work as a zip file shuttled around (in my experience). A versioning system would be extremely helpful (SVN or CVS would come to my mind). Well, if you are ok with having the source tree hosted in SVN/CVS/alike, I am in (opening an account on sourceforge or savannah -for example- would be the next step then, as it can take a few days for a project to be approved)
I don't have any experience with SVN, but it could be an opportunity to learn something new. I have a couple of other projects that could benefit by moving them to an SVN- or CVS-like system.
I have seen A.M. Kuchling's suggestion about python's sandbox, but I do not know if it would work in the short term. Anyway, and at least for the long(er) term, it will make sense to favor SVN over CVS (since this is what the python project is using). That rules out savannah and leaves us with sourceforge.
What I intended was to get enough feedback that I could get it to a point where it's 90% done and then upload it as a patch where it could be further polished up. To do that, I first need to verify my design goals are the correct approach. (see my reply to Ka-Ping.) If they are, then it's more a matter of cleaning up what I've already started. If not, then it's a bit (or bunch) more work and would benefit from having others work on it in a more formal way.
No matter the number of developers, it would give people interested in following the developments a convenient way to do it.
If someone who has more experience with group projects would like to manage it, that would be good too. That may speed things up considerably.
I have some experience in it (in companies, and in an open source project). I can always file an application for a sourceforge project, and can help you with managing it until you feel like taking it on your own (or it is merged with the python trunk). I am willing to contribute to the implementation (I suspect that things like unit tests will be needed).
I will have a look then.
Great. :-)
I should have the time during the week-end. I will get back to you off-list. Cheers, L.
Cheers, Ron
Laurent Gautier wrote:
2007/1/6, Ron Adam <rrr@ronadam.com>:
Laurent Gautier wrote:
[...]
I read your comment about having not too many things changed for 2.6. (or that will be bumped to 3000).
A suggestion I would have would be to create an html/htmlrender module in the pydoc-package-to-be and start putting all the HTML formatting functions over there (as far as I can see, they are completely independent of pydoc).
There really aren't as many HTML-specific formatting functions as you might think since I used a very consistent format of css tagged definition lists. In the case of cgitb, it's probably better to copy those functions to it and just get rid of the dependency. If someone does put together a library of useful html functions and classes then pydoc, cgitb and other programs can use it. BTW, all the html specific functions and formatting have already been collected in the gethtml.py file. I've also added more comments to both gettext.py and gethtml.py last night, so it should be easier to see how it works. First look through gettext.py to get a general idea of how the info is collected and formatted in text, then look at gethtml.py. http://ronadam.com/dl/_pydoc.zip
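To illustrate why a consistent "css tagged definition list" format keeps the number of HTML helpers small, here is a guess at what one such helper could look like. The contents of gethtml.py are not shown in this thread, so the function name definition_list and the class name used in the example are assumptions, not a reproduction of that file.

```python
import html


def definition_list(entries, css_class):
    """Render (term, description) pairs as a <dl> tagged with a CSS class."""
    parts = ['<dl class="%s">' % html.escape(css_class, quote=True)]
    for term, description in entries:
        parts.append("<dt>%s</dt>" % html.escape(term))
        parts.append("<dd>%s</dd>" % html.escape(description))
    parts.append("</dl>")
    return "\n".join(parts)
```

With one helper like this, functions, classes and data items differ only in the class name passed in (for example "functions"), and the visual differences live in a stylesheet rather than in Python code.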
You can then create wrappers to the original functions including a deprecation warning. You can refer to Michael Chermside's recipe for an example of implementation with a deprecation decorator - http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/391367 The suggestion above would actually apply to *anything* that is changed in pydoc, providing the benefit of allowing the necessary changes while having a temporary API to provide back-compatibility.
Yes, that would be good and may be needed, but it's still a ways off. Let's get it to work first, then if it gets approved for inclusion, how to move it into the python distribution (with any changes needed for that) can be worked out. First things first, if you know what I mean.
[...]
I do not think it will work as a zip file shuttled around (in my experience). A versioning system would be extremely helpful (SVN or CVS would come to my mind). Well, if you are ok with having the source tree hosted in SVN/CVS/alike, I am in (opening an account on sourceforge or savannah -for example- would be the next step then, as it can take a few days for a project to be approved)
I don't have any experience with SVN, but it could be an opportunity to learn something new. I have a couple of other projects that could benefit by moving them to an SVN- or CVS-like system.
I have seen A.M. Kuchling's suggestion about python's sandbox, but I do not know if it would work in the short term. Anyway, and at least for the long(er) term, it will make sense to favor SVN over CVS (since this is what the python project is using). That rules out savannah and leaves us with sourceforge.
I'd like to know more about using the sandbox, I know it would be easy for people to read the source there, but who all can have write access to it without having write access to other python areas? I would not mind giving that a try if someone who already knows how could point me to the correct how-to documentation with some advice on what not to do. I'm actually more concerned about the what not to do stuff. I really would not like to clobber someone else's work or create problems because of my inexperience with CVS.
[...]
If someone who has more experience with group projects would like to manage it, that would be good too. That may speed things up considerably.
I have some experience in it (in companies, and in an open source project). I can always file an application for a sourceforge project, and can help you with managing it until you feel like taking it on your own (or it is merged with the python trunk)
I don't see this as taking a long time if we keep it to cleaning up with some API and user interface improvements. I know there are some here who want a smart parsing engine, which probably would take a long term commitment to maintain and fix bugs, etc. But let's look at the actual uses that pydoc serves first.
Uses for pydoc in order of importance and frequency of use:
1. Console (builtin) help, i.e. the help() function.
2. HTML browsing and quick reference.
3. Document generation in text.
4. Document generation in html.
5. Document generation in other formats. (not currently possible)
I'm concentrating on 1 and 2. Use cases 3 and 4 are just an easy to do byproduct of doing 1 and 2. I think the cleaning up may make doing 5 possible.
Let's turn the question around. How well would other document generators supply pydoc with the equivalent text of the help() function, interactive help session output, and the equivalent html needed for dynamic html browsing? Also keep in mind the help function is always by default imported into python, so keeping that small and relatively simple with a minimum of external dependencies is good.
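As a point of reference for use cases 1 to 4, the pydoc module that ships with current Python 3 can already produce both the help()-style text and an HTML fragment for the same object. The snippet below assumes that modern API (pydoc.render_doc, pydoc.plaintext and pydoc.html), which may differ from what was available when this thread was written; the sample function is made up.

```python
import pydoc


def sample(x):
    """Return x unchanged."""
    return x


# Use cases 1 and 3: the text that help(sample) would show, as a string.
# pydoc.plaintext avoids the overstrike "bold" used for terminal pagers.
text_page = pydoc.render_doc(sample, renderer=pydoc.plaintext)

# Use cases 2 and 4: an HTML fragment for the same object.
html_fragment = pydoc.html.document(sample, "sample")
```

Both calls rely only on the standard library, which fits the concern above about keeping help() small and free of external dependencies.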
I am willing to contribute to the implementation (I suspect that things like unit tests will be needed).
Yes, there will need to be some unit tests. It may even help for those to be written now. That would help us identify things that still need to be done. [...]
I should have the time during the week-end. I will get back to you off-list.
Cool. :-) Cheers, Ron
2007/1/7, Ron Adam <rrr@ronadam.com>:
Laurent Gautier wrote:
2007/1/6, Ron Adam <rrr@ronadam.com>: [...]
I'd like to know more about using the sandbox, I know it would be easy for people to read the source there, but who all can have write access to it without having write access to other python areas? I would not mind giving that a try if someone who already knows how could point me to the correct how-to documentation with some advice on what not to do.
Limiting where different people can commit code changes is possible... it's just that I am not certain whether sourceforge allows it or not. I asked A.M. Kuchling about that.
I'm actually more concerned about the what not to do stuff. I really would not like to clobber someone else's work or create problems because of my inexperience with CVS.
I see that you are under Microsoft Windows, so you may want to check TortoiseSVN. (The python project is stored on an SVN server, so it would make sense to favor this one over CVS - in the case the project administrators have directory-level control -). Regarding the possibility of jeopardizing something in the repository, the directory-level sandbox should only allow you to trash your own work ;-) (but even then, you should always be able to recover from mistakes).
[...]
If someone who has more experience with group projects would like to manage it, that would be good too. That may speed things up considerably.
I have some experience in it (in companies, and in an open source project). I can always file an application for a sourceforge project, and can help you with managing it until you feel like taking it on your own (or it is merged with the python trunk)
I don't see this as taking a long time if we keep it to cleaning up with some API and user interface improvements.
That's all in the meaning of "some" I guess... ;-)
I know there are some here who want a smart parsing engine, which probably would take a long term commitment to maintain and fix bugs, etc. But let's look at the actual uses that pydoc serves first.
Uses for pydoc in order of importance and frequency of use:
1. Console (builtin) help, i.e. the help() function.
2. HTML browsing and quick reference.
3. Document generation in text.
4. Document generation in html.
5. Document generation in other formats. (not currently possible)
I'm concentrating on 1 and 2. Use cases 3 and 4 are just an easy to do byproduct of doing 1 and 2. I think the cleaning up may make doing 5 possible.
I am fully on that line, with the remark that thinking about point 5 early is what could make the cut. The exercise will be in avoiding over-complication in a design that is not used in the end. Reactions on this thread brought a lot of good ideas and pointers to existing work. It loo
Let's turn the question around. How well would other document generators supply pydoc with the equivalent text of the help() function, interactive help session output, and the equivalent html needed for dynamic html browsing?
Also keep in mind the help function is always by default imported into python, so keeping that small and relatively simple with a minimum of external dependencies is good.
I am willing to contribute to the implementation (I suspect that things like unit tests will be needed).
Yes, there will need to be some unit tests. It may even help for those to be written now. That would help us identify things that still need to be done.
Regarding tests, it is never too early to think about them (that lets one write code that is actually "test-able").
[...]
Laurent Gautier wrote:
2007/1/7, Ron Adam <rrr@ronadam.com>:
Laurent Gautier wrote:
2007/1/6, Ron Adam <rrr@ronadam.com>: [...]
I'd like to know more about using the sandbox, I know it would be easy for people to read the source there, but who all can have write access to it without having write access to other python areas? I would not mind giving that a try if someone who already knows how could point me to the correct how-to documentation with some advice on what not to do.
Limiting where different people can commit code changes is possible... it's just that I am not certain whether sourceforge allows it or not. I asked A.M. Kuchling about that.
I'm not concerned about limiting changes to this project. I want others to work on it. Can write access be *easily* granted to just one cvs sandbox directory, for this project, without granting access to other directories in the sandbox or trunk?
I'm actually more concerned about the what not to do stuff. I really would not like to clobber someone else's work or create problems because of my inexperience with CVS.
I see that you are under Microsoft Windows, so you may want to check TortoiseSVN. (The python project is stored on an SVN server, so it would make sense to favor this one over CVS - in the case the project administrators have directory-level control -).
Thanks, I'll try TortoiseSVN out. :-)
Regarding the possibility of jeopardizing something in the repository, the directory-level sandbox should only allow you to trash your own work ;-) (but even then, you should always be able to recover from mistakes).
That would be fine then. But I'll let you decide since you are offering to manage getting it set up.
[...]
If someone who has more experience with group projects would like to manage it, that would be good too. That may speed things up considerably.
I have some experience in it (in companies, and in an open source project). I can always file an application for a sourceforge project, and can help you with managing it until you feel like taking it on your own (or it is merged with the python trunk)
I don't see this as taking a long time if we keep it to cleaning up with some API and user interface improvements.
That's all in the meaning of "some" I guess... ;-)
Yep.
I know there are some here who want a smart parsing engine, which probably would take a long term commitment to maintain and fix bugs, etc. But let's look at the actual uses that pydoc serves first.
Uses for pydoc in order of importance and frequency of use:
1. Console (builtin) help, i.e. the help() function.
2. HTML browsing and quick reference.
3. Document generation in text.
4. Document generation in html.
5. Document generation in other formats. (not currently possible)
I'm concentrating on 1 and 2. Use cases 3 and 4 are just an easy to do byproduct of doing 1 and 2. I think the cleaning up may make doing 5 possible.
I am fully on that line, with the remark that thinking about point 5 early is what could make the cut. The exercise will be in avoiding over-complication in a design that is not used in the end.
Reactions on this thread brought a lot of good ideas and pointers to existing work. It loo
Yes, it does help to have additional viewpoints and references.
Let's turn the question around. How well would other document generators supply pydoc with the equivalent text of the help() function, interactive help session output, and the equivalent html needed for dynamic html browsing?
Also keep in mind the help function is always by default imported into python, so keeping that small and relatively simple with a minimum of external dependencies is good.
I am willing to contribute to the implementation (I suspect that things like unit tests will be needed).
Yes, there will need to be some unit tests. It may even help for those to be written now. That would help us identify things that still need to be done.
Regarding tests, it is never too early to think about them (that lets one write code that is actually "test-able").
Right, and thanks for taking more than a casual interest in this. :-) Cheers, Ron
Ron Adam wrote:
Laurent Gautier wrote:
2007/1/7, Ron Adam <rrr@ronadam.com>:
Laurent Gautier wrote:
2007/1/6, Ron Adam <rrr@ronadam.com>: [...]
I'd like to know more about using the sandbox, I know it would be easy for people to read the source there, but who all can have write access to it without having write access to other python areas? I would not mind giving that a try if someone who already knows how could point me to the correct how-to documentation with some advice on what not to do.
Limiting where different people can commit code changes is possible... it's just that I am not certain whether sourceforge allows it or not. I asked A.M. Kuchling about that.
I'm not concerned about limiting changes to this project. I want others to work on it. Can write access be *easily* granted to just one cvs sandbox directory, for this project, without granting access to other directories in the sandbox or trunk?
An alternative is just to trust people only to scribble in the sandbox. I really don't think we have a big security issue here as long as the individuals are "known to the community" (or should that be "known to the PSU"?). After all, it *is* a configuration control system, which by definition can revert to previous content in the event of some unwanted change.
I'm actually more concerned about the what not to do stuff. I really would not like to clobber someone else's work or create problems because of my inexperience with CVS.
I see that you are under Microsoft Windows, so you may want to check TortoiseSVN. (The python project is stored on an SVN server, so it would make sense to favor this one over CVS - in the case the project administrators have directory-level control -).
Thanks, I'll try TortoiseSVN out. :-)
I can heartily recommend it.
Regarding the possibility of jeopardizing something in the repository, the directory-level sandbox should only allow you to trash your own work ;-) (but even then, you should always be able to recover from mistakes).
That would be fine then. But I'll let you decide since you are offering to manage getting it set up.
[ ... ] Let's not spend too much time on paranoid administration, since we are supposed to be an open source community :) regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://del.icio.us/steve.holden Blog of Note: http://holdenweb.blogspot.com
participants (9)
- A.M. Kuchling
- Barry Warsaw
- Fernando Perez
- Ka-Ping Yee
- Laurent Gautier
- Phillip J. Eby
- Robert Kern
- Ron Adam
- Steve Holden