
Hi,

over the last few weeks I've hacked on a new approach to Python's documentation. As Python already has an excellent documentation framework, the docutils, with a readable yet extendable markup format, reST, I thought that it should be possible to use those instead of the current LaTeX->latex2html toolchain.

For the impatient: the result can be seen at <http://pydoc.gbrandl.de>.

I've written a converter tool that handles most of the LaTeX markup and turns it into reST, as well as a builder tool that adds many custom directives and roles, and also features like index generation and cross-document linking.

(What you can see at the URL is a completely static version of the docs, as it would be shipped with the distribution. For the real online docs, I have more plans; I'll come to that later.)

So why the effort? Here's a partial list of things that have already been improved:

- the source is much more readable (for examples, try the "view source" links in the left navbar)
- all function/class/module names are properly cross-referenced
- the HTML pages are generated from templates, using a language similar to Django's template language
- Python and C code snippets are syntax highlighted
- for the offline version, there's a JavaScript enabled search function
- the index is generated over all the documentation, making it easier to find stuff you need
- the toolchain is pure Python, therefore can easily be shipped

What more? If there is support for this approach, I have plans for things that can be added to the online version:

- a "quick-dispatch" function: e.g., docs.python.org/q?os.path.split would redirect you to the matching location.
- "interactive patching": provide a "propose edit" link, leading to a Wiki-like page where you can edit the source. From the result, a diff is generated, which can be accepted, edited or rejected by the development team. This is even more straightforward than plain old comments.
- the same infrastructure could be used for developers, with automatic checkin into subversion.
- of course, plain old comments can be useful too.

Concluding, a small caveat: The conversion/build tools are, of course, not complete yet. There are a number of XXX comments in the text; most of them indicate that the converter could not handle a situation -- that would have to be corrected once after conversion is done.

Waiting for comments!

Cheers,
Georg

--
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.
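The proposed "quick-dispatch" lookup could be sketched roughly as follows. This is only an illustration, not Georg's implementation: the `INDEX` table, the page paths, the anchors and `BASE_URL` are all hypothetical, and a real version would presumably read the index produced by the builder.

```python
from urllib.parse import quote

# Hypothetical index: dotted name -> (page, anchor). A real build would
# generate this from the documentation index; these entries are made up.
INDEX = {
    "os.path": ("modules/os.path.html", "module-os.path"),
    "os.path.split": ("modules/os.path.html", "os.path.split"),
    "collections.deque": ("modules/collections.html", "collections.deque"),
}

BASE_URL = "http://docs.python.org/"  # assumed base URL


def quick_dispatch(query, index=INDEX, base_url=BASE_URL):
    """Return the redirect URL for a dotted name, or None if unknown.

    Falls back to progressively shorter prefixes, so an unknown attribute
    of a known module still lands on the module page.
    """
    parts = query.strip().split(".")
    while parts:
        name = ".".join(parts)
        if name in index:
            page, anchor = index[name]
            return f"{base_url}{page}#{quote(anchor)}"
        parts.pop()  # drop the last component and retry
    return None
```

A web front end would call `quick_dispatch` with the query string of a request such as `q?os.path.split` and answer with an HTTP redirect, or with a search page when the result is `None`.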

On 5/19/07, Georg Brandl <g.brandl@gmx.net> wrote:
Very cool! I'd love to see the docs move to ReST.
Yes, these would all be outstanding features. STeVe -- I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy

Georg Brandl <g.brandl@gmx.net> wrote:
I'm generally a curmudgeon when it comes to 'the docs could be done better'. But this? I like it. A lot. Especially if you can get these other features in:
I'm a bit iffy on yet another tool, but if roundup integration could be done, I think it would be great. - Josiah

On Sat, May 19, 2007 at 10:48:29AM -0700, Josiah Carlson wrote:
Seconded! -- even if it's just for modules, this would be great. I can't count the times I've wished I could type e.g., 'docs.python.org/httplib' the way I can type 'php.net/array_search' to try to find out whether the needle comes before or after the haystack. Dustin

On 5/19/07, Georg Brandl <g.brandl@gmx.net> wrote:
[snip]
Waiting for comments!
Awesome, Georg! Wow. Nice work. Seems like this has been a long time comin', and I bet others have been working away "in secret" on similar projects. I hope you keep running with it until it gets hijacked into being the "official" versions. :) I'm bookmarking it as "python docs" in my browser. BTW, would like to see a little blurb of your own on that page about how the docs were converted, rendered, and their new source format. Thanks much, ---John P.S. -- funny sig, btw. :)

Georg Brandl wrote:
Awesome, that looks pretty amazing! I'd reeeally like to have a look at the source code (don't worry if it's not clean!). Can you publish or post it somewhere? If you'd like to store it in the Docutils sandboxes, just drop me a line and I'll give you SVN access. By the way, things get a lot easier for me if you place it in the public domain, because that's the license Docutils uses, and it's obviously compatible to every other license. I actually have a Google Summer of Code project, "Documenting Python Packages with Docutils", which I'll start working on May 28: <http://code.google.com/soc/psf/appinfo.html?csaid=8D04C53750906F50>. It has a somewhat different scope, so our projects will complement each other nicely I believe. (To the point where we may end up with a complete tool-chain to actually migrate the Python documentation to reST. Très cool.) Your effort and mine only seem to have some limited overlap. I see that you added at least some markup to reST that allows documents to be marked up in a similar fashion as the current Python-specific LaTeX markup, which is on my list, too. If you see more overlap, please let me know, because I may need to adjust my time-line or project-scope (which is totally fine with me, by the way, so don't worry about "getting into the way of my project" or so!). May I suggest we continue the discussion on Doc-SIG only and prune Python-dev from the CC? I'm on Jabber/GoogleTalk (LeWiemann@gmail.com), by the way, so feel free to IM me. Best wishes, Lea [Rest of the quoted message below.]

Georg Brandl wrote:
Wow, very nice. I like it. +1

I've been working on improving pydoc (slowly but steadily). Maybe we can join efforts, as I think the two projects overlap in a number of areas, and it sounds like we are thinking of some of the same things as far as the tool chain goes. So maybe there's some synergy we can take advantage of.

Some of the things I've recently considered adding to pydoc:

- To output individual sections for use in a template engine.
- A reST formatter.
- Use docutils to format reST doc strings to html in the html formatter. (As an option, not the default.)

It looks like there may be a few areas where we can share code:

- The html syntax highlighters. (Pydoc can use those)
- A shared html style sheet.
- Document locater. [1]
- An HTMLServer for local (dynamic dispatching) html viewing. [2]
- The reST source for viewing topics and keywords in pydoc. (Instead of scraping html pages. Ick)

[1] Pydoc has a locater function which finds the html docs and presents a link to the html page for an individual item. But it would be more reliable if the dispatcher were on the document end. Then pydoc would have a single place to link to. (Either locally or on line.)

[2] The server in pydoc will probably work as is. You just need to supply a callback to get the pages. It's a separate module now.

Cheers,
Ron

On Sat, May 19, 2007 at 03:31:59PM -0500, Ron Adam wrote:
- The html syntax highlighters. (Pydoc can use those)
I have a patch on the docutils patch tracker that does this. The code is probably of rather bad quality, but it outputs LaTeX and HTML. If we can work together to improve this patch and get it into docutils, it will avoid having different syntaxes and behavior depending on the front-end to docutils being used (I am thinking of rest2web, trac, and I am probably forgetting some others). The patch has been sitting there for almost 6 months without review, but I hope that if people other than me work on it and ask for review, it will improve, get reviewed, and eventually get in! Sorry for the shameless plug, but I really do think we need a unifying approach to this. Gaël

On 5/19/07, Georg Brandl <g.brandl@gmx.net> wrote:
I really want this!

From a doc writer's perspective I find this reST approach much easier to grapple with than the LaTeX one since I find reST markup nicer for simple things like lists and bolding. From a committer's POV I like this as it should hopefully get more people to help with changes and make it easy for me to build the docs locally to make sure the markup is correct. And from a lazy coder's POV I love it as Georg has already done all the work (and in Python so if I really have to change something I have a better chance of figuring out how). -Brett

On Sat, 19 May 2007, Georg Brandl wrote:
For the impatient: the result can be seen at <http://pydoc.gbrandl.de>.
This is extremely impressive. Thank you for this work! If all the documentation is generated from a base format that is closer to text (reST instead of LaTeX), that will make it easier for volunteers to read diffs, make edits, and contribute patches. I agree that interactivity (online commenting and editing) will be a huge advantage. I could imagine this heading in a Wiki-like direction -- where a particular version is tagged as the official revision shipped with a particular Python release, but everyone can make edits online to yield new versions, eventually yielding the revision that will be released with the next Python release. -- ?!ng

Georg Brandl wrote:
Very impressive. I should say that although in the past I have argued strongly against the use of reST as a markup language for source-code comments (because the reST language only indicates presentation, not semantics), I am 100% supportive of the use of reST in reference documents such as these, especially considering that LaTeX is also a presentational markup (at least, that's the way it tends to be used.) I know that for myself, LaTeX has been a barrier to contributing to the Python documentation, and reST would be much less of a barrier. In fact, I have considered in the past asking whether or not the Python documentation could be migrated to a format with wider fluency, but I never actually posted on this issue because I was afraid that the answer would be that it's too hard / too late to do anything about it. I am glad to have been proven wrong. -- Talin

Hi Georg Super impressive work! :-) I haven't looked at it in depth yet, but I have a question. One concern from a long thread on Doc-Sig a long time ago, is that ReST did not at the time possess the ability to nicely markup the objects as LaTeX macros do. Is your transformation losing markup information from the original docs? e.g. are you still marking classes as classes and functions as functions in the ReST source, or is it converting from qualified markup to "style" markup (e.g., to generic literals instead of class/function/variable/keyword argument docutils roles, etc.). If you solved that problem, how did you solve it? Is the resulting ReST pretty? Do you think we can build a better index? My beef with using ReST for documentation, as much as I like ReST, is that unless we have roles and structure for declaring functions, classes, etc. it would remain inferior to the LaTeX macros, which in spite of being LaTeX, qualify the kinds of objects to some extent. Wow, it looks amazingly good. Amazing work. Very impressed. (Somewhat related, but another idea from back then, which was never implemented IMO, was to find a way to automatically pull and convert the docstrings from the source code into the documentation, thus unifying all the information in one place.) On 5/19/07, Georg Brandl <g.brandl@gmx.net> wrote:

On 5/19/07, Martin Blais <blais@furius.ca> wrote:
Looking at http://pydoc.gbrandl.de/modules/collections.txt, I can see it has markup like::

    .. class:: deque([iterable])

       Returns a new deque object initialized left-to-right (using :meth:`append()`) with data from `iterable`. If `iterable` is not specified, the new deque is empty.

    .. method:: deque.append(x)

       Add `x` to the right side of the deque.

So he's clearly got some of the info in there with things like ``.. class::`` and ``:meth:``.

STeVe

Martin Blais wrote:
e.g. are you still marking classes as classes and functions as functions in the ReST source
It seems so (modulo XXX's and TODO's in Georg's implementation, probably ^_^) -- all of the pages have "show source" links, so you can see for yourself. I'm not an expert with the documentation system, but the markup on <http://pydoc.gbrandl.de/modules/codecs.txt> looks pretty complete to me.
While it's probably not possible to simply generate the documentation from the docstrings, it would certainly seem interesting to have some means (like a directive) to pull docstrings into the documentation. I think however that while migrating the docs to reStructuredText is comparatively straightforward [1]_, pulling documentation from the docstrings will require quite a bit of design and discussion work. So I'd suggest we postpone this idea until we have a working documentation system in reStructuredText, so we don't clutter the discussion.

.. [1] I'm sure there will still be quite a few issues to sort out that I'm simply not seeing right now.

Best wishes, Lea

Georg Brandl <g.brandl <at> gmx.net> writes:
For the impatient: the result can be seen at <http://pydoc.gbrandl.de>.
- the toolchain is pure Python, therefore can easily be shipped
Very nice! As well as looking very attractive and professional, the all-Python toolset should make it easier to build the documentation - I've not been able to get a trouble-free setup of the docs toolchain on Windows. Thanks for this, Vinay Sajip

Vinay Sajip wrote:
BTW, I have to give lots of credit for the looks to Armin Ronacher. I'm not so much of a designer ;) Georg

[warning: bulk answer ahead] First, thanks for all those nice comments! [John Gabriele]
BTW, would like to see a little blurb of your own on that page about how the docs were converted, rendered, and their new source format.
Sure. I've already written part of the new "Documenting Python" docs, which cover this a bit. The "About the documentation" will be rewritten too. [Lea Wiemann]
The toolset is now in the Docutils sandbox at <http://svn.berlios.de/svnroot/repos/docutils/trunk/sandbox/py-rest-doc>.
Great! Making the new toolset usable for third-party developers is certainly a good option. I saw quite a few using the LaTeX-based tools too.. [Ron Adam]
Certainly there's plenty of overlap.
It looks like there may be a few areas where we can share code.
- The html syntax highlighters. (Pydoc can use those)
The highlighting is actually done with Pygments, which cannot be included in the stdlib as-is. Perhaps a stripped-down version?
Yes, that makes sense. If you want to coordinate efforts, feel free to contact me at Jabber <gbrandl@pocoo.org>. [Ka-Ping Yee]
I agree that interactivity (online commenting and editing) will be a huge advantage.
Yes. I think that always only the latest version should be "publicly" interactive. Old archived doc versions should be static only. [Martin Blais]
As Steven said, it's solved quite nicely with interpreted text roles. Whether ":class:`Foo`" is nicer than "\class{Foo}" is not entirely clear ;) but you actually get more now, since if a class "Foo" is found in the namespace, it will be cross-linked automatically. About the index: I didn't do anything about it. I just transferred the LaTeX commands into reST directives, the rest is generated completely analogous.
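The automatic cross-linking of roles described here can be illustrated with a small stand-alone sketch. It does not use the docutils role machinery that the real builder relies on; it is a plain regular-expression approximation, and the `KNOWN` table and its URL are hypothetical.

```python
import re
from html import escape

# Hypothetical namespace of documented objects: name -> target URL.
KNOWN = {"Foo": "modules/foo.html#Foo"}

# Matches interpreted text with an explicit role, e.g. :class:`Foo`.
ROLE_RE = re.compile(r":(?P<role>\w+):`(?P<target>[^`]+)`")


def render_roles(text, known=KNOWN):
    """Replace :role:`target` with HTML, linking targets that are known."""
    def repl(match):
        role, target = match.group("role"), match.group("target")
        code = f'<code class="{role}">{escape(target)}</code>'
        url = known.get(target)
        # Known names become links; unknown ones keep only the styling.
        return f'<a href="{url}">{code}</a>' if url else code
    return ROLE_RE.sub(repl, text)
```

With this table, `:class:`Foo`` becomes a link while `:class:`Bar`` is only styled, which is the "you actually get more now" effect: the role records what kind of object is meant, and the link appears whenever the name can be resolved.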
Yep. As it is now, you need three packages from the Cheese Shop: Docutils, Pygments (the highlighter) and Jinja (the templating engine). This shouldn't be problematic, though they could also be stripped down and included. Cheers, Georg

On 5/20/07, Georg Brandl <g.brandl@gmx.net> wrote:
This is great. IMHO if this is to compete to become the official Python docs, I would argue for even fewer dependencies, even at the cost of more generic/bland output, for portability reasons and to stimulate greater adoption. If we can make some of those dependencies optional and only rely on docutils, that could make it ubiquitous. Another thing to keep in mind: I don't know if the directives you defined are very generic, but if they are, it would be interesting to consider migrating them up into docutils (if it makes sense), and see if they could support documenting other programming languages. Could this be a language-independent documenting toolkit? Could we document LISP or Ruby code with it? Georg, thanks again!

Could this be a language-independent documenting toolkit? Could we document LISP or Ruby code with it?
Might want to look at "noweb", http://www.eecs.harvard.edu/~nr/noweb/: ``...noweb works ``out of the box'' with any programming language, and supports TeX, latex, HTML, and troff back ends.'' Bill

[Georg Brandl]
The highlighting is actually done with Pygments, which cannot be included in the stdlib as-is. Perhaps a stripped-down version?
No need to; we can just fall back to no syntax highlighting if Pygments is not installed on the user's system. [Gael Varoquaux]
- The html syntax highlighters. (Pydoc can use those)
I have a patch on the docutils patch tracker that does this.
For everyone's reference, <http://sf.net/tracker/index.php?func=detail&aid=1595345&group_id=38414&atid=422032>. Best wishes, Lea
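The fallback suggested above (no highlighting when Pygments is missing) might look like the sketch below. The function names `plain_html` and `render_code` are made up for illustration; the Pygments calls (`highlight`, `PythonLexer`, `HtmlFormatter`) are the library's regular entry points, but how the actual builder invokes them is not shown in this thread.

```python
from html import escape

try:
    from pygments import highlight
    from pygments.formatters import HtmlFormatter
    from pygments.lexers import PythonLexer
    HAVE_PYGMENTS = True
except ImportError:
    # Pygments is optional: without it we only lose the colours.
    HAVE_PYGMENTS = False


def plain_html(source):
    """Fallback: escaped source in a <pre> block, no highlighting."""
    return "<pre>" + escape(source) + "</pre>"


def render_code(source):
    """Highlight Python source if Pygments is importable, else fall back."""
    if HAVE_PYGMENTS:
        return highlight(source, PythonLexer(), HtmlFormatter())
    return plain_html(source)
```

Callers only ever use `render_code`, so the dependency stays optional and the build itself never fails because the highlighter is absent.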

Sounds very interesting. I just have one concern/question. I hope that while moving away from latex, we are not precluding the ability to write math as part of the documentation. What would be my choices for adding math to the documentation? Hopefully using latex, since there really isn't AFAIK any other competitor for this.

Neal Becker wrote:
Where in the current documentation is there any math notation /at all/? In all my reading of it, I have not run across anything that appeared like it was being used. Besides that question, is the full power of LaTeX math notation really necessary here? I somehow doubt anything more than simple expressions of runtime performance and container behaviors are appropriate for any documentation we have. -Scott -- Scott Dial scott@scottdial.com scodial@cs.indiana.edu

Scott Dial wrote:
There is exactly one instance of LaTeX math in the whole docs, it's in the description of audioop, AFAIR, and contains a sum over square roots... So, that's not really a concern of mine ;) Georg

Neal Becker wrote:
I don't think so. The issue with numpy is getting our act together and making parseable docstrings for auto-generated API documentation using existing tools or slightly modified versions thereof. No one is actually contemplating building a new tool. AFAICT, Georg's (excellent) work doesn't address that use. I don't think there is anything to coordinate, here. Provided that Georg's system doesn't place too many restrictions on the reST it handles, we could use the available reST math options if we wanted to use Georg's system. I'd much rather see Georg spend his time working on the docs for the Python language and the feature set it requires. If the numpy project needs to extend that feature set, we'll provide the manpower ourselves. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

Robert Kern wrote:
Indeed, I don't intend to do anything about docstrings. IMO, docs automatically generated from docstrings can work, but only if there's a single consistent style applied, and the whole thing is written in a markup language, of course, not text only. This is not the case for the Python standard library, so converting it is not an option; in any case, putting all information that is available in the docs into the docstrings would make many modules much less readable.
Of course, for numpy math is much more important than for the core. I'm sure the docutils developers will be supportive in case someone volunteers to create/improve reST math capabilities. cheers, Georg

>>> What would be my choices for add math to the documentation?

>> Where in the current documentation is there any math notation /at
>> all/?

Georg> There is exactly one instance of LaTeX math in the whole docs,
Georg> it's in the description of audioop, AFAIR, an contains a sum over
Georg> square roots...

Georg> So, that's not really a concern of mine ;)

You must realize that people will use the core tools to create documentation for third party packages which aren't in the core. If you replace LaTeX with something else I think you need to keep math in mind whether it's used in the core documentation or not.

Skip

I disagree. The documentation infrastructure of Python should only consider the needs of Python itself. If other people can use that infrastructure for other purposes, fine - if they find that it does not meet their needs, they have to look elsewhere. We are developing a programming language here, not a typesetting system. Regards, Martin

On 5/20/07, "Martin v. Löwis" <martin@v.loewis.de> wrote:
Martin beat me to my comment. =) Python's needs should come first, period. If Georg wants to add math support, fine. But honestly I would rather he spend his time on Python-specific stuff than get bogged down to support possible third parties. -Brett

I agree that the Python core should be considered first, but I also see it as only beneficial to leave the door open to the needs of other packages. I worked (briefly but intensely) on a revamp of pydoc earlier this year, and while collecting requirements from a number of places, support for maths expressions and the like appeared important in a number of cases (and was a very reasonable request in one case).

That particular point leads to something that I see as important for what a new/better documentation system should provide: a good and modular interface to access the documentation, process it, and navigate it. The particular example discussed here could be implemented by allowing "pluggable" processing components for docstrings (letting a package developer use as much of the default documentation machinery as possible and implement the processing of MathML, LaTeX, or whatever else, as wanted). One can consider the possibility of having the "custom" processing of the docstring embedded in the package itself.

Laurent

2007/5/21, Brett Cannon <brett@python.org>:

Brett> Martin beat me to my comment. =) Python's needs should come
Brett> first, period. If Georg wants to add math support, fine. But
Brett> honestly I would rather he spend his time on Python-specific
Brett> stuff then get bogged down to support possible third parties.

I think the people who have responded to my comment read too much into it. Nowhere do I think I asked Georg to write an equation typesetter to include in the Python documentation toolchain. I asked that math capability be considered. I have no idea what tools he used to build his new documentation set. I only briefly glanced at a couple of the output pages. I think what he has done is marvelous. However, I don't think the door should be shut on equation display. Is there a route to it based on the tools Georg is using? If not, then I think some accommodation should be made.

I'm being vague here on purpose because I'm unfamiliar with the available tools. The one thing I do know is that LaTeX provides that today and by removing it from the toolchain you have removed a significant piece of functionality.

Skip

skip@pobox.com wrote:
And that is reasonable, of course.
In the end, it all depends on what kind of support basic reST can deliver. IMO, you still get the best math output from LaTeX, but I don't really know many other things. That is also something I want to convey: I'm very fond of LaTeX, and use it regularly for all my University work. For the Python docs, however, I can see many advantages of the docutils approach.
That's the point I see differently: for the Python core docs, it's not significant, and my efforts are primarily limited to that area. cheers, Georg

>> You must realize that people will use the core tools to create
>> documentation for third party packages which aren't in the core. If
>> you replace LaTeX with something else I think you need to keep math
>> in mind whether it's used in the core documentation or not.

Martin> I disagree. The documentation infrastructure of Python should
Martin> only consider the needs of Python itself. If other people can
Martin> use that infrastructure for other purposes, fine - if they find
Martin> that it does not meet their needs, they have to look elsewhere.

Then I submit that you are probably removing some significant piece of functionality from the provided documentation toolchain which some people probably rely on. After all, that's what LaTeX excels at. They will be able to continue to use the old tools, but where will they get them if they are no longer part of Python?

Skip

Fred L. Drake, Jr. wrote:
That is a good idea! The converter is not likely to work with other projects out of the box (it's been finetuned for the Doc/ sources), and it's not clear whether they would want to switch in any case. Many of the features that the new system would be able to provide aren't needed for them anyway, and I, as a maintainer, would be very reluctant to put extra work in that too... cheers, Georg

On Mon, May 21, 2007 at 09:23:47AM -0400, Fred L. Drake, Jr. wrote:
That seems like a straightforward task. The big stumbling block in switching away from LaTeX has always been the effort of making a good conversion; if Georg's work does 80% of the job, we should definitely take advantage of the opportunity and try to switch.

Advantages of reST:

* The required tool chain shrinks (at least if you're not making printed output, which will probably still go through LaTeX).
* Tool chain is now more easily scriptable, and it'll be easier to make use of the docs from Python code.
* We can produce XML output through the rst2xml script.

Disadvantages:

* reST markup isn't much simpler than LaTeX.

--amk

Hoi, Fred L. Drake, Jr. <fdrake <at> acm.org> writes:
For a lightweight markup language that is human readable (which rst certainly is) the syntax is surprisingly powerful. You can nest any block tag and I'm not sure how often you have to nest roles and stuff like that. The goal of the new docs is a less complex syntax and currently nothing beats reStructuredText in terms of readability and possibilities. rst is simpler than latex:

LaTeX:

    \item[\code{*?}, \code{+?}, \code{??}]
    The \character{*}, \character{+}, and \character{?} qualifiers are all \dfn{greedy}; they match as much text as possible. Sometimes this behaviour isn't desired; if the RE \regexp{<.*>} is matched against \code{'<H1>title</H1>'}, it will match the entire string, and not just \code{'<H1>'}. Adding \character{?} after the qualifier makes it perform the match in \dfn{non-greedy} or \dfn{minimal} fashion; as \emph{few} characters as possible will be matched. Using \regexp{.*?} in the previous expression will match only \code{'<H1>'}.

Here is the same in rst:

    ``*?``, ``+?``, ``??``
       The ``'\*'``, ``'+'``, and ``'?'`` qualifiers are all :dfn:`greedy`; they match as much text as possible. Sometimes this behaviour isn't desired; if the RE :regexp:`<.\*>` is matched against ``'<H1>title</H1>'``, it will match the entire string, and not just ``'<H1>'``. Adding ``'?'`` after the qualifier makes it perform the match in :dfn:`non-greedy` or :dfn:`minimal` fashion; as *few* characters as possible will be matched. Using :regexp:`.\*?` in the previous expression will match only ``'<H1>'``.

Regards,
Armin
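The correspondence between the two versions above is regular enough that a toy converter can be sketched in a few lines. This is not the converter from the py-rest-doc sandbox; it is a hypothetical illustration that covers only the five one-argument macros used in the example, ignores nested braces, and does not add the backslash escapes for `*` seen in the reST version.

```python
import re

# Hand-picked subset of macros, taken from the example above.
TEMPLATES = {
    "code": "``{}``",
    "character": "``'{}'``",
    "dfn": ":dfn:`{}`",
    "regexp": ":regexp:`{}`",
    "emph": "*{}*",
}

# A macro name followed by one brace-delimited argument without nesting.
MACRO_RE = re.compile(r"\\(?P<name>[a-zA-Z]+)\{(?P<arg>[^{}]*)\}")


def latex_to_rest(text):
    """Convert the known one-argument macros; leave unknown ones untouched."""
    def repl(match):
        template = TEMPLATES.get(match.group("name"))
        if template is None:
            return match.group(0)  # unknown macro: keep it for manual review
        return template.format(match.group("arg"))
    return MACRO_RE.sub(repl, text)
```

For example, `latex_to_rest(r"all \dfn{greedy}")` gives ``all :dfn:`greedy```. Leaving unknown macros in place is a deliberate choice in this sketch, so that anything the table does not cover stays visible in the output.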

Armin Ronacher writes:
IMO that pair of examples shows clearly that, in this application, reST is not an improvement over LaTeX in terms of readability/writability of source. It's probably not worse, although I can't help muttering "EIBTI". In particular I find the "``'...'``" construct horribly unreadable because it makes it hard to find the Python syntax in all the reST. I don't think that's an argument against switching to reST, though. Georg's site speaks for itself. Kudos!

In your examples, I think the ReST version can be cleaned up quite a bit. First by using the ``.. default-role:: literal`` directive so that you can type `foo()` instead of using double back quotes, and then you can remove the redundant semantic markup. Like this:

    `\*?`, `+?`, `??`
       The "`*`", "`+`" and "`?`" qualifiers are all *greedy*; they match as much text as possible. Sometimes this behaviour isn't desired; if the RE `<.*>` is matched against `'<H1>title</H1>'`, it will match the entire string, and not just `'<H1>'`. Adding "`?`" after the qualifier makes it perform the match in *non-greedy* or *minimal* fashion; as *few* characters as possible will be matched. Using `.*?` in the previous expression will match only `'<H1>'`.

The above is the most readable version. For example, semantic markup like :regexp:`<.\*>` doesn't serve any useful purpose. The end result is that the text is typeset in a fixed-width font, whether or not you prepend :regexp: to it.

--
Best regards,
Björn

It would appear that while we slept Jens Mortensen was busy at work on his rst2{latex,latexmath,mathml}.py scripts: http://docutils.sourceforge.net/sandbox/jensj/latex_math/ Note the date on the files. It seems to work pretty well, and as others have pointed out, LaTeX notation is probably more familiar to people interested in math display than anything else. Skip

Neal> I know almost nothing about docutils internals. How do I
Neal> 'install' the above?

Me either. Here's what I did:

* download and expand the latest docutils snapshot
* replicate Jens's work in a directory called "math" at the top level of the docutils directory
* edited setup.py to get them installed:

    diff -u setup.py.~1~ setup.py
    --- setup.py.~1~ 2007-03-21 18:45:38.000000000 -0500
    +++ setup.py 2007-05-22 07:07:25.000000000 -0500
    @@ -115,6 +115,9 @@
                     'tools/rst2xml.py',
                     'tools/rst2pseudoxml.py',
                     'tools/rstpep2html.py',
    +                'math/tools/rst2latex.py',
    +                'math/tools/rst2latexmath.py',
    +                'math/tools/rst2mathml.py',
                     ],}
         """Distutils setup parameters."""

* ran "python setup.py install"

That's probably more than necessary, but with the math subdir I can easily move the whole thing to a new snapshot and the setup.py change lets me install them transparently.

Skip

Stephen J. Turnbull wrote:
It's interesting how perceptions can differ - I find heavily marked up latex tends to blur into a huge wall of text when I try to read it because of all of the {} and \ characters everywhere. With reST, on the other hand, I find the use of the relatively 'light' backquote and colon characters to delineate the markup breaks things up sufficiently that I can easily ignore the markup in order to read what has actually been written. So in Armin's example, I found the reST version *much* easier to read. Whether that difference in perception is due simply to my relative lack of experience in using LaTeX, or to something else, I have no idea. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org

On 5/22/07, Nick Coghlan <ncoghlan@gmail.com> wrote:
- If you make a mistake in LaTeX, you will get a cryptic error which is usually a little difficult to figure out (if you're not used to it). You do get an error, though.
- If you make a mistake in ReST, you will often get neither a warning nor an error, and not the desired output.

If you were to use the amount of markup in that example, you would have to check your text with rst2xml frequently to make sure it groks what you're trying to say. (And I've been there: I wrote an entire project that relies specifically on this, on precise structures generated by docutils (http://furius.ca/nabu/). It's *very* easy to make subtle errors that generate something other than what you want.)

ReST works well only when there is little markup. Writing code documentation generally requires a lot of markup: you want to mark variables, classes, functions, parameters, constants, etc. (A better avenue IMHO would be to augment docutils with some code to automatically figure out the syntax of functions, parameters, classes, etc., i.e., less markup, and if we do this in Python we may be able to use introspection. This is a challenge, however; I don't know if it can be done at all.)

Martin Blais schrieb:
That is correct, but can be helped with nice preview features.
While writing the converter, I stumbled upon a few locations where the LaTeX markup cannot be completely converted into reST, and a few locations where invalid reST was generated and not warned about. However, both of those problems occurred far less often than I'd anticipated. Georg

FWIW, the pure Python program in Tools/scripts/texchecker.py does a pretty good job of catching typical LaTeX mistakes and giving high-quality error reporting. With that tool, I've been making doc contributions for years and not needed my own LaTeX build. Also, I did not need to learn LaTeX itself. It was sufficient to read a little of Documenting Python and then model the markup from existing docs.

In contrast, whenever I've tried to build a complex ReST document, it was *always* a struggle. Copying from existing docs doesn't help much there because the cues are more subtle. As Martin pointed out, most errors slide by because the mis-markup is typically read as valid, unmarked-up text. I find myself having to continuously build and view an HTML file as I write. I like ReST for light-weight work but think it is not ready for prime-time with respect to more complex requirements.

Fred is also correct in that we don't seem to have people rushing to contribute docs (more than a line or two). For a long time, we've always said that it is okay to submit plain text doc contributions and that another person downstream would do the mark-up. We've had few takers.

Raymond
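The kind of silent mis-markup described here could in principle be caught by a small checker in the spirit of texchecker.py. The following is only a heuristic sketch under stated assumptions: the function lint_rest and its two checks are invented for illustration, they are not part of docutils, and they will produce false positives (for example on field lists such as ":returns: ..." or on inline markup that spans a line break).

```python
import re

# A role such as :dfn: should be followed directly by a backquoted span.
# The lookbehind skips colons that are part of a word, a URL or a directive.
DANGLING_ROLE = re.compile(r"(?<![\w`]):[A-Za-z][\w-]*:(?!`)")


def lint_rest(source):
    """Return (line number, message) pairs for suspicious-looking lines."""
    problems = []
    for number, line in enumerate(source.splitlines(), start=1):
        # Inline literals and roles always use backquotes in pairs.
        if line.count("`") % 2:
            problems.append((number, "odd number of backquotes"))
        if DANGLING_ROLE.search(line):
            problems.append((number, "role without backquoted text"))
    return problems


sample = "qualifiers are all :dfn: greedy\nand not just ``'<H1>'`"
for number, message in lint_rest(sample):
    print(number, message)
```

Both sample lines would be read by a reST parser as plain text or as different markup than intended, so a check of this sort would have to run before the build rather than rely on the parser complaining.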

Raymond Hettinger schrieb:
I'm not saying that LaTeX is hard for most of us, I say that it is *perceived* to be hard by many. And as soon as you dig into the deep support infrastructure, it gets very confusing. Just look at this bug fix I made some time ago: http://mail.python.org/pipermail/python-checkins/2007-April/059637.html
Also, I did not need to learn LaTeX itself. It was sufficient to read a little of Documenting Python and then model the markup from existing docs.
ISTM that this is possible with the new markup too. I wrote a great part of the new "Documenting Python" document, and after reading that one should be prepared enough to write reST just as well.
I can't see many differences. If I can translate a \begin{classdesc}{...} environment, I can also translate a ".. class::" directive for my new item.
Are the docs really that complex? I mean, look at the typical source of a converted page. The most common things are "information units", i.e. .. class:: directives, code snippets and plain old text. Cross-references work flawlessly. You may also ask Thomas Heller about documenting Python modules in reST. AFAIR the ctypes docs were written with it and converted to LaTeX afterwards.
Sometimes it's the way you present the ability to change things that affects how many people actually do it. Finding the location that tells you how to suggest changes, and opening a new bug in the infamous SF tracker is not really something people do happily. A "click here to suggest a change" link that leads to a pseudo-edit-form, complete with preview facility, might prove more effective. cheers, Georg

On Tue, May 22, 2007 at 06:13:36PM +0200, Georg Brandl wrote:
Indeed. I know my instinctive reaction to the Python docs is "oh, this is not something which the public can contribute to". Something more like the PHP-style "public annotations" might be good - with an appropriate moderation / voting system on the annotations it could possibly be very good.

On 5/22/07, Martin Blais <blais@furius.ca> wrote:
Just to follow up on that idea: I don't think it would be very difficult to write a very small modification to docutils that interprets the default role with more "smarts". For example, you can all guess what the types of these are:

    `class Foo`             (this is a class Foo)
    `bar(a, b, c) -> str`   (this is a function "bar" which returns a string)
    `name (attribute)`      (this is an attribute)

...so why couldn't the computer solve that problem for you? I'm sure we could make it happen. Essentially, what is missing from ReST is "less markup for documenting programs". By restricting the problem-set to Python programs, we can go a long way towards making much of this automatic, even without resorting to introspecting the source code that is being documented.
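As a rough sketch of what such "smarts" might look like, the function below classifies the text found inside a pair of backquotes using the three patterns from the examples. It is plain Python with no docutils integration; the name guess_role and the labels it returns ("class", "function", "attribute", "literal") are assumptions chosen for illustration, not existing role names.

```python
import re

IDENT = r"[A-Za-z_][\w.]*"

CLASS_RE = re.compile(rf"^class\s+({IDENT})$")
# name(args), optionally followed by "-> return type"
FUNCTION_RE = re.compile(rf"^({IDENT})\([^)]*\)(\s*->\s*.+)?$")
ATTRIBUTE_RE = re.compile(rf"^({IDENT})\s+\(attribute\)$")


def guess_role(text):
    """Guess what kind of object a default-role span refers to.

    Returns a (label, name) pair; anything unrecognised is treated
    as a plain literal and returned unchanged.
    """
    text = text.strip()
    for label, pattern in (
        ("class", CLASS_RE),
        ("function", FUNCTION_RE),
        ("attribute", ATTRIBUTE_RE),
    ):
        match = pattern.match(text)
        if match:
            return label, match.group(1)
    return "literal", text


for span in ("class Foo", "bar(a, b, c) -> str", "name (attribute)", "x + 1"):
    print(span, "=>", guess_role(span))
```

The fallback to "literal" matters: a guesser of this kind has to degrade to ordinary fixed-width text whenever it does not recognise the span, otherwise it reintroduces the silent-error problem discussed earlier in the thread.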

Martin Blais wrote:
I was going to suggest something similar. Ideally, any markup language ought to have a kind of "Huffman Coding" of complexity - in other words, the markup symbols that are used the most frequently are the ones that should be the shortest and easiest to type. Just as in real Huffman Coding, the popularity of a given element is going to depend on context. This would imply that there should be customizations of the markup language for different knowledge domains. While there are some benefits to having a 'standard' markup, any truly universal markup is going to be much heavier and more cumbersome than one that is specialized for the task. I would advocate a system in which the author inserts minimalistic 'hints' into the text, and the document processor uses those hints along with some automatic reasoning to determine the final markup. As in the above example, the use of backticks can be a signal to the document processor that the enclosed text should be examined for identifiers and other Python syntax. I would also suggest that one test for evaluating the quality of markup syntax is whether or not it can be learned by example - can a user follow the pattern of some other part of the docs, without having to learn the syntax in a formal way? -- Talin

Talin wrote:
Does this mean it's time for "pyST" -- Python-structured text?-)
-- Greg Ewing, Computer Science Dept, University of Canterbury, Christchurch, New Zealand
greg.ewing@canterbury.ac.nz
Carpe post meridiem! (I'm not a morning person.)

Greg Ewing wrote:
I wasn't going to say it :)

Now, at the risk of going even further out of the mainstream (actually, there's no risk, it's a dead certainty), if I had been clever enough to think that I could write a LaTeX translator, I probably would have made my target language Docbook or some other flavor of XML.

Now, you might argue that XML is more cumbersome and harder to author than reST, and that is certainly a valid argument. On the other hand, there are a couple of interesting advantages to using XML:

1) You get an instant WYSIWYG preview capability by publishing a standard CSS stylesheet along with the docs. Anyone would be able to see what the output would look like merely by viewing it in a browser. While there would be some document transformations which would not be previewable in CSS (such as breaking the document up into hyperlinked chapters), you would at least be able to see enough to be able to do a decent job of editing the text without having to install any special tools. And some of those more difficult transformations would be doable with a suitable XSLT stylesheet, which can be directly executed in most browsers. (As an example, I once wrote an XSLT stylesheet that converted OpenDocument XML into the equivalent HTML - this was part of my Firefox ODFReader plugin [http://www.alcoholicsunanimous.com/odfreader/], that allowed ODF documents to be directly viewed in the browser without having to launch an external helper application.)

2) There are a few WYSIWYG XML editors out there, which allow you to edit the styled text directly in an editor (although I don't know of any open source ones.)

3) The document processing tool could be very minimal, mostly assembled out of standard modules for processing XML.

4) XML has a well-specified method of escaping into other (XML-based) languages, which is XML namespaces. So for those who want equations in their docs, they could simply insert a block of MathML inside their Docbook XML. Similarly, illustrations could be embedded using bitmap images or SVG as appropriate.

5) Having XML-based docs would make it easy to write other kinds of processors that operate on the docs in different ways, such as building a keyword index or doing various kinds of analysis.

Now, this suggestion of using XML isn't really a serious one. But I think that the various advantages that I have listed ought to be considered when thinking about how the tool chain for Python documentation should operate.

I think that there is a big advantage to making the document processing tools simple and hosted entirely in Python. People who contribute to the docs are likely to know quite a bit about Python, but it is far from certain what else they might know. And tools written in Python are automatically able to run in diverse environments, which may not be the case for tools written in other languages. This means that tools that are in Python are more likely to be used, and further, they are more likely to be improved or specialized to the task by those who use them.

In terms of authoring, the convenience of the markup language is only one factor; a bigger factor I think is having a short feedback cycle between edit and test, where 'test' means seeing what your written text would look like in the finished product. The quicker you can make that feedback loop, the more likely people will be to work on the docs.

-- Talin

On 5/24/07, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Not before someone writes it. Georg Brandl's awesome ReST based system has the nice property that it actually exists and works. For a great number of reasons it is superior to the existing LaTeX based system, and I hope and think that they are strong enough to replace LaTeX with it. -- mvh Björn

Talin schrieb:
What's better here than :class:`Foo` or :attr:`name`? You wouldn't want to put an " (attribute)" after all references to it in your text, so this is just an alternative way to spell roles.
What I could propose is that we could abandon :class:`foo`, :meth:`foo` etc. and just use `foo`. There shouldn't be too many cases where this gets ambiguous crossreferencing. Variables would just use *var*, since they're not marked up specially anyway.
I think he/she can, given a piece of document that contains most of the needed markup constructs. You'll pretty soon grok that reST uses indentation (and you'll be pleased with it if you like Python, which seems a reasonable assumption ;). You'll also get that :foo:`bar` is the syntax for semantic inline markup. Code examples are always introduced with a "::" (okay, the exact rules are a bit nifty, but very convenient if you know them). What else do you need to know? ".. function::" directives are quite easy to recognize. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.

On 5/22/07, Armin Ronacher <armin.ronacher@active-4.com> wrote:
About possibilities: I'm sorry but that is simply not true. LaTeX provides the full power of macro expansion, while ReST is a fixed (almost, roles are an exception) syntax which has its own set of problems and ambiguities. It is far from perfect, and definitely does not have the flexibility that LaTeX provides. Some of the syntax cannot be nested (try to combine ** with literals).
rst is simpler than latex:
That, and the ability to already parse it from Python and more easily convert to other formats (one of LaTeX's weaknesses), are the only benefits that I can see to switching away from LaTeX. I have to admit I'm afraid we would be moving to a lesser technology, and the driver for that seems to be people's reluctance to work with the more powerful, more complex tool. Not saying it is invalid (it's about people, in the end), but I still don't see what's the big problem with LaTeX.

On May 22, 2007, at 10:37 AM, Martin Blais wrote:
I'm a fan of LaTeX (and latex, er, oops :) too, but what appeals to me most about moving to reST is that the tool chain simplifies considerably. Even with a nice distro packaging system it can be a PITA to get all the tools you need to build the documentation properly installed. A pure-Python solution, even a lesser one, would be a win if we can still produce top quality online and written documentation from one source. -Barry

On Tuesday 22 May 2007, Barry Warsaw wrote:
The biggest potential wins I see for a new system are:
- more contributions
- platform-independent processing
I remain sceptical on being able to achieve the first, but there is some hope for it. The latter should make things easier for people who are willing to put the work into contribution, which is valuable in its own right. -Fred -- Fred L. Drake, Jr. <fdrake at acm.org>

Fred> The biggest potential wins I see for a new system are:
Fred> - more contributions
Fred> - platform-independent processing
Fred> I remain sceptical on being able to achieve the first, but there
Fred> some hope for it.

You at least take away a common excuse for lack of contributions. True whiners will just come up with new ones (e.g., "the documentation isn't available in Sanskrit yet" or "the dog ate my changes before I could type them into the computer"). ;-)

Skip

skip@pobox.com wrote:
But doesn't *everyone* now know that documentation contributions don't have to be marked up? It's certainly been said enough. Maybe that fact should be more prominent in the documentation? Then we'll just have to worry about getting people to read it ... regards Steve -- Steve Holden +1 571 484 6266 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://del.icio.us/steve.holden ------------------ Asciimercial --------------------- Get on the web: Blog, lens and tag your way to fame!! holdenweb.blogspot.com squidoo.com/pythonology tagged items: del.icio.us/steve.holden/python All these services currently offer free registration! -------------- Thank You for Reading ----------------

>> You at least take away a common excuse for lack of contributions. >> True whiners will just come up with new ones (e.g., "the >> documentation isn't available in Sanskrit yet" or "the dog ate my >> changes before I could type them into the computer"). ;-) Steve> But doesn't *everyone* now know that documentation contributions Steve> don't have to be marked up? It's certainly been said Steve> enough. Sure, but that doesn't stop the true whiners. ;-) Skip

On Tue, May 22, 2007 at 11:45:04AM -0500, skip@pobox.com wrote:
->
-> >> You at least take away a common excuse for lack of contributions.
-> >> True whiners will just come up with new ones (e.g., "the
-> >> documentation isn't available in Sanskrit yet" or "the dog ate my
-> >> changes before I could type them into the computer"). ;-)
->
-> Steve> But doesn't *everyone* now know that documentation contributions
-> Steve> don't have to be marked up? It's certainly been said
-> Steve> enough.
->
-> Sure, but that doesn't stop the true whiners. ;-)

Nothing stops the true whiners ;).

I think new and exciting ways of viewing, searching, annotating, linking to/from, and indexing the docs are more important than new formats for (not) writing the docs. For example, this rocks! ::

    http://pydoc.gbrandl.de/search.html?q=os.path&area=default

There have been (good) efforts to wikify the docs in the past. IMO what would make them really work would be to have docs.python.org, the "official" Python docs location, start hosting these efforts. As long as that location is static, I think the majority of users will ignore other efforts.

cheers, --titus

p.s. Are there good directions for installing the toolchain for current docs building anywhere? I've tried once or twice, but despite a lot of LaTeX experience I could never get everything hooked up right.

Titus Brown wrote:
It would be more impressive if the search string returned hits ... regards Steve -- Steve Holden +1 571 484 6266 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://del.icio.us/steve.holden

Steve Holden schrieb:
This is a JavaScript based search, which will only be (optionally) integrated in the offline version. The online version will get a more sophisticated search. We've just finished implementing the quick dispatcher in the repository. A request to http://docs.python.org/os.path would then redirect to the appropriate module page, as would http://docs.python.org/?q=os.path. Something like http://docs.python.org/?q=os.paht (note the misspelling) would lead to a page with close matching results, os.path being the first of them. (The web app is based on wsgiref, with a few wrappers around it...) cheers, Georg
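The dispatch behaviour described here (exact hit redirects, near miss yields a ranked list) can be sketched with the standard library alone. This is not the code in the repository: the index contents, the URL layout and the function name quick_dispatch are assumptions for illustration, and the wsgiref layer is left out entirely.

```python
import difflib

# Hypothetical index mapping dotted names to pages; a real one would be
# generated by the documentation build.
INDEX = {
    "os": "/modules/os.html",
    "os.path": "/modules/os.path.html",
    "posixpath": "/modules/posixpath.html",
    "re": "/modules/re.html",
    "sys": "/modules/sys.html",
}


def quick_dispatch(query, index=INDEX):
    """Return ("redirect", url) for an exact hit, else ("suggest", matches).

    matches is a list of (name, url) pairs, best match first.
    """
    if query in index:
        return "redirect", index[query]
    close = difflib.get_close_matches(query, list(index), n=5, cutoff=0.6)
    return "suggest", [(name, index[name]) for name in close]


print(quick_dispatch("os.path"))
print(quick_dispatch("os.paht"))
```

difflib.get_close_matches sorts its results by similarity, so for the misspelled query the entry for os.path comes out first, matching the behaviour Georg describes. The cutoff of 0.6 is simply the function's default and would need tuning against a full index.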

On Tue, 22 May 2007, Steve Holden wrote:
I think the issue is instant gratification. You don't get the satisfaction of seeing your changes unless you're willing to write them in LaTeX, and that's a pretty big barrier -- a lot of what motivates open source volunteers is the sense of fulfillment. (Hence, by the same nature, Wiki-like editing with instant changes visible online will probably greatly increase contributions.) -- ?!ng

We are developing a programming language here, not a typesetting system.
Good point, Martin. Are you implying that the documentation should be kept in LaTeX, a widely-accepted widely-disseminated stable documentation language, which someone else maintains, rather than ReST, which elements of the Python community maintain? Bill

Bill Janssen schrieb:
No - I have no particular preference wrt. the markup language. I can personally live with all of them, and I like none of them. I hear that contributors complain about having to use TeX, and I hear other people say that they would be happier if they could use ReST instead of TeX. Making contributors happy is really what the objective should be (if the quality of the typesetting output is adequate - and most people use the HTML output these days, where latex2html may not have adequate quality). That docutils happens to be written in Python should make little difference - it's *not* part of the Python language project, and is just a tool for us, very much like latex and latex2html. Regards, Martin

On May 21, 2007, at 12:28 AM, Martin v. Löwis wrote:
I would take a fairly broad view of the term "for Python" though. Specifically, third party modules and applications written for and in Python should be explicitly supported. I think we'd like to see one (preferred) documentation tool chain for core Python, plus the vast number of third party modules and apps. I don't see any reason why Georg's reST-based system couldn't provide that (80/20 rule perhaps?). I'd point to, for example, the howto templates, which can be easily used for third party applications. Mailman uses howtos, for example. BTW, Georg, excellent work. I'm a big latex fan and long-time user, but I do think that using reST will open the door to a lot more contributions. -Barry

skip@pobox.com wrote:
Perhaps my comment was misunderstood. I have no objection to a new system, and it does not have to be based on latex. I just hope there will be some escape mechanism that allows math. It happens that for math markup, there isn't really anything better (or more familiar) than latex.

Neal> It happens that for math markup, there isn't really anything Neal> better (or more familiar) than latex. True enough. There is MathML and its offspring, ASCIIMathML, which are probably worth looking at. http://www.w3.org/Math/ http://www1.chapman.edu/~jipsen/asciimath.html I have no idea if either one has backend support for presentation outside the web, but if people are interested in this (probably within the docutils scope) they probably should be considered. ASCIIMathML in particular is probably worth using now within even if you can't convert it to any other format. It's about as readable as LaTeX source. Skip

skip@pobox.com wrote:
MathML is the language used for equations in Open Document Format (aka ISO 26300). I don't know what extra typesetting tricks (if any) they wrap around it, though. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org

Please add a link to the PEP index (which is also missing from docs.python.org, though not from python.org/doc/). And consider at least some PEPs as part of the corpus indexed (i.e., those with info not in the regular docs). tjr

Georg Brandl <g.brandl@gmx.net> wrote:
Good idea! Latex is a barrier for contribution to the docs. I imagine most people would be much better at contributing to the docs in reST. (Me included: I learnt latex at university a couple of decades ago and have now forgotten it completely!)
- a "quick-dispatch" function: e.g., docs.python.org/q?os.path.split would redirect you to the matching location.
Being a seasoned unix user, I tend to reach for pydoc as my first stab at finding some documentation rather than (after excavating the mouse from under a pile of paper) use a web browser. If you've ever used pydoc you'll know it reads docstrings and for some modules they are great and for others they are sorely lacking. If pydoc could show all this documentation as well I'd be very happy! Maybe your quick dispatch feature could be added to pydoc too?
It is missing conversion of ``comment'' at the moment as I'm sure you know... You will need to make your conversion perfect before you convince the people who wrote most of that documentation I suspect! -- Nick Craig-Wood <nick@craig-wood.com> -- http://www.craig-wood.com/nick

Nick Craig-Wood schrieb:
It is my intention to work together with Ron Adam to make the pydoc <-> documentation integration as seamless as possible.
Sorry, what did you mean?
You will need to make your conversion perfect before you convince the people who wrote most of that documentation I suspect!
It already is as good as it gets, barring a few bugs here and there. Which I'd like to hear about, when you find them! cheers, Georg

Georg Brandl <g.brandl@gmx.net> wrote:
So I'll be able to read the main docs for a module in a terminal without reaching for the web browser (or info)? That would be great! How would pydoc decide which bit of docs it is going to show? If I type "pydoc re" is it going to give me the rather sparse __doc__ from the re module or the nice reST docs? Or maybe both, one after the other? Or will I have to use a flag to dis-ambiguate? If you type "pydoc re" at the moment then it says in it MODULE DOCS http://www.python.org/doc/current/lib/module-re.html which is pretty much useless to me when ssh-ed in to a linux box half way around the world...
``comment'' produces smart quotes in latex if I remember correctly. You probably want to convert it somehow because it looks a bit odd on the web page as it stands. I'm not sure what the reST replacement might be, but converting it just to "comment" would probably be OK. Likewise with `comment' to 'comment'. For an example see the first paragraph here: http://pydoc.gbrandl.de/reference/index.html -- Nick Craig-Wood <nick@craig-wood.com> -- http://www.craig-wood.com/nick

Nick> If you type "pydoc re" at the moment then it says in it Nick> MODULE DOCS Nick> http://www.python.org/doc/current/lib/module-re.html Nick> which is pretty much useless to me when ssh-ed in to a linux box Nick> half way around the world... I get quite a bit of information about re (I've never known /F to be a documentation slouch). Only one bit of that information is a reference to the page in the library reference manual. And if I happen to be ssh'd into a machine halfway round the world through a Gnome terminal I can right mouse over that URL and pop the page up in my default local browser. If you set the PYTHONDOCS environment variable you can point it to a local (or at least different) copy of the libref manual. A flag could be added to pydoc to show that content instead, however being html it probably would be difficult to read unless pumped through lynx -dump or something similar. Skip

On Wed, May 23, 2007 at 05:39:38AM -0500, skip@pobox.com wrote:
Yes, it is certainly better than no docs. It doesn't, for instance, have any regexp info, and I can never remember all the special non-matching brackets (e.g. (?:...)), so I have to read the full docs for that.
I take your point. However the unix tradition is that everything is in the man pages. man pages have expanded over the years to include info pages and you *can* read the full python docs via info, it just isn't quite as convenient as pydoc. I think perl had the right idea with perldoc. You can read all the perl documentation whether it is in module documentation (like docstrings) or general documentation (like the latex docs under discussion). I'd like to see pydoc be a viewer for all the python documentation, not just a subset of it.
I'm assuming that we do reST all the python documentation which would make it easier. -- Nick Craig-Wood <nick@craig-wood.com> -- http://www.craig-wood.com/nick

2007/5/23, Nick Craig-Wood <nick@craig-wood.com>:
One option is to use a text-mode browser (lynx, links, or the like). The other is to develop a terminal-mode application (currently in pydoc, I believe).
I really think that making pydoc a solid library to extract/search/navigate the documentation offers a lot of interesting perspectives. When one thinks beyond the application discussed here, there are a lot of tools (ipython, or any IDE for example) that could make great use of the facility. [note: Ron and I seemed to disagree on what (and how) pydoc should be, and on that in particular, but I keep a keen interest in having such a library.]

Nick Craig-Wood wrote:
Ahh, now the dime has fallen ;) (sorry, German phrase) Yes, it should probably use Unicode equivalents of these quotes, as it does with en- and em-dashes. There are also nifty "post-processor" filters which operate on complete HTML pages and replace normal quotes by "smart" ones, perhaps that is the way to go. Georg
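To make the quote conversion discussed here concrete, a minimal sketch follows. It is not the converter's actual code: the function name `latex_quotes_to_unicode` is made up for illustration, and it assumes the input is plain prose taken from the LaTeX source. It would mangle reST inline literals and code samples, which also use backticks.

```python
import re

def latex_quotes_to_unicode(text):
    """Replace LaTeX-style ``double'' and `single' quotes with Unicode ones."""
    # Handle double quotes first so the single-quote pass cannot split them.
    text = text.replace("``", "\u201c").replace("''", "\u201d")
    # A remaining backtick ... apostrophe pair becomes a single-quoted span.
    return re.sub(r"`([^`']*)'", "\u2018\\1\u2019", text)

print(latex_quotes_to_unicode("a ``comment'' and a `word'"))
```

A post-processing filter on the finished HTML, as mentioned above, would avoid touching the source markup at all; this sketch only shows the character mapping.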

Georg Brandl wrote:
Ahh, now the dime has fallen ;) (sorry, German phrase)
In English it's "the penny has dropped", so it's not much different. :-) Although I thought dimes were an American thing, and Germans would be more likely to use a different coin. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiem! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing@canterbury.ac.nz +--------------------------------------+

Nick Craig-Wood wrote:
In fairness to Georg, latex2html also misses the smart quotes. See the same paragraph here: http://docs.python.org/ref/front.html -- Scott Dial scott@scottdial.com scodial@cs.indiana.edu

Nick Craig-Wood wrote:
What latex does here for typeset output is nice, but it's also a bit of a hack job. The ` and ' characters aren't smart, the fonts just have curved glyphs for them. `` and '' are mapped to additional glyphs using ligatures, again part of the font information. The result, of course, is really nice. :-) Scott Dial wrote:
There's a way to make latex2html do "the right thing" for these, except... it then happily does so even to ` and ' (and `` and '') in code samples, since there's no equivalent to the font information used to handle this in latex. -Fred -- Fred L. Drake, Jr. <fdrake at acm.org>

Nick Craig-Wood wrote:
Pydoc currently gets topic info for some items by scraping the text from document 'local' web pages. This is kind of messy for a couple of reasons.

- The documents may not be installed locally.
- It can be problematic locating the docs even if they are installed.
- They are not formatted well after they are retrieved.

I think this is an area for improvement.

This feature is also limited to a small list where the word being searched for is a keyword, or a very common topic reference, *and* they are not likely to clash with other module, class, or function names.

The introspection help parts of pydoc are completely separate from topic help parts. So replacing this part can be done without much trouble. What the best behavior is and how it should work would need to be discussed.

Keep in mind doc strings are meant to be more of a quick reference to an item, and Pydoc is the interface for that.
If retrieval from the full docs is desired, then it will probably need to be disambiguated in some way or be a separate interface.

    help('re')      # Quick reference on 're'.
    helpdocs('re')  # Full documentation for 're'.

On Wed, May 23, 2007 at 12:46:50PM -0500, Ron Adam wrote:
And it would be improved by converting the docs to reST I imagine.
I think that if reST was an acceptable form for the documentation, and it could be auto included in the main docs from docstrings then you would find more modules completely documented in __doc__.
Actually if it gave both sets of docs quick, then long, one after the other that would suit me fine. -- Nick Craig-Wood <nick@craig-wood.com> -- http://www.craig-wood.com/nick

Nick Craig-Wood wrote:
> On Wed, May 23, 2007 at 12:46:50PM -0500, Ron Adam wrote:
>> Nick Craig-Wood wrote:
>>> So I'll be able to read the main docs for a module in a terminal
>>> without reaching for the web browser (or info)? That would be great!
>>>
>>> How would pydoc decide which bit of docs it is going to show?
>> Pydoc currently gets topic info for some items by scraping the text from
>> document 'local' web pages. This is kind of messy for a couple of reasons.
>> - The documents may not be installed locally.
>> - It can be problematic locating the docs even if they are installed.
>> - They are not formatted well after they are retrieved.
>>
>> I think this is an area for improvement.
>
> And it would be improved by converting the docs to reST I imagine.

Yes, this will need a reST to html converter for displaying in the html browser. DocUtils provides that, but it's not part of the library. (?)

And for console text output, is the unmodified reST suitable, or would it be desired to modify it in some way?

Should a subset of the main documents be included with pydoc to avoid the "documents not available" messages if they are not installed?

Or should the topics retrieval code be moved from pydoc to the main document tools so it's installed with the documents? Then it can be maintained with the documents instead of being maintained with pydoc. Then pydoc will just look for it, instead of looking for the html pages.

>> This feature is also limited to a small list where the word being searched
>> for is a keyword, or a very common topic reference, *and* they are not
>> likely to clash with other module, class, or function names.
>>
>> The introspection help parts of pydoc are completely separate from topic
>> help parts. So replacing this part can be done without much trouble. What
>> the best behavior is and how it should work would need to be discussed.
>>
>> Keep in mind doc strings are meant to be more of a quick reference to an
>> item, and Pydoc is the interface for that.
>
> I think that if reST was an acceptable form for the documentation, and
> it could be auto included in the main docs from docstrings then you
> would find more modules completely documented in __doc__.

That would be fine for third party modules if they want to do that or if there is not much difference between the two.

>>> If I type "pydoc re" is it going to give me the rather sparse __doc__
>>> from the re module or the nice reST docs? Or maybe both, one after
>>> the other? Or will I have to use a flag to dis-ambiguate?
>> If retrieval from the full docs is desired, then it will probably need to
>> be disambiguated in some way or be a separate interface.
>>
>>     help('re')      # Quick reference on 're'.
>>     helpdocs('re')  # Full documentation for 're'.
>
> Actually if it gave both sets of docs quick, then long, one after the
> other that would suit me fine.

That may work well for the full documentation, but the quick reference wouldn't be a short quick reference any more.

I'm attempting to have a pydoc api call that gets a single item or sub-item and formats it to a desired output so it can be included in other content. That makes it possible for the full docs (not necessarily Python's) to embed pydoc output if that's desirable. This will need pydoc formatters for the target document type. I hope to include a reST output formatter for pydoc.

The help() function is imported from pydoc by site.py when you start python. It may not be difficult to have it as a function that first tries pydoc to get a request, and if the original request is returned unchanged, tries to get information from the full documentation. There could be a way to select one or the other (or both).

But this feature doesn't need to be built into pydoc, or the full documentation. They just need to be able to work together so things like this are possible in an easy to write 4 or 5 line function. (give or take a few lines)

So it looks like most of these issues are more a matter of how to organize the interfaces. It turns out that what I've done with pydoc, and what Georg is doing with the main documentation should work together quite nicely.

Cheers,
Ron
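As a rough illustration of the small wrapper described here, the sketch below tries pydoc first and then a full-documentation lookup. Everything except the `pydoc` calls is an assumption: `combined_help`, `quick_reference`, the `mode` names, and the `full_docs_lookup` callable (name in, text or None out) are hypothetical and do not exist in pydoc.

```python
import pydoc

def quick_reference(name):
    """Return pydoc's plain-text help for `name`, or None if pydoc finds nothing."""
    try:
        return pydoc.plain(pydoc.render_doc(name))
    except ImportError:
        # pydoc raises ImportError when it cannot resolve the name.
        return None

def combined_help(name, full_docs_lookup, mode="fallback"):
    """Combine pydoc output with a (hypothetical) full-documentation lookup.

    mode: "quick", "full", "both", or "fallback" (full docs only if pydoc
    has nothing to say).
    """
    quick = quick_reference(name) if mode != "full" else None
    if mode == "quick" or (mode == "fallback" and quick):
        return quick
    full = full_docs_lookup(name)
    # Quick reference first, then the long docs, skipping whatever is missing.
    return "\n\n".join(part for part in (quick, full) if part) or None
```

With `mode="both"` this gives the "quick, then long" behaviour asked for earlier in the thread, while the default keeps the quick reference short.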

Ron Adam wrote:
> Nick Craig-Wood wrote:
> > On Wed, May 23, 2007 at 12:46:50PM -0500, Ron Adam wrote:
> >> Nick Craig-Wood wrote:
> >>> So I'll be able to read the main docs for a module in a terminal
> >>> without reaching for the web browser (or info)? That would be great!
> >>>
> >>> How would pydoc decide which bit of docs it is going to show?
> >> Pydoc currently gets topic info for some items by scraping the text from
> >> document 'local' web pages. This is kind of messy for a couple of reasons.
> >> - The documents may not be installed locally.
> >> - It can be problematic locating the docs even if they are installed.
> >> - They are not formatted well after they are retrieved.
> >>
> >> I think this is an area for improvement.
> >
> > And it would be improved by converting the docs to reST I imagine.
>
> Yes, this will need a reST to html converter for displaying in the html
> browser. DocUtils provides that, but it's not part of the library. (?)
>
> And for console text output, is the unmodified reST suitable, or would it
> be desired to modify it in some way?

A text writer for docutils should not be hard to write. You'd get something that looks like the reST, but stripped of markup that makes no sense when viewed on a terminal, such as :class:`xyz`.

Georg

--
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.
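The effect of such a text writer on role markup can be approximated with a regular expression. This is only a sketch under that assumption, not a docutils writer: `strip_roles` is a made-up name, and a real writer would work on the parsed document tree rather than on raw text.

```python
import re

# Matches :role:`text` (and domain-qualified forms like :py:class:`text`).
_ROLE = re.compile(r":(?:[A-Za-z][\w-]*:)+`([^`]*)`")

def strip_roles(text):
    """Drop reST role markers, keeping only the text inside the backticks."""
    return _ROLE.sub(r"\1", text)

print(strip_roles("Uses :meth:`append()` on a :class:`deque`; `x` stays."))
```

Plain backquoted text without a role is left alone, so the output still looks like the reST source.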

This subject is generating a lot of discussion and [almost entirely] positive feedback. It would be a great shame to run out of steam. Does it need a PEP to see a chance of it getting accepted as the formal documentation system? (or a pronouncement that it will never happen...) Michael Foord

Michael Foord wrote:
No. First of all, it needs a dedicated developer (preferably, but not necessarily, a committer) who indicates willingness to maintain that for the coming years and releases. It might be that Fred Drake's offer to maintain the documentation would still be valid after such a switch, but we should not assume so without explicit confirmation. It might be that this would be the time to pass on documentation maintenance to somebody else (and I seriously do not have anyone in particular in mind here).

Then, I think a branch should be made where the documentation is converted. Again, a volunteer would be needed to create the branch, and then eventually merge it back to the trunk. It might be helpful, but isn't strictly necessary, to close all documentation patches before doing so, as they all break with the conversion. For that activity, multiple volunteers would be useful.

I don't think a formal document needs to be written, unless there is a hint of disagreement within the community. In that case, a process PEP would be necessary. However, it is much more important that the documentation maintainer explicitly agrees than that nobody explicitly disagrees, or that a pronouncement is made - the pronouncement alone will *not* cause this change to be carried out.

Regards, Martin

Martin v. Löwis wrote:
Assuming that Fred goes into well-earned retirement from the doc maintainer position (private mail exchange hinted that way), and nobody more qualified steps up, I'd be available to take that post. (If someone else wants to take maintainership of the content, very good, I'd have to be maintainer of the build tools anyway.) I'd then try to form a doc maintaining team, just as the PEP editor team that was created recently, to deal with the (hopefully relatively large ;) ) amount of comments and edit requests.
I agree. Georg

On Thu, May 24, 2007 at 12:43:18PM -0500, Ron Adam wrote:
And for console text output, is the unmodified reST suitable, or would it be desired to modify it in some way?
Currently pydoc output looks like a man page under Unix. If it could look like that then that would be great. Otherwise raw reST is fine!
I think the latter proposal sounds like the correct one. In debian for instance, the python docs are a separate package, and it would seem reasonable that you'd have to have that package installed to get the long docs.
If you look at the documentation for subprocess for instance, you'll see that the docstring is pretty much the same as the library reference documentation which seems like needless duplication and opportunity for code/doc skew. Maybe one is auto generated from the other - I don't know!
Well you could stop after reading the short bit!
Sounds good! Nick -- Nick Craig-Wood <nick@craig-wood.com> -- http://www.craig-wood.com/nick

Nick Craig-Wood wrote:
Okay, there's now support for SmartyPants in Subversion -- it converts these quotes as well as triple dashes to their pretty equivalents. cheers, Georg

On Sat, May 19, 2007 at 07:14:09PM +0200, Georg Brandl wrote:
For the impatient: the result can be seen at <http://pydoc.gbrandl.de>.
I think that looks great. One comment I have, I don't know if it's relevant - it perhaps depends on whether the "Global Module Index" is auto-generated or not. This is the page I visit the most out of all the Python documentation, and it's far too large and unwieldy. IMHO it would be much better if only the top-level modules were shown here - having the single package 'distutils', for example, take up nearly 50 entries in the list is almost certainly hindering a lot more people than it helps. It would perhaps be better if such packages show up as one entry, which shows the sub-modules when clicked on.

>> One comment I have, I don't know if it's relevant - it perhaps
>> depends on whether the "Global Module Index" is auto-generated or
>> not. This is the page I visit the most out of all the Python
>> documentation, and it's far too large and unwieldy. IMHO it would be
>> much better if only the top-level modules were shown here - having
>> the single package 'distutils', for example, take up nearly 50
>> entries in the list is almost certainly hindering a lot more people
>> than it helps. It would perhaps be better if such packages show up as
>> one entry, which shows the sub-modules when clicked on.

Georg> Sure, that is certainly possible.

Take a look at <http://www.webfast.com/modindex/>. It records request counts for the various pages and presents the most frequently requested pages in a section at the top of the page. I can make the script available if anyone wants it (it uses Myghty - Mason in Python.)

Skip
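The "most frequently requested pages" idea can be sketched in a few lines. This is not Skip's Myghty script, which is not shown in the thread; `most_requested` and the shape of the request log (a plain list of requested paths) are assumptions made for illustration.

```python
from collections import Counter

def most_requested(request_log, top=5):
    """Return the `top` most requested paths, most popular first."""
    return [path for path, _count in Counter(request_log).most_common(top)]

# Hypothetical request log: one entry per page view.
log = ["/re", "/os", "/re", "/sys", "/os", "/re"]
print(most_requested(log, top=2))
```

The resulting list could then be rendered as a short section above the full alphabetical index.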

On 5/21/07, skip@pobox.com <skip@pobox.com> wrote:
+1 for integrating this with the official docs. I loved this the last time you posted it too. STeVe -- I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy

On Mon, May 21, 2007, Jon Ribbens wrote:
That's a good point in general, but I think we want to manually label some submodules as having entries in the global module index (notably os.path). -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "Look, it's your affair if you want to play with five people, but don't go calling it doubles." --John Cleese anticipates Usenet
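One way to picture the collapsed index with manually promoted submodules is sketched below. The function name `collapse_module_index` and the `always_listed` parameter are hypothetical; how the real index generator would store such labels is not specified in the thread.

```python
def collapse_module_index(modules, always_listed=("os.path",)):
    """Group submodules under their top-level package.

    Returns a dict mapping each index entry to the submodules folded
    beneath it. Names in `always_listed` keep their own entry.
    """
    index = {}
    for name in sorted(modules):
        if "." in name and name not in always_listed:
            top_level = name.split(".", 1)[0]
            index.setdefault(top_level, []).append(name)
        else:
            index.setdefault(name, [])
    return index

print(collapse_module_index(
    ["distutils", "distutils.core", "distutils.util", "os", "os.path", "re"]))
```

Here 'distutils' takes a single entry that expands to its submodules, while 'os.path' stays visible at the top level.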

Hoi,

In addition to the offline docs that Georg published some days ago, there is also a web version which currently looks and works pretty much like the offline version. There are however some differences that are worth knowing:

- Cleaner URLs. You can actually guess them because we took the idea the PHP people had and check for similar pages if a page does not exist. We do however redirect if there was a match so that the URL stays unique.
- The search doesn't require JavaScript (but is currently disabled due to a buggy stemmer and indexer).

That's it for now, you can try it online at http://pydoc.gbrandl.de:3000/

Regards, Armin


On 5/23/07, Georg Brandl <g.brandl@gmx.net> wrote:
Also, try
Beautiful! STeVe -- I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy

On Wed, 23 May 2007 08:30:17 +0200, Georg Brandl <g.brandl@gmx.net> wrote:
Looks good. But should the source pages really use syntax highlighting? I think if somebody is interested in the source then they should get the real source without any highlighting. If you decide to keep the syntax highlighting then the highlighting of multiline ReST strings should be fixed. For example see the source for splitext(). Thanks for the work, Dennis Benzinger

Hoi, Due to some server issues I had to take the web version down. But expect an updated version in a few days. Regards, Armin

On 5/19/07, Georg Brandl <g.brandl@gmx.net> wrote:
[SNIP]
Waiting for comments!
Here's a small suggestion: move the sidebar to the right. Moving it to the right makes it much less intrusive. See for yourself: http://peadrop.com/files/pydoc-sidebar-right.png

    div.body {
        background-color: white;
        margin: 0pt 190pt 0pt 0px;
    }

    div.sidebar {
        float: right;
        margin-left: -100%;
        width: 230px;
    }

Keep up the great work, -- Alexandre

Hi,

We managed to get an up-to-date version of the web version of the docs running on the server. The address is still the same (http://pydoc.gbrandl.de:3000) and it's also still running on top of wsgiref.

Changes so far:

* comments: each page that is generated from an rst file can have some comments attached to it. Commenting doesn't require registration at the moment.
* antispam with optional reverse captcha (a captcha for bots: a hidden input field named "homepage" which bots hopefully fill out, dumb as they are) and regular expression filter rules based on MoinMoin's BadContent file.
* administration panel for moderating comments. You can find the admin panel at http://pydoc.gbrandl.de:3000/admin/ (login credentials are testuser:password).
* feeds for comments on a page or the last n comments on the whole site.
* source view is text only (again).

What still works:

* intelligent error pages: if a page does not exist the URL path is used to conduct a fuzzy keyword search (see below).
* fuzzy keyword search: "os.path.exists" jumps to the entry, "os.paht.exists" shows some possibilities.

What needs to be implemented:

* full text search
* proposing documentation patches

Note that the comment area is really, really dark; that's intentional. This is meant to visually separate comments from the official docs, but if the contrast is deemed too unsettling, another way can be found.

Also, we're experimenting with alternate stylesheets, e.g. placing the sidebar on the right of the main text, or a "traditional" style for those liking the original docs' style.

In any case, we're waiting for your input!

cheers, Georg and Armin
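The antispam approach described in this message (a hidden "homepage" field plus regular expression rules) might look roughly like the sketch below. It is not the code running on the server: `is_spam`, the `comment` field name, and the two example patterns are assumptions, and the real rules are said to come from MoinMoin's BadContent file.

```python
import re

# Hypothetical rules standing in for a BadContent-style pattern list.
BAD_CONTENT = [
    re.compile(pattern, re.IGNORECASE)
    for pattern in (r"cheap\s+pills", r"casino-\w+\.example")
]

def is_spam(form, rules=BAD_CONTENT):
    """Return True if a submitted comment form looks like spam.

    `form` is a dict mapping field names to the submitted strings.
    """
    # Reverse captcha: humans never see the hidden "homepage" field,
    # so any value in it suggests an automated submission.
    if form.get("homepage", "").strip():
        return True
    body = form.get("comment", "")
    return any(rule.search(body) for rule in rules)

print(is_spam({"comment": "Nice docs!", "homepage": ""}))
```

The honeypot check costs nothing for real users, which fits the goal of commenting without registration.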

Georg Brandl wrote:
Yes, the comments are a bit too dark. The separation could be done better by moving it below the footer. Or better yet, duplicate the navigation bar between the document page and the comments.

    ------------------------------------------------
     crumbs navigation
    ------------------------------------------------
     side |   main page
     bar  |
          |
          |
    ------------------------------------------------
     crumbs navigation
    ------------------------------------------------
     User Comment section
    ------------------------------------------------
     copyright
    ------------------------------------------------

The user comment section could have its own side bar if that's desirable.

Also the python version information needs to be on every page someplace.

Ron

On 5/19/07, Georg Brandl <g.brandl@gmx.net> wrote:
Very cool! I'd love to see the docs move to ReST.
Yes, these would all be outstanding features. STeVe -- I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy

Georg Brandl <g.brandl@gmx.net> wrote:
I'm generally a curmudgeon when it comes to 'the docs could be done better'. But this? I like it. A lot. Especially if you can get these other features in:
I'm a bit iffy on yet another tool, but if roundup integration could be done, I think it would be great. - Josiah

On Sat, May 19, 2007 at 10:48:29AM -0700, Josiah Carlson wrote:
Seconded! -- even if it's just for modules, this would be great. I can't count the times I've wished I could type e.g., 'docs.python.org/httplib' the way I can type 'php.net/array_search' to try to find out whether the needle comes before or after the haystack. Dustin
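A quick-dispatch lookup of this kind, combined with a fuzzy fallback for misspelled names, could be sketched as follows. The index contents, the target paths, and the `dispatch` function are all hypothetical; only `difflib.get_close_matches` is a real standard-library call.

```python
import difflib

# Hypothetical index from documented names to page locations.
INDEX = {
    "httplib": "/modules/httplib",
    "os.path.exists": "/modules/os.path#os.path.exists",
    "os.path.split": "/modules/os.path#os.path.split",
}

def dispatch(query, index=INDEX, max_suggestions=3):
    """Return ("redirect", location) for an exact hit, else ("suggest", names)."""
    if query in index:
        return ("redirect", index[query])
    close = difflib.get_close_matches(query, list(index),
                                      n=max_suggestions, cutoff=0.6)
    return ("suggest", close)

print(dispatch("httplib"))
print(dispatch("os.paht.exists"))
```

An exact name redirects straight to the entry, while a typo produces a short list of candidates instead of a bare error page.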

On 5/19/07, Georg Brandl <g.brandl@gmx.net> wrote:
[snip]
Waiting for comments!
Awesome, Georg! Wow. Nice work. Seems like this has been a long time comin', and I bet others have been working away "in secret" on similar projects. I hope you keep running with it until it gets hijacked into being the "official" versions. :) I'm bookmarking it as "python docs" in my browser. BTW, would like to see a little blurb of your own on that page about how the docs were converted, rendered, and their new source format. Thanks much, ---John P.S. -- funny sig, btw. :)

Georg Brandl wrote:
Awesome, that looks pretty amazing! I'd reeeally like to have a look at the source code (don't worry if it's not clean!). Can you publish or post it somewhere? If you'd like to store it in the Docutils sandboxes, just drop me a line and I'll give you SVN access. By the way, things get a lot easier for me if you place it in the public domain, because that's the license Docutils uses, and it's obviously compatible to every other license. I actually have a Google Summer of Code project, "Documenting Python Packages with Docutils", which I'll start working on May 28: <http://code.google.com/soc/psf/appinfo.html?csaid=8D04C53750906F50>. It has a somewhat different scope, so our projects will complement each other nicely I believe. (To the point where we may end up with a complete tool-chain to actually migrate the Python documentation to reST. Très cool.) Your effort and mine only seem to have some limited overlap. I see that you added at least some markup to reST that allows documents to be marked up in a similar fashion as the current Python-specific LaTeX markup, which is on my list, too. If you see more overlap, please let me know, because I may need to adjust my time-line or project-scope (which is totally fine with me, by the way, so don't worry about "getting into the way of my project" or so!). May I suggest we continue the discussion on Doc-SIG only and prune Python-dev from the CC? I'm on Jabber/GoogleTalk (LeWiemann@gmail.com), by the way, so feel free to IM me. Best wishes, Lea [Rest of the quoted message below.]

Georg Brandl wrote:
Wow, very nice. I like it.. +1

I've been working on improving pydoc. (slowly but steadily) Maybe we can join efforts as I think the two projects overlap in a number of areas, and it sounds like we are thinking of some of the same things as far as the tool chain. So maybe there's some synergy we can take advantage of.

Some of the things I've recently considered adding to pydoc:

- To output individual sections for use in a template engine.
- A reST formatter.
- Use docutils to format reST doc strings to html in the html formatter. (as an option, not the default.)

It looks like there may be a few areas where we can share code.

- The html syntax highlighters. (Pydoc can use those)
- A shared html style sheet.
- Document locater. [1]
- An HTMLServer for local (dynamic dispatching) html viewing. [2]
- The reST source for viewing topics and keywords in pydoc. (Instead of scraping html pages. Ick)

(1.) Pydoc has a locater function which finds the html docs and presents a link to the html page for an individual item. But it would be more reliable if the dispatcher were on the document end. Then pydoc would have a single place to link to. (Either locally or on line.)

(2.) The server in pydoc will probably work as is. You just need to supply a callback to get the pages. It's a separate module now.

Cheers, Ron

On Sat, May 19, 2007 at 03:31:59PM -0500, Ron Adam wrote:
- The html syntax highlighters. (Pydoc can use those)
I have a patch on the docutils patch tracker that does this. Code is probably of a rather bad quality, but it outputs LaTeX and HTML. If we can work together to improve this patch and get it in docutils it will avoid having different syntaxes and behavior depending on the front-end to docutils being used (I am thinking of rest2web, trac, and I am probably forgetting some others). The patch has been sitting there for almost 6 months without review, but I have hope that if people other than me work on it and ask for review it will both improve and get reviewed, and eventually get in! Sorry for the shameless plug, but I really do think we need a unifying approach to this. Gaël

On 5/19/07, Georg Brandl <g.brandl@gmx.net> wrote:
I really want this!

From a doc writer's perspective I find this reST approach much easier to grapple with than the LaTeX one, since I find reST markup nicer for simple things like lists and bolding. From a committer's POV I like this as it should hopefully get more people to help with changes and make it easy for me to build the docs locally to make sure the markup is correct. And from a lazy coder's POV I love it as Georg has already done all the work (and in Python so if I really have to change something I have a better chance of figuring out how). -Brett

On Sat, 19 May 2007, Georg Brandl wrote:
For the impatient: the result can be seen at <http://pydoc.gbrandl.de>.
This is extremely impressive. Thank you for this work! If all the documentation is generated from a base format that is closer to text (reST instead of LaTeX), that will make it easier for volunteers to read diffs, make edits, and contribute patches. I agree that interactivity (online commenting and editing) will be a huge advantage. I could imagine this heading in a Wiki-like direction -- where a particular version is tagged as the official revision shipped with a particular Python release, but everyone can make edits online to yield new versions, eventually yielding the revision that will be released with the next Python release. -- ?!ng

Georg Brandl wrote:
Very impressive. I should say that although in the past I have argued strongly against the use of reST as a markup language for source-code comments (because the reST language only indicates presentation, not semantics), I am 100% supportive of the use of reST in reference documents such as these, especially considering that LaTeX is also a presentational markup (at least, that's the way it tends to be used.) I know that for myself, LaTeX has been a barrier to contributing to the Python documentation, and reST would be much less of a barrier. In fact, I have considered in the past asking whether or not the Python documentation could be migrated to a format with wider fluency, but I never actually posted on this issue because I was afraid that the answer would be that it's too hard / too late to do anything about it. I am glad to have been proven wrong. -- Talin

Hi Georg

Super impressive work! :-) I haven't looked at it in depth yet, but I have a question. One concern from a long thread on Doc-Sig a long time ago is that ReST did not at the time possess the ability to nicely mark up the objects as LaTeX macros do. Is your transformation losing markup information from the original docs? E.g. are you still marking classes as classes and functions as functions in the ReST source, or is it converting from qualified markup to "style" markup (e.g., to generic literals instead of class/function/variable/keyword argument docutils roles, etc.)?

If you solved that problem, how did you solve it? Is the resulting ReST pretty? Do you think we can build a better index?

My beef with using ReST for documentation, as much as I like ReST, is that unless we have roles and structure for declaring functions, classes, etc. it would remain inferior to the LaTeX macros, which in spite of being LaTeX, qualify the kinds of objects to some extent.

Wow, it looks amazingly good. Amazing work. Very impressed.

(Somewhat related, but another idea from back then, which was never implemented IMO, was to find a way to automatically pull and convert the docstrings from the source code into the documentation, thus unifying all the information in one place.)

On 5/19/07, Martin Blais <blais@furius.ca> wrote:
Looking at http://pydoc.gbrandl.de/modules/collections.txt, I can see it has markup like::

    .. class:: deque([iterable])

       Returns a new deque object initialized left-to-right (using
       :meth:`append()`) with data from `iterable`. If `iterable` is
       not specified, the new deque is empty.

    .. method:: deque.append(x)

       Add `x` to the right side of the deque.

So he's clearly got some of the info in there with things like ``.. class::`` and ``:meth:``.

STeVe -- I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy

Martin Blais wrote:
e.g. are you still marking classes as classes and functions as functions in the ReST source
It seems so (modulo XXX's and TODO's in Georg's implementation, probably ^_^) -- all of the pages have "show source" links, so you can see for yourself. I'm not an expert with the documentation system, but the markup on <http://pydoc.gbrandl.de/modules/codecs.txt> looks pretty complete to me.
While it's probably not possible to simply generate the documentation from the docstrings, it would certainly be interesting to have some means (like a directive) to pull docstrings into the documentation. I think, however, that while migrating the docs to reStructuredText is comparatively straightforward [1]_, pulling documentation from the docstrings will require quite a bit of design and discussion work. So I'd suggest we postpone this idea until we have a working documentation system in reStructuredText, so we don't clutter the discussion.

.. [1] I'm sure there will still be quite a few issues to sort out that I'm simply not seeing right now.

Best wishes, Lea
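To give a feel for what "pulling docstrings into the documentation" might involve, here is a minimal sketch using only the standard `inspect` module. It is not part of Georg's toolset; `docstring_to_rest` and the example `split` function are hypothetical names, and the ``.. function::`` output format is assumed from the converted pages.

```python
import inspect
import textwrap


def docstring_to_rest(obj, directive="function"):
    """Render obj's signature and docstring as a reST directive block."""
    name = obj.__name__
    try:
        signature = str(inspect.signature(obj))
    except (TypeError, ValueError):
        # Some builtins expose no introspectable signature.
        signature = "(...)"
    doc = inspect.getdoc(obj) or "XXX: no docstring found."
    # Directive content is indented under the directive line.
    body = textwrap.indent(doc, "   ")
    return f".. {directive}:: {name}{signature}\n\n{body}\n"


def split(path, sep="/"):
    """Split *path* into a ``(head, tail)`` pair at the last *sep*."""
    head, _, tail = path.rpartition(sep)
    return head, tail


print(docstring_to_rest(split))
```

Even this toy version shows where the design work lies: the docstring would have to be valid reST already, which is exactly the consistency question raised in this thread.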

Georg Brandl <g.brandl <at> gmx.net> writes:
For the impatient: the result can be seen at <http://pydoc.gbrandl.de>.
- the toolchain is pure Python, therefore can easily be shipped
Very nice! As well as looking very attractive and professional, the all-Python toolset should make it easier to build the documentation - I've not been able to get a trouble-free setup of the docs toolchain on Windows. Thanks for this, Vinay Sajip

Vinay Sajip schrieb:
BTW, I have to give lots of credit for the looks to Armin Ronacher. I'm not so much of a designer ;) Georg

[warning: bulk answer ahead] First, thanks for all those nice comments! [John Gabriele]
BTW, would like to see a little blurb of your own on that page about how the docs were converted, rendered, and their new source format.
Sure. I've already written part of the new "Documenting Python" docs, which cover this a bit. The "About the documentation" will be rewritten too. [Lea Wiemann]
The toolset is now in the Docutils sandbox at <http://svn.berlios.de/svnroot/repos/docutils/trunk/sandbox/py-rest-doc>.
Great! Making the new toolset usable for third-party developers is certainly a good option. I saw quite a few using the LaTeX-based tools too.. [Ron Adam]
Certainly there's plenty of overlap.
It looks like there may be a few areas where we can share code.
- The html syntax highlighters. (Pydoc can use those)
The highlighting is actually done with Pygments, which cannot be included in the stdlib as-is. Perhaps a stripped-down version?
Yes, that makes sense. If you want to coordinate efforts, feel free to contact me at Jabber <gbrandl@pocoo.org>. [Ka-Ping Yee]
I agree that interactivity (online commenting and editing) will be a huge advantage.
Yes. I think that only the latest version should ever be "publicly" interactive. Old archived doc versions should be static only. [Martin Blais]
As Steven said, it's solved quite nicely with interpreted text roles. Whether ":class:`Foo`" is nicer than "\class{Foo}" is not entirely clear ;) but you actually get more now, since if a class "Foo" is found in the namespace, it will be cross-linked automatically. About the index: I didn't do anything about it. I just transferred the LaTeX commands into reST directives; the rest is generated completely analogously.
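The automatic cross-linking can be pictured roughly as follows. This is only an illustrative sketch in plain Python, not the builder's actual code: the `KNOWN_TARGETS` table, the URLs in it and the `crosslink` function are all made up, and the real system works on the docutils document tree rather than on raw text.

```python
import html
import re

# Hypothetical index of documented objects: (role, name) -> target URL.
KNOWN_TARGETS = {
    ("class", "deque"): "modules/collections.html#deque",
    ("meth", "deque.append"): "modules/collections.html#deque.append",
}

# Matches interpreted text with an explicit role, e.g. :class:`Foo`.
ROLE_RE = re.compile(r":(?P<role>[a-z]+):`(?P<name>[^`]+)`")


def crosslink(text, targets=KNOWN_TARGETS):
    """Replace :role:`name` with a link when the target is known."""

    def render(match):
        role, name = match.group("role"), match.group("name")
        literal = f"<tt>{html.escape(name)}</tt>"
        # Allow ":meth:`append()`" to find an entry stored without parentheses.
        url = targets.get((role, name.rstrip("()")))
        if url is None:
            # Unknown names still get the semantic styling, just no link.
            return literal
        return f'<a href="{html.escape(url)}">{literal}</a>'

    return ROLE_RE.sub(render, text)


print(crosslink("See :class:`deque` and :class:`Foo`."))
```

The point of the role is visible in the fallback branch: a name that is not (yet) documented degrades to plain literal text instead of a broken link.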
Yep. As it is now, you need three packages from the Cheese Shop: Docutils, Pygments (the highlighter) and Jinja (the templating engine). This shouldn't be problematic, though they could also be stripped down and included. Cheers, Georg

On 5/20/07, Georg Brandl <g.brandl@gmx.net> wrote:
This is great. IMHO if this is to compete to become the official Python docs, I would argue for even fewer dependencies, even at the cost of more generic/bland output, for portability reasons and to stimulate greater adoption. If we can make some of those dependencies optional and only rely on docutils, that could make it ubiquitous.

Another thing to keep in mind: I don't know if the directives you defined are very generic, but if they are, it would be interesting to consider migrating them up into docutils (if it makes sense), and see if they could support documenting other programming languages. Could this be a language-independent documenting toolkit? Could we document LISP or Ruby code with it?

Georg, thanks again!

Could this be a language-independent documenting toolkit? Could we document LISP or Ruby code with it?
Might want to look at "noweb", http://www.eecs.harvard.edu/~nr/noweb/: ``...noweb works ``out of the box'' with any programming language, and supports TeX, latex, HTML, and troff back ends.'' Bill

[Georg Brandl]
The highlighting is actually done with Pygments, which cannot be included in the stdlib as-is. Perhaps a stripped-down version?
No need to; we can just fall back to no syntax highlighting if Pygments is not installed on the user's system. [Gael Varoquaux]
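The fallback described here is the usual optional-import pattern. A minimal sketch, assuming Pygments' public `highlight`, `PythonLexer` and `HtmlFormatter` names; `render_code` is a hypothetical helper, not a function from the toolset:

```python
import html

try:
    from pygments import highlight
    from pygments.formatters import HtmlFormatter
    from pygments.lexers import PythonLexer
except ImportError:
    # Pygments is optional: remember that it is missing and degrade gracefully.
    highlight = None


def render_code(source):
    """Return HTML for a Python snippet, highlighted if Pygments is available."""
    if highlight is None:
        # Plain, escaped, unhighlighted output.
        return f"<pre>{html.escape(source)}</pre>"
    return highlight(source, PythonLexer(), HtmlFormatter())


print(render_code("if x < 1:\n    print(x)"))
```

Either way the caller gets a usable `<pre>` block, so the rest of the build does not need to know whether highlighting happened.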
- The html syntax highlighters. (Pydoc can use those)
I have a patch on the docutils patch tracker that does this.
For everyone's reference, <http://sf.net/tracker/index.php?func=detail&aid=1595345&group_id=38414&atid=422032>. Best wishes, Lea

Sounds very interesting. I just have one concern/question. I hope that while moving away from LaTeX, we are not precluding the ability to write math as part of the documentation. What would be my choices for adding math to the documentation? Hopefully using LaTeX, since there really isn't AFAIK any other competitor for this.

Neal Becker wrote:
Where in the current documentation is there any math notation /at all/? In all my reading of it, I have not run across anything that appeared like it was being used. Besides that question, is the full power of LaTeX math notation really necessary here? I somehow doubt anything more than simple expressions of runtime performance and container behaviors are appropriate for any documentation we have. -Scott -- Scott Dial scott@scottdial.com scodial@cs.indiana.edu

Scott Dial schrieb:
There is exactly one instance of LaTeX math in the whole docs; it's in the description of audioop, AFAIR, and contains a sum over square roots... So, that's not really a concern of mine ;) Georg

Neal Becker wrote:
I don't think so. The issue with numpy is getting our act together and making parseable docstrings for auto-generated API documentation using existing tools or slightly modified versions thereof. No one is actually contemplating building a new tool. AFAICT, Georg's (excellent) work doesn't address that use. I don't think there is anything to coordinate, here. Provided that Georg's system doesn't place too many restrictions on the reST it handles, we could use the available reST math options if we wanted to use Georg's system. I'd much rather see Georg spend his time working on the docs for the Python language and the feature set it requires. If the numpy project needs to extend that feature set, we'll provide the manpower ourselves. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

Robert Kern schrieb:
Indeed, I don't intend to do anything about docstrings. IMO, docs automatically generated from docstrings can work, but only if there's a single consistent style applied, and the whole thing is written in a markup language, of course, not text only. This is not the case for the Python standard library, so converting it is not an option; in any case, putting all information that is available in the docs into the docstrings would make many modules much less readable.
Of course, for numpy math is much more of importance than for the core. I'm sure the docutils developers will be supportive in case someone volunteers to create/improve reST math capabilities. cheers, Georg

>>> What would be my choices for adding math to the documentation?

>> Where in the current documentation is there any math notation /at
>> all/?

Georg> There is exactly one instance of LaTeX math in the whole docs,
Georg> it's in the description of audioop, AFAIR, and contains a sum over
Georg> square roots...

Georg> So, that's not really a concern of mine ;)

You must realize that people will use the core tools to create documentation for third party packages which aren't in the core. If you replace LaTeX with something else I think you need to keep math in mind whether it's used in the core documentation or not.

Skip

I disagree. The documentation infrastructure of Python should only consider the needs of Python itself. If other people can use that infrastructure for other purposes, fine - if they find that it does not meet their needs, they have to look elsewhere. We are developing a programming language here, not a typesetting system. Regards, Martin

On 5/20/07, "Martin v. Löwis" <martin@v.loewis.de> wrote:
Martin beat me to my comment. =) Python's needs should come first, period. If Georg wants to add math support, fine. But honestly I would rather he spend his time on Python-specific stuff than get bogged down to support possible third parties. -Brett

I agree that the Python core should be considered first, but I also see it as only beneficial to leave the door open to the needs of other packages. I worked (briefly but intensely) on a revamp of pydoc earlier this year, and while collecting requirements from a number of places, support for maths expressions and the like appeared important in a number of cases (and was a very reasonable request in one case).

That particular point leads to something that I see as important for what a new/better documentation system should provide: a good and modular interface to access the documentation, process it, and navigate it. For the particular example discussed here, it could be implemented by allowing "pluggable" processing components for docstrings (giving a package developer the possibility to use as much of the default documentation machinery as possible while implementing the processing of MathML, LaTeX, or whatever, as wanted). One can consider having the "custom" processing of the docstring embedded in the package itself.

Laurent

2007/5/21, Brett Cannon <brett@python.org>:
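One possible shape for such "pluggable" processing components is a small registry keyed by format name. The sketch below is purely illustrative: `register`, `render_docstring` and the `"tex-math"` format name are hypothetical and do not correspond to any existing pydoc or docutils interface.

```python
import html

# Maps a format name to a callable that turns a docstring into HTML.
PROCESSORS = {}


def register(name):
    """Class-free plugin hook: decorate a function to register it by name."""

    def decorator(func):
        PROCESSORS[name] = func
        return func

    return decorator


@register("plaintext")
def render_plaintext(docstring):
    """Default machinery: escaped, preformatted text."""
    return f"<pre>{html.escape(docstring)}</pre>"


def render_docstring(docstring, docformat="plaintext"):
    """Dispatch to the registered processor, falling back to plain text."""
    processor = PROCESSORS.get(docformat, render_plaintext)
    return processor(docstring)


# A package needing math could ship and register its own processor.
@register("tex-math")
def render_tex_math(docstring):
    return f'<span class="math">{html.escape(docstring)}</span>'


print(render_docstring("a < b"))
print(render_docstring("x^2", docformat="tex-math"))
```

With this layout the core only has to maintain the default processor, while third parties carry the cost of whatever extra formats they need.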

Brett> Martin beat me to my comment. =) Python's needs should come
Brett> first, period. If Georg wants to add math support, fine. But
Brett> honestly I would rather he spend his time on Python-specific
Brett> stuff than get bogged down to support possible third parties.

I think the people who have responded to my comment read too much into it. Nowhere do I think I asked Georg to write an equation typesetter to include in the Python documentation toolchain. I asked that math capability be considered. I have no idea what tools he used to build his new documentation set. I only briefly glanced at a couple of the output pages. I think what he has done is marvelous. However, I don't think the door should be shut on equation display. Is there a route to it based on the tools Georg is using? If not, then I think some accommodation should be made. I'm being vague here on purpose because I'm unfamiliar with the available tools. The one thing I do know is that LaTeX provides that today and by removing it from the toolchain you have removed a significant piece of functionality.

Skip

skip@pobox.com schrieb:
And that is reasonable, of course.
In the end, it all depends on what kind of support basic reST can deliver. IMO, you still get the best math output from LaTeX, but I don't really know many other things. That is also something I want to convey: I'm very fond of LaTeX, and use it regularly for all my University work. For the Python docs, however, I can see many advantages of the docutils approach.
That's the point I see differently: for the Python core docs, it's not significant, and my efforts are primarily limited to that area. cheers, Georg

>> You must realize that people will use the core tools to create
>> documentation for third party packages which aren't in the core. If
>> you replace LaTeX with something else I think you need to keep math
>> in mind whether it's used in the core documentation or not.

Martin> I disagree. The documentation infrastructure of Python should
Martin> only consider the needs of Python itself. If other people can
Martin> use that infrastructure for other purposes, fine - if they find
Martin> that it does not meet their needs, they have to look elsewhere.

Then I submit that you are probably removing some significant piece of functionality from the provided documentation toolchain which some people probably rely on. After all, that's what LaTeX excels at. They will be able to continue to use the old tools, but where will they get them if they are no longer part of Python?

Skip

Fred L. Drake, Jr. schrieb:
That is a good idea! The converter is not likely to work with other projects out of the box (it's been finetuned for the Doc/ sources), and it's not clear whether they would want to switch in any case. Many of the features that the new system would be able to provide aren't needed for them anyway, and I, as a maintainer, would be very reluctant to put extra work in that too... cheers, Georg

On Mon, May 21, 2007 at 09:23:47AM -0400, Fred L. Drake, Jr. wrote:
That seems like a straightforward task. The big stumbling block in switching away from LaTeX has always been the effort of making a good conversion; if Georg's work does 80% of the job, we should definitely take advantage of the opportunity and try to switch.

Advantages of reST:

* The required tool chain shrinks (at least if you're not making printed output, which will probably still go through LaTeX).
* Tool chain is now more easily scriptable, and it'll be easier to make use of the docs from Python code.
* We can produce XML output through the rst2xml script.

Disadvantages:

* reST markup isn't much simpler than LaTeX.

--amk

Hoi, Fred L. Drake, Jr. <fdrake <at> acm.org> writes:
For a lightweight markup language that is human readable (which rst certainly is) the syntax is surprisingly powerful. You can nest any block tag and I'm not sure how often you have to nest roles and stuff like that. The goal of the new docs is a less complex syntax and currently nothing beats reStructuredText in terms of readability and possibilities. rst is simpler than latex:

LaTeX::

    \item[\code{*?}, \code{+?}, \code{??}]
    The \character{*}, \character{+}, and \character{?} qualifiers are
    all \dfn{greedy}; they match as much text as possible. Sometimes
    this behaviour isn't desired; if the RE \regexp{<.*>} is matched
    against \code{'<H1>title</H1>'}, it will match the entire string,
    and not just \code{'<H1>'}. Adding \character{?} after the qualifier
    makes it perform the match in \dfn{non-greedy} or \dfn{minimal}
    fashion; as \emph{few} characters as possible will be matched.
    Using \regexp{.*?} in the previous expression will match only
    \code{'<H1>'}.

Here is the same in rst::

    ``*?``, ``+?``, ``??``
       The ``'\*'``, ``'+'``, and ``'?'`` qualifiers are all
       :dfn:`greedy`; they match as much text as possible. Sometimes
       this behaviour isn't desired; if the RE :regexp:`<.\*>` is matched
       against ``'<H1>title</H1>'``, it will match the entire string, and
       not just ``'<H1>'``. Adding ``'?'`` after the qualifier makes it
       perform the match in :dfn:`non-greedy` or :dfn:`minimal` fashion;
       as *few* characters as possible will be matched. Using
       :regexp:`.\*?` in the previous expression will match only
       ``'<H1>'``.

Regards, Armin

Armin Ronacher writes:
IMO that pair of examples shows clearly that, in this application, reST is not an improvement over LaTeX in terms of readability/ writability of source. It's probably not worse, although I can't help muttering "EIBTI". In particular I find the "``'...'``" construct horribly unreadable because it makes it hard to find the Python syntax in all the reST. I don't think that's an argument against switching to reST, though. Georg's site speaks for itself. Kudos!

In your examples, I think the ReST version can be cleaned up quite a bit. First by using the ``.. default-role:: literal`` directive so that you can type `foo()` instead of using double back quotes, and then you can remove the redundant semantic markup. Like this::

    `\*?`, `+?`, `??`
       The "`*`", "`+`" and "`?`" qualifiers are all *greedy*; they match
       as much text as possible. Sometimes this behaviour isn't desired;
       if the RE `<.*>` is matched against `'<H1>title</H1>'`, it will
       match the entire string, and not just `'<H1>'`. Adding "`?`" after
       the qualifier makes it perform the match in *non-greedy* or
       *minimal* fashion; as *few* characters as possible will be
       matched. Using `.*?` in the previous expression will match only
       `'<H1>'`.

The above is the most readable version. For example, semantic markup like :regexp:`<.\*>` doesn't serve any useful purpose. The end result is that the text is typeset in a fixed-width font, no matter if you prepend :regexp: to it or not.

--
mvh Björn

It would appear that while we slept Jens Mortensen was busy at work on his rst2{latex,latexmath,mathml}.py scripts: http://docutils.sourceforge.net/sandbox/jensj/latex_math/ Note the date on the files. It seems to work pretty well, and as others have pointed out, LaTeX notation is probably more familiar to people interested in math display than anything else. Skip

Neal> I know almost nothing about docutils internals. How do I
Neal> 'install' the above?

Me either. Here's what I did:

* download and expand the latest docutils snapshot
* replicate Jens's work in a directory called "math" at the top level of the docutils directory
* edited setup.py to get them installed::

      diff -u setup.py.~1~ setup.py
      --- setup.py.~1~ 2007-03-21 18:45:38.000000000 -0500
      +++ setup.py 2007-05-22 07:07:25.000000000 -0500
      @@ -115,6 +115,9 @@
       'tools/rst2xml.py',
       'tools/rst2pseudoxml.py',
       'tools/rstpep2html.py',
      + 'math/tools/rst2latex.py',
      + 'math/tools/rst2latexmath.py',
      + 'math/tools/rst2mathml.py',
       ],}
       """Distutils setup parameters."""

* ran "python setup.py install"

That's probably more than necessary, but with the math subdir I can easily move the whole thing to a new snapshot and the setup.py change lets me install them transparently.

Skip

Stephen J. Turnbull wrote:
It's interesting how perceptions can differ - I find heavily marked up latex tends to blur into a huge wall of text when I try to read it because of all of the {} and \ characters everywhere. With reST, on the other hand, I find the use of the relatively 'light' backquote and colon characters to delineate the markup breaks things up sufficiently that I can easily ignore the markup in order to read what has actually been written. So in Armin's example, I found the reST version *much* easier to read. Whether that difference in perception is due simply to my relative lack of experience in using LaTeX, or to something else, I have no idea. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org

On 5/22/07, Nick Coghlan <ncoghlan@gmail.com> wrote:
- If you make a mistake in LaTeX, you will get a cryptic error which is usually a little difficult to figure out (if you're not used to it). You do get an error, though.
- If you make a mistake in ReST, you will often get neither a warning nor an error, and not the desired output.

If you were to use the amount of markup in that example, you would have to check your text with rst2xml frequently to make sure it groks what you're trying to say. (And I've been there: I wrote an entire project that relies specifically on this, on precise structures generated by docutils (http://furius.ca/nabu/). It's *very* easy to make subtle errors that generate something other than what you want.)

ReST works well only when there is little markup. Writing code documentation generally requires a lot of markup: you want to mark up variables, classes, functions, parameters, constants, etc.

(A better avenue IMHO would be to augment docutils with some code to automatically figure out the syntax of functions, parameters, classes, etc., i.e., less markup, and if we do this in Python we may be able to use introspection. This is a challenge, however; I don't know if it can be done at all.)
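As an illustration of the kind of mistake that slips through silently, here is a rough sketch of a line-based check for unpaired backquotes. It is a hypothetical helper (`stray_backquotes` is not a docutils tool), it knows nothing about real reST parsing, and it will misreport markup that legitimately spans several lines.

```python
import re

# Inline literals (``...``) are removed first, then roles and
# interpreted text (`...`, optionally followed by _ or __).
LITERAL_RE = re.compile(r"``.+?``")
INTERPRETED_RE = re.compile(r"(:[a-z]+:)?`[^`]+`_{0,2}")


def stray_backquotes(text):
    """Yield (line_number, line) for lines with unpaired backquotes."""
    for number, line in enumerate(text.splitlines(), start=1):
        remainder = LITERAL_RE.sub("", line)
        remainder = INTERPRETED_RE.sub("", remainder)
        # Anything left over was opened but never closed (or vice versa).
        if "`" in remainder:
            yield number, line


sample = (
    "Add `x` to the deque.\n"
    "Uses :meth:`append() to add.\n"
    "``'<H1>'`` is literal."
)
for number, line in stray_backquotes(sample):
    print(f"line {number}: {line}")
```

A check like this only catches the crudest slips; comparing the rst2xml output against what was intended remains the more thorough test.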

Martin Blais schrieb:
That is correct, but can be helped with nice preview features.
While writing the converter, I stumbled upon a few locations where the LaTeX markup cannot be completely converted into reST, and a few locations where invalid reST was generated and not warned about. However, both of those problems occurred far less often than I'd anticipated. Georg

FWIW, the pure Python program in Tools/scripts/texchecker.py does a pretty good job of catching typical LaTeX mistakes and giving high-quality error reporting. With that tool, I've been making doc contributions for years and not needed my own LaTeX build. Also, I did not need to learn LaTeX itself. It was sufficient to read a little of Documenting Python and then model the markup from existing docs.

In contrast, whenever I've tried to build a complex ReST document, it was *always* a struggle. Copying from existing docs doesn't help much there because the cues are more subtle. As Martin pointed out, most errors slide by because the mis-markup is typically read as valid, unmarked-up text. I find myself having to continuously build and view an HTML file as I write. I like ReST for light-weight work but think it is not ready for prime-time with respect to more complex requirements.

Fred is also correct in that we don't seem to have people rushing to contribute docs (more than a line or two). For a long time, we've always said that it is okay to submit plain text doc contributions and that another person downstream would do the mark-up. We've had few takers.

Raymond

Raymond Hettinger schrieb:
I'm not saying that LaTeX is hard for most of us, I say that it is *perceived* to be hard by many. And as soon as you dig into the deep support infrastructure, it gets very confusing. Just look at this bug fix I made some time ago: http://mail.python.org/pipermail/python-checkins/2007-April/059637.html
Also, I did not need to learn LaTeX itself. It was sufficient to read a little of Documenting Python and then model the markup from existing docs.
ISTM that this is possible with the new markup too. I wrote a great part of the new "Documenting Python" document, and after reading that one should be prepared enough to write reST just as well.
I can't see many differences. If I can translate a \begin{classdesc}{...} environment, I can also translate a ".. class::" directive for my new item.
Are the docs really that complex? I mean, look at the typical source of a converted page. The most common things are "information units", i.e. .. class:: directives, code snippets and plain old text. Cross-references work flawlessly. You may also ask Thomas Heller about documenting Python modules in reST. AFAIR the ctypes docs were written with it and converted to LaTeX afterwards.
Sometimes it's the way you present the ability to change things that affects how many people actually do it. Finding the location that tells you how to suggest changes, and opening a new bug in the infamous SF tracker is not really something people do happily. A "click here to suggest a change" link that leads to a pseudo-edit-form, complete with preview facility, might prove more effective. cheers, Georg

On Tue, May 22, 2007 at 06:13:36PM +0200, Georg Brandl wrote:
Indeed. I know my instinctive reaction to the Python docs is "oh, this is not something which the public can contribute to". Something more like the PHP-style "public annotations" might be good - with an appropriate moderation / voting system on the annotations it could possibly be very good.

On 5/22/07, Martin Blais <blais@furius.ca> wrote:
Just to follow up on that idea: I don't think it would be very difficult to write a very small modification to docutils that interprets the default role with more "smarts". For example, you can all guess what the types of these are about::

    `class Foo`              (this is a class Foo)
    `bar(a, b, c) -> str`    (this is a function "bar" which returns a string)
    `name (attribute)`       (this is an attribute)

...so why couldn't the computer solve that problem for you? I'm sure we could make it happen. Essentially, what is missing from ReST is "less markup for documenting programs". By restricting the problem-set to Python programs, we can go a long way towards making much of this automatic, even without resorting to introspecting the source code that is being documented.
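The guessing step could be sketched along these lines. This is only a rough illustration of the idea, not a docutils modification: `guess_role` and the three patterns are hypothetical, and a real implementation would hook into the default role handler instead of working on bare strings.

```python
import re

# One pattern per kind of object, matching the three examples above.
CLASS_RE = re.compile(r"class\s+(?P<name>[A-Za-z_][\w.]*)$")
FUNCTION_RE = re.compile(
    r"(?P<name>[A-Za-z_][\w.]*)\((?P<args>[^)]*)\)(?:\s*->\s*(?P<returns>.+))?$"
)
ATTRIBUTE_RE = re.compile(r"(?P<name>[A-Za-z_][\w.]*)\s+\(attribute\)$")


def guess_role(text):
    """Guess what kind of object a default-role string describes."""
    text = text.strip()
    for kind, pattern in (
        ("class", CLASS_RE),
        ("function", FUNCTION_RE),
        ("attribute", ATTRIBUTE_RE),
    ):
        match = pattern.match(text)
        if match:
            return kind, match.group("name")
    # Nothing recognised: treat the text as a plain literal.
    return "literal", text


for example in ("class Foo", "bar(a, b, c) -> str", "name (attribute)", "x + 1"):
    print(example, "=>", guess_role(example))
```

The literal fallback matters: anything the patterns do not recognise keeps today's behaviour, so the extra "smarts" would never make existing text worse.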

Martin Blais wrote:
I was going to suggest something similar. Ideally, any markup language ought to have a kind of "Huffman Coding" of complexity - in other words, the markup symbols that are used the most frequently are the ones that should be the shortest and easiest to type.

Just as in real Huffman Coding, the popularity of a given element is going to depend on context. This would imply that there should be customizations of the markup language for different knowledge domains. While there are some benefits to having a 'standard' markup, any truly universal markup is going to be much heavier and more cumbersome than one that is specialized for the task.

I would advocate a system in which the author inserts minimalistic 'hints' into the text, and the document processor uses those hints along with some automatic reasoning to determine the final markup. As in the above example, the use of backticks can be a signal to the document processor that the enclosed text should be examined for identifiers and other Python syntax.

I would also suggest that one test for evaluating the quality of markup syntax is whether or not it can be learned by example - can a user follow the pattern of some other part of the docs, without having to learn the syntax in a formal way?

-- Talin

Talin wrote:
Does this mean it's time for "pyST" -- Python-structured text?-)

--
Greg Ewing, Computer Science Dept, University of Canterbury, Christchurch, New Zealand
greg.ewing@canterbury.ac.nz
Carpe post meridiem! (I'm not a morning person.)

Greg Ewing wrote:
I wasn't going to say it :)

Now, at the risk of going even further out of the mainstream (actually, there's no risk, it's a dead certainty), if I had been clever enough to think that I could write a LaTeX translator, I probably would have made my target language Docbook or some other flavor of XML.

Now, you might argue that XML is more cumbersome and harder to author than reST, and that is certainly a valid argument. On the other hand, there are a couple of interesting advantages to using XML:

1) You get an instant WYSIWYG preview capability by publishing a standard CSS stylesheet along with the docs. Anyone would be able to see what the output would look like merely by viewing it in a browser. While there would be some document transformations which would not be previewable in CSS (such as breaking the document up into hyperlinked chapters), you would at least be able to see enough to be able to do a decent job of editing the text without having to install any special tools. And some of those more difficult transformations would be doable with a suitable XSLT stylesheet, which can be directly executed in most browsers. (As an example, I once wrote an XSLT stylesheet that converted OpenDocument XML into the equivalent HTML - this was part of my Firefox ODFReader plugin [http://www.alcoholicsunanimous.com/odfreader/], that allowed ODF documents to be directly viewed in the browser without having to launch an external helper application.)

2) There are a few WYSIWYG XML editors out there, which allow you to edit the styled text directly in an editor (although I don't know of any open source ones.)

3) The document processing tool could be very minimal, mostly assembled out of standard modules for processing XML.

4) XML has a well-specified method of escaping into other (XML-based) languages, which is XML namespaces. So for those who want equations in their docs, they could simply insert a block of MathML inside their Docbook XML. Similarly, illustrations could be embedded using bitmap images or SVG as appropriate.

5) Having XML-based docs would make it easy to write other kinds of processors that operate on the docs in different ways, such as building a keyword index or doing various kinds of analysis.

Now, this suggestion of using XML isn't really a serious one. But I think that the various advantages that I have listed ought to be considered when thinking about how the tool chain for Python documentation should operate.

I think that there is a big advantage to making the document processing tools simple and hosted entirely in Python. People who contribute to the docs are likely to know quite a bit about Python, but it is far from certain what else they might know. And tools written in Python are automatically able to run in diverse environments, which may not be the case for tools written in other languages. This means that tools that are in Python are more likely to be used, and further, they are more likely to be improved or specialized to the task by those who use them.

In terms of authoring, the convenience of the markup language is only one factor; a bigger factor, I think, is having a short feedback cycle between edit and test, where 'test' means seeing what your written text would look like in the finished product. The quicker you can make that feedback loop, the more likely people will be to work on the docs. -- Talin

On 5/24/07, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Not before someone writes it. Georg Brandl's awesome ReST based system has the nice property that it actually exists and works. For a great number of reasons it is superior to the existing LaTeX based system, and I hope and think that they are strong enough to replace LaTeX with it. -- mvh Björn

Talin wrote:
What's better here than :class:`Foo` or :attr:`name`? You wouldn't want to put an " (attribute)" after all references to it in your text, so this is just an alternative way to spell roles.
What I could propose is that we could abandon :class:`foo`, :meth:`foo` etc. and just use `foo`. There shouldn't be too many cases where this leads to ambiguous cross-referencing. Variables would just use *var*, since they're not marked up specially anyway.
I think he/she can, given a piece of document that contains most of the needed markup constructs. You'll pretty soon grok that reST uses indentation (and you'll be pleased with it if you like Python, which seems a reasonable assumption ;). You'll also get that :foo:`bar` is the syntax for semantic inline markup. Code examples are always introduced with a "::" (okay, the exact rules are a bit nifty, but very convenient if you know them). What else do you need to know? ".. function::" directives are quite easy to recognize. Georg

On 5/22/07, Armin Ronacher <armin.ronacher@active-4.com> wrote:
About possibilities: I'm sorry but that is simply not true. LaTeX provides the full power of macro expansion, while ReST is a fixed (almost, roles are an exception) syntax which has its own set of problems and ambiguities. It is far from perfect, and definitely does not have the flexibility that LaTeX provides. Some of the syntax cannot be nested (try to combine ** with literals).
rst is simpler than latex:
That, and the ability to already parse it from Python and more easily convert to other formats (one of LaTeX's weaknesses), are the only benefits that I can see to switching away from LaTeX. I have to admit I'm afraid we would be moving to a lesser technology, and the driver for that seems to be people's reluctance to work with the more powerful, more complex tool. Not saying it is invalid (it's about people, in the end), but I still don't see what's the big problem with LaTeX.

On May 22, 2007, at 10:37 AM, Martin Blais wrote:
I'm a fan of LaTeX (and latex, er, oops :) too, but what appeals to me most about moving to reST is that the tool chain simplifies considerably. Even with a nice distro packaging system it can be a PITA to get all the tools you need to build the documentation properly installed. A pure-Python solution, even a lesser one, would be a win if we can still produce top quality online and written documentation from one source. -Barry

On Tuesday 22 May 2007, Barry Warsaw wrote:
The biggest potential wins I see for a new system are:

- more contributions
- platform-independent processing

I remain sceptical on being able to achieve the first, but there is some hope for it. The latter should make things easier for people who are willing to put the work into contribution, which is valuable in its own right. -Fred -- Fred L. Drake, Jr. <fdrake at acm.org>

Fred> The biggest potential wins I see for a new system are: Fred> - more contributions Fred> - platform-independent processing Fred> I remain sceptical on being able to achieve the first, but there Fred> some hope for it. You at least take away a common excuse for lack of contributions. True whiners will just come up with new ones (e.g., "the documentation isn't available in Sanskrit yet" or "the dog ate my changes before I could type them into the computer"). ;-) Skip

skip@pobox.com wrote:
But doesn't *everyone* now know that documentation contributions don't have to be marked up? It's certainly been said enough. Maybe that fact should be more prominent in the documentation? Then we'll just have to worry about getting people to read it ... regards Steve -- Steve Holden +1 571 484 6266 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://del.icio.us/steve.holden

>> You at least take away a common excuse for lack of contributions. >> True whiners will just come up with new ones (e.g., "the >> documentation isn't available in Sanskrit yet" or "the dog ate my >> changes before I could type them into the computer"). ;-) Steve> But doesn't *everyone* now know that documentation contributions Steve> don't have to be marked up? It's certainly been said Steve> enough. Sure, but that doesn't stop the true whiners. ;-) Skip

On Tue, May 22, 2007 at 11:45:04AM -0500, skip@pobox.com wrote:
-> Sure, but that doesn't stop the true whiners. ;-)

Nothing stops the true whiners ;).

I think new and exciting ways of viewing, searching, annotating, linking to/from, and indexing the docs are more important than new formats for (not) writing the docs. For example, this rocks! ::

    http://pydoc.gbrandl.de/search.html?q=os.path&area=default

There have been (good) efforts to wikify the docs in the past. IMO what would make them really work would be to have docs.python.org, the "official" Python docs location, start hosting these efforts. As long as that location is static, I think the majority of users will ignore other efforts.

cheers, --titus

p.s. Are there good directions for installing the toolchain for current docs building anywhere? I've tried once or twice, but despite a lot of LaTeX experience I could never get everything hooked up right.

Titus Brown wrote:
It would be more impressive if the search string returned hits ... regards Steve

Steve Holden wrote:
This is a JavaScript based search, which will only be (optionally) integrated in the offline version. The online version will get a more sophisticated search. We've just finished implementing the quick dispatcher in the repository. A request to http://docs.python.org/os.path would then redirect to the appropriate module page, as would http://docs.python.org/?q=os.path. Something like http://docs.python.org/?q=os.paht (note the misspelling) would lead to a page with close matching results, os.path being the first of them. (The web app is based on wsgiref, with a few wrappers around it...) cheers, Georg
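The dispatch behaviour described here (exact hit redirects, near miss yields a ranked list) can be sketched in a few lines. This is only an illustration, not the code in the repository: the `INDEX` table, its page paths, and the `dispatch` function are hypothetical, and `difflib` from the standard library is used as one possible way to find close matches. The wsgiref layer is left out entirely.

```python
import difflib

# Hypothetical index mapping documented names to page paths.
INDEX = {
    "os": "/library/os.html",
    "os.path": "/library/os.path.html",
    "os.path.split": "/library/os.path.html#os.path.split",
    "re": "/library/re.html",
}


def dispatch(query, index=INDEX, max_suggestions=3):
    """Return ("redirect", url) for an exact hit, else ("suggest", names)."""
    if query in index:
        return "redirect", index[query]
    # Closest names first; an empty list means nothing was similar enough.
    close = difflib.get_close_matches(
        query, list(index), n=max_suggestions, cutoff=0.6
    )
    return "suggest", close


print(dispatch("os.path"))
print(dispatch("os.paht"))
```

With this toy index, the misspelled query ranks os.path ahead of any other suggestion, which matches the behaviour described for the real dispatcher.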

On Tue, 22 May 2007, Steve Holden wrote:
I think the issue is instant gratification. You don't get the satisfaction of seeing your changes unless you're willing to write them in LaTeX, and that's a pretty big barrier -- a lot of what motivates open source volunteers is the sense of fulfillment. (Hence, by the same nature, Wiki-like editing with instant changes visible online will probably greatly increase contributions.) -- ?!ng

We are developing a programming language here, not a typesetting system.
Good point, Martin. Are you implying that the documentation should be kept in LaTeX, a widely-accepted widely-disseminated stable documentation language, which someone else maintains, rather than ReST, which elements of the Python community maintain? Bill

Bill Janssen wrote:
No - I have no particular preference with respect to the markup language. I can personally live with all of them, and I like none of them. I hear that contributors complain about having to use TeX, and I hear other people say that they would be happier if they could use ReST instead of TeX. Making contributors happy is really what the objective should be (if the quality of the typesetting output is adequate - and most people use the HTML output these days, where latex2html may not have adequate quality). That docutils happens to be written in Python should make little difference - it's *not* part of the Python language project, and is just a tool for us, very much like latex and latex2html. Regards, Martin

On May 21, 2007, at 12:28 AM, Martin v. Löwis wrote:
I would take a fairly broad view of the term "for Python" though. Specifically, third party modules and applications written for and in Python should be explicitly supported. I think we'd like to see one (preferred) documentation tool chain for core Python, plus the vast number of third party modules and apps. I don't see any reason why Georg's reST-based system couldn't provide that (80/20 rule perhaps?). I'd point to for example the howto templates, which can be easily used for third party applications. Mailman uses howtos for example. BTW, Georg, excellent work. I'm a big latex fan and long-time user, but I do think that using reST will open the door to a lot more contributions. -Barry

skip@pobox.com wrote:
Perhaps my comment was misunderstood. I have no objection to a new system, and it does not have to be based on latex. I just hope there will be some escape mechanism that allows math. It happens that for math markup, there isn't really anything better (or more familiar) than latex.

Neal> It happens that for math markup, there isn't really anything
Neal> better (or more familiar) than latex.

True enough. There is MathML and its offspring, ASCIIMathML, which are probably worth looking at.

http://www.w3.org/Math/
http://www1.chapman.edu/~jipsen/asciimath.html

I have no idea if either one has backend support for presentation outside the web, but if people are interested in this (probably within the docutils scope) they probably should be considered. ASCIIMathML in particular is probably worth using now even if you can't convert it to any other format. It's about as readable as LaTeX source. Skip

skip@pobox.com wrote:
MathML is the language used for equations in Open Document Format (aka ISO 26300). I don't know what extra typesetting tricks (if any) they wrap around it, though. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org

Please add a link to the PEP index (which is also missing from docs.python.org, though not from python.org/doc/). And consider at least some PEPs as part of the corpus indexed (i.e., those with info not in the regular docs). tjr

Georg Brandl <g.brandl@gmx.net> wrote:
Good idea! Latex is a barrier for contribution to the docs. I imagine most people would be much better at contributing to the docs in reST. (Me included: I learnt latex at university a couple of decades ago and have now forgotten it completely!)
- a "quick-dispatch" function: e.g., docs.python.org/q?os.path.split would redirect you to the matching location.
Being a seasoned unix user, I tend to reach for pydoc as my first stab at finding some documentation rather than (after excavating the mouse from under a pile of paper) use a web browser. If you've ever used pydoc you'll know it reads docstrings and for some modules they are great and for others they are sorely lacking. If pydoc could show all this documentation as well I'd be very happy! Maybe your quick dispatch feature could be added to pydoc too?
It is missing conversion of ``comment'' at the moment as I'm sure you know... You will need to make your conversion perfect before you convince the people who wrote most of that documentation I suspect! -- Nick Craig-Wood <nick@craig-wood.com> -- http://www.craig-wood.com/nick

Nick Craig-Wood wrote:
It is my intention to work together with Ron Adam to make the pydoc <-> documentation integration as seamless as possible.
Sorry, what did you mean?
You will need to make your conversion perfect before you convince the people who wrote most of that documentation I suspect!
It already is as good as it gets, barring a few bugs here and there. Which I'd like to hear about, when you find them! cheers, Georg

Georg Brandl <g.brandl@gmx.net> wrote:
So I'll be able to read the main docs for a module in a terminal without reaching for the web browser (or info)? That would be great! How would pydoc decide which bit of docs it is going to show? If I type "pydoc re" is it going to give me the rather sparse __doc__ from the re module or the nice reST docs? Or maybe both, one after the other? Or will I have to use a flag to dis-ambiguate? If you type "pydoc re" at the moment then it says in it MODULE DOCS http://www.python.org/doc/current/lib/module-re.html which is pretty much useless to me when ssh-ed in to a linux box half way around the world...
``comment'' produces smart quotes in latex if I remember correctly. You probably want to convert it somehow because it looks a bit odd on the web page as it stands. I'm not sure what the reST replacement might be, but converting it just to "comment" would probably be OK. Likewise with `comment' to 'comment'. For an example see the first paragraph here: http://pydoc.gbrandl.de/reference/index.html -- Nick Craig-Wood <nick@craig-wood.com> -- http://www.craig-wood.com/nick

Nick> If you type "pydoc re" at the moment then it says in it Nick> MODULE DOCS Nick> http://www.python.org/doc/current/lib/module-re.html Nick> which is pretty much useless to me when ssh-ed in to a linux box Nick> half way around the world... I get quite a bit of information about re (I've never known /F to be a documentation slouch). Only one bit of that information is a reference to the page in the library reference manual. And if I happen to be ssh'd into a machine halfway round the world through a Gnome terminal I can right mouse over that URL and pop the page up in my default local browser. If you set the PYTHONDOCS environment variable you can point it to a local (or at least different) copy of the libref manual. A flag could be added to pydoc to show that content instead, however being html it probably would be difficult to read unless pumped through lynx -dump or something similar. Skip

On Wed, May 23, 2007 at 05:39:38AM -0500, skip@pobox.com wrote:
Yes it is certainly better than no docs. It doesn't for instance have any regexp info, and I can never remember all the special non matching brackets (e.g. (?:...)) so I have to read the full docs for that.
I take your point. However the unix tradition is that everything is in the man pages. man pages have expanded over the years to include info pages and you *can* read the full python docs via info, it just isn't quite as convenient as pydoc. I think perl had the right idea with perldoc. You can read all the perl documentation whether it is in module documentation (like docstrings) or general documentation (like the latex docs under discussion). I'd like to see pydoc be a viewer for all the python documentation, not just a subset of it.
I'm assuming that we convert all the Python documentation to reST, which would make it easier. -- Nick Craig-Wood <nick@craig-wood.com> -- http://www.craig-wood.com/nick

2007/5/23, Nick Craig-Wood <nick@craig-wood.com>:
One option is to use a text-mode browser (lynx, links, or the likes). The other is to develop a terminal mode application (currently in pydoc, I believe)
I really think that making pydoc a solid library to extract/search/navigate the documentation offers a lot of interesting perspectives. When one thinks beyond the application discussed here, there are a lot of tools (ipython, or any IDE for example) that could make great use of the facility. [note: Ron and I seemed to disagree on what (and how) pydoc should be, and on that in particular, but I keep a keen interest in having such a library.]

Nick Craig-Wood wrote:
Ahh, now the dime has fallen ;) (sorry, German phrase) Yes, it should probably use Unicode equivalents of these quotes, as it does with en- and em-dashes. There are also nifty "post-processor" filters which operate on complete HTML pages and replace normal quotes by "smart" ones, perhaps that is the way to go. Georg
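As a rough illustration of converting the LaTeX-style quotes to their Unicode equivalents, here is a minimal sketch. The function name `smarten_quotes` is hypothetical and this is not the converter's actual code; it also applies the substitution blindly, so it would need to skip code samples and literals in real use.

```python
import re


def smarten_quotes(text):
    """Replace LaTeX-style ``double'' and `single' quotes with Unicode ones."""
    # Handle the doubled form first so the single-quote rule cannot split it.
    text = re.sub(r"``(.+?)''", "\u201c\\1\u201d", text)
    text = re.sub(r"`([^`']+)'", "\u2018\\1\u2019", text)
    return text


print(smarten_quotes("a ``comment'' and a `comment' in running text"))
```

A post-processing filter that works on the finished HTML, as mentioned above, is the other option; this sketch works on the source text instead.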

Georg Brandl wrote:
Ahh, now the dime has fallen ;) (sorry, German phrase)
In English it's "the penny has dropped", so it's not much different. :-) Although I thought dimes were an American thing, and Germans would be more likely to use a different coin. -- Greg Ewing

Nick Craig-Wood wrote:
In fairness to Georg, latex2html also misses the smart quotes. See the same paragraph here: http://docs.python.org/ref/front.html -- Scott Dial scott@scottdial.com scodial@cs.indiana.edu

Nick Craig-Wood wrote:
What latex does here for typeset output is nice, but it's also a bit of a hack job. The ` and ' characters aren't smart, the fonts just have curved glyphs for them. `` and '' are mapped to additional glyphs using ligatures, again part of the font information. The result, of course, is really nice. :-)

Scott Dial wrote:
There's a way to make latex2html do "the right thing" for these, except... it then happily does so even to ` and '' (and `` and '') in code samples, since there's no equivalent to the font information used to handle this in latex. -Fred -- Fred L. Drake, Jr. <fdrake at acm.org>

Nick Craig-Wood wrote:
Pydoc currently gets topic info for some items by scraping the text from document 'local' web pages. This is kind of messy for a couple of reasons.

- The documents may not be installed locally.
- It can be problematic locating the docs even if they are installed.
- They are not formatted well after they are retrieved.

I think this is an area for improvement.

This feature is also limited to a small list where the word being searched for is a keyword, or a very common topic reference, *and* they are not likely to clash with other module, class, or function names.

The introspection help parts of pydoc are completely separate from topic help parts. So replacing this part can be done without much trouble. What the best behavior is and how it should work would need to be discussed.

Keep in mind doc strings are meant to be more of a quick reference to an item, and Pydoc is the interface for that.
If retrieval from the full docs is desired, then it will probably need to be disambiguated in some way or be a separate interface.

    help('re')       # Quick reference on 're'.
    helpdocs('re')   # Full documentation for 're'.

On Wed, May 23, 2007 at 12:46:50PM -0500, Ron Adam wrote:
And it would be improved by converting the docs to reST I imagine.
I think that if reST was an acceptable form for the documentation, and it could be auto included in the main docs from docstrings then you would find more modules completely documented in __doc__.
Actually if it gave both sets of docs quick, then long, one after the other that would suit me fine. -- Nick Craig-Wood <nick@craig-wood.com> -- http://www.craig-wood.com/nick

Nick Craig-Wood wrote:
> And it would be improved by converting the docs to reST I imagine.

Yes, this will need a reST to html converter for displaying in the html browser. DocUtils provides that, but it's not part of the library. (?)

And for console text output, is the unmodified reST suitable, or would it be desired to modify it in some way?

Should a subset of the main documents be included with pydoc to avoid the "documents not available" messages if they are not installed? Or should the topics retrieval code be moved from pydoc to the main document tools so it's installed with the documents? Then that can be maintained with the documents instead of being maintained with pydoc. Then pydoc will just look for it, instead of looking for the html pages.

> I think that if reST was an acceptable form for the documentation, and
> it could be auto included in the main docs from docstrings then you
> would find more modules completely documented in __doc__.

That would be fine for third party modules if they want to do that or if there is not much difference between the two.

> Actually if it gave both sets of docs quick, then long, one after the
> other that would suit me fine.

That may work well for the full documentation, but the quick reference wouldn't be a short quick reference any more.

I'm attempting to have a pydoc api call that gets a single item or sub-item and formats it to a desired output so it can be included in other content. That makes it possible for the full docs (not necessarily Python's) to embed pydoc output if that's desirable. This will need pydoc formatters for the target document type. I hope to include a reST output formatter for pydoc.

The help() function is imported from pydoc by site.py when you start python. It may not be difficult to have it as a function that first tries pydoc to get a request, and if the original request is returned unchanged, tries to get information from the full documentation. There could be a way to select one or the other (or both).

But this feature doesn't need to be built into pydoc, or the full documentation. They just need to be able to work together so things like this are possible in an easy to write 4 or 5 line function. (give or take a few lines)

So it looks like most of these issues are more a matter of how to organize the interfaces. It turns out that what I've done with pydoc, and what Georg is doing with the main documentation should work together quite nicely. Cheers, Ron
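A "4 or 5 line function" of the kind described might look roughly like the sketch below. It is only an assumption about the shape of such a helper: `lookup` and the `full_docs` mapping are hypothetical, and the sketch detects "pydoc found nothing" via `pydoc.locate()` returning `None` rather than by comparing the returned request, which is a simplification of the mechanism described above.

```python
import pydoc


def lookup(name, full_docs):
    """Return the docstring for `name`, else an entry from `full_docs`.

    `full_docs` is a hypothetical mapping from topic names to text taken
    from the full (reST) documentation.
    """
    obj = pydoc.locate(name)  # None when the name cannot be resolved
    if obj is not None:
        doc = pydoc.getdoc(obj)
        if doc:
            return doc
    return full_docs.get(name, "no documentation found for %r" % name)


topics = {"zz-quoting-topic": "Full text from the main documentation."}
print(lookup("os.path.split", topics))
print(lookup("zz-quoting-topic", topics))
```

Showing both sources one after the other, as suggested earlier in the thread, would only need the two branches to be concatenated instead of tried in turn.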

Ron Adam wrote:
> And for console text output, is the unmodified reST suitable, or would it
> be desired to modify it in some way?

A text writer for docutils should not be hard to write. You'd get something that looks like the reST, but stripped of markup that makes no sense when viewed on a terminal, such as :class:`xyz`. Georg
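For the specific case of role markup such as :class:`xyz`, the stripping step could be approximated with a regular expression. This is a crude sketch, not a docutils writer: `strip_roles` is a hypothetical helper that post-processes the raw source text, whereas a real text writer would walk the parsed document tree.

```python
import re

# Matches :role:`target` and captures the target text.
_ROLE = re.compile(r":[A-Za-z][\w-]*:`([^`]+)`")


def strip_roles(text):
    """Drop the :role: wrapper from inline markup, keeping the target text."""
    return _ROLE.sub(r"\1", text)


print(strip_roles("See :class:`xyz` and :meth:`xyz.run` for details."))
```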

This subject is generating a lot of discussion and [almost entirely] positive feedback. It would be a great shame to run out of steam. Does it need a PEP to see a chance of it getting accepted as the formal documentation system? (or a pronouncement that it will never happen...) Michael Foord

Michael Foord wrote:
No. First of all, it needs a dedicated developer (preferably, but not necessarily a committer) who indicates willingness to maintain that for the coming years and releases. It might be that Fred Drake's offer to maintain the documentation would be still valid after such a switch, but we should not assume so without explicit confirmation. It might be that this would be the time to pass on documentation maintenance to somebody else (and I seriously do not have anyone in particular in mind here).

Then, I think a branch should be made where the documentation is converted. Again, a volunteer would be needed to create the branch, and then eventually merge it back to the trunk. It might be helpful, but isn't strictly necessary, to close all documentation patches before doing so, as they all break with the conversion. For that activity, multiple volunteers would be useful.

I don't think a formal document needs to be written, unless there is a hint of disagreement within the community. In that case, a process PEP would be necessary. However, it is much more important that the documentation maintainer explicitly agrees than that nobody explicitly disagrees, or that a pronouncement is made - the pronouncement alone will *not* cause this change to be carried out. Regards, Martin

Martin v. Löwis wrote:
Assuming that Fred goes into well-earned retirement from the doc maintainer position (private mail exchange hinted that way), and nobody more qualified steps up, I'd be available to take that post. (If someone else wants to take maintainership of the content, very good, I'd have to be maintainer of the build tools anyway.) I'd then try to form a doc maintaining team, just as the PEP editor team that was created recently, to deal with the (hopefully relatively large ;) ) amount of comments and edit requests.
I agree.

Georg

On Thu, May 24, 2007 at 12:43:18PM -0500, Ron Adam wrote:
And for console text output, is the unmodified reST suitable, or would it be desired to modify it in some way?
Currently pydoc output looks like a man page under Unix. If it could look like that, then that would be great. Otherwise raw reST is fine!
I think the latter proposal sounds like the correct one. In Debian, for instance, the Python docs are a separate package, and it would seem reasonable that you'd have to have that package installed to get the long docs.
If you look at the documentation for subprocess, for instance, you'll see that the docstring is pretty much the same as the library reference documentation, which seems like needless duplication and an opportunity for code/doc skew. Maybe one is auto-generated from the other - I don't know!
Well you could stop after reading the short bit!
Sounds good! Nick -- Nick Craig-Wood <nick@craig-wood.com> -- http://www.craig-wood.com/nick

Nick Craig-Wood wrote:
Okay, there's now support for SmartyPants in Subversion -- it converts these quotes as well as triple dashes to their pretty equivalents.

cheers,
Georg

On Sat, May 19, 2007 at 07:14:09PM +0200, Georg Brandl wrote:
For the impatient: the result can be seen at <http://pydoc.gbrandl.de>.
I think that looks great. One comment I have, I don't know if it's relevant - it perhaps depends on whether the "Global Module Index" is auto-generated or not. This is the page I visit the most out of all the Python documentation, and it's far too large and unwieldy. IMHO it would be much better if only the top-level modules were shown here - having the single package 'distutils', for example, take up nearly 50 entries in the list is almost certainly hindering a lot more people than it helps. It would perhaps be better if such packages show up as one entry, which shows the sub-modules when clicked on.

>> One comment I have, I don't know if it's relevant - it perhaps
>> depends on whether the "Global Module Index" is auto-generated or
>> not. This is the page I visit the most out of all the Python
>> documentation, and it's far too large and unwieldy. IMHO it would be
>> much better if only the top-level modules were shown here - having
>> the single package 'distutils', for example, take up nearly 50
>> entries in the list is almost certainly hindering a lot more people
>> than it helps. It would perhaps be better if such packages show up as
>> one entry, which shows the sub-modules when clicked on.

Georg> Sure, that is certainly possible.

Take a look at <http://www.webfast.com/modindex/>. It records request counts for the various pages and presents the most frequently requested pages in a section at the top of the page. I can make the script available if anyone wants it (it uses Myghty - Mason in Python).

Skip
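The request-count idea in Skip's message can be sketched in a few lines. This is not Skip's script (which is not shown in the thread); the function name `most_requested` and the example paths are assumptions made for illustration.

```python
from collections import Counter


def most_requested(request_paths, limit=10):
    """Return the `limit` most frequently requested paths, most popular first."""
    counts = Counter(request_paths)
    return [path for path, _count in counts.most_common(limit)]


if __name__ == "__main__":
    # Hypothetical access-log entries, one path per request.
    log = ["/lib/os", "/lib/re", "/lib/os", "/lib/sys", "/lib/os", "/lib/re"]
    print(most_requested(log, limit=2))
```

The resulting list could then be rendered as the "most frequently requested" section above the full module index.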

On 5/21/07, skip@pobox.com <skip@pobox.com> wrote:
+1 for integrating this with the official docs. I loved this the last time you posted it too. STeVe -- I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy

On Mon, May 21, 2007, Jon Ribbens wrote:
That's a good point in general, but I think we want to manually label some submodules as having entries in the global module index (notably os.path). -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "Look, it's your affair if you want to play with five people, but don't go calling it doubles." --John Cleese anticipates Usenet
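To make the two suggestions concrete (collapse packages into one entry, but let some submodules such as os.path keep their own entry), here is a rough sketch. The function `build_module_index` and its `always_listed` parameter are hypothetical and not part of the actual build tool.

```python
def build_module_index(module_names, always_listed=frozenset()):
    """Group dotted module names under their top-level package.

    Names in `always_listed` keep a top-level entry of their own.
    Returns a dict mapping each top-level entry to a sorted list of
    the submodules that would be shown when the entry is expanded.
    """
    index = {}
    for name in sorted(module_names):
        if name in always_listed or "." not in name:
            # Top-level module, or a submodule that was labelled manually.
            index.setdefault(name, [])
        else:
            top = name.split(".", 1)[0]
            index.setdefault(top, []).append(name)
    return index


if __name__ == "__main__":
    names = ["distutils", "distutils.core", "distutils.command.build", "os", "os.path"]
    for entry, subs in build_module_index(names, always_listed={"os.path"}).items():
        print(entry, subs)
```

With this layout, distutils takes up a single line in the index while os.path stays directly reachable.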

Hoi,

In addition to the offline docs that Georg published some days ago, there is also a web version, which currently looks and works pretty much like the offline version. There are, however, some differences that are worth knowing:

- Cleaner URLs. You can actually guess them, because we took the idea the PHP people had and check for similar pages if a page does not exist. We do, however, redirect if there was a match so that the URL stays unique.
- The search doesn't require JavaScript (but is currently disabled due to a buggy stemmer and indexer).

That's it for now; you can try it online at http://pydoc.gbrandl.de:3000/

Regards,
Armin
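The "check for similar pages, then redirect" behaviour could look roughly like the sketch below. It uses the standard library's difflib as a stand-in; the thread does not say how the web version actually finds similar pages, and `resolve_page` and its return values are invented for this example.

```python
import difflib


def resolve_page(requested, known_pages, cutoff=0.6):
    """Decide how to answer a request for a page name.

    Returns ("found", name) for an exact hit, ("redirect", name) when a
    sufficiently similar page exists, and ("missing", None) otherwise.
    """
    if requested in known_pages:
        return ("found", requested)
    # get_close_matches returns the best candidates, best match first.
    matches = difflib.get_close_matches(requested, known_pages, n=1, cutoff=cutoff)
    if matches:
        # Redirect instead of serving the page here, so each page keeps one URL.
        return ("redirect", matches[0])
    return ("missing", None)


if __name__ == "__main__":
    pages = ["os.path", "os", "sys"]
    print(resolve_page("os.paht", pages))
```

Redirecting rather than rendering the guessed page in place is what keeps the URL of each page unique.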


On 5/23/07, Georg Brandl <g.brandl@gmx.net> wrote:
Also, try
Beautiful!

STeVe

On Wed, 23 May 2007 08:30:17 +0200, Georg Brandl <g.brandl@gmx.net> wrote:
Looks good. But should the source pages really use syntax highlighting? I think if somebody is interested in the source then they should get the real source without any highlighting. If you decide to keep the syntax highlighting then the highlighting of multiline ReST strings should be fixed. For example see the source for splitext(). Thanks for the work, Dennis Benzinger

Hoi, Due to some server issues I had to take the web version down. But expect an updated version in a few days. Regards, Armin

On 5/19/07, Georg Brandl <g.brandl@gmx.net> wrote:
[SNIP]
Waiting for comments!
Here is a small suggestion: move the sidebar to the right. Moving it to the right makes it much less intrusive. See for yourself:

http://peadrop.com/files/pydoc-sidebar-right.png

    div.body {
        background-color:white;
        margin:0pt 190pt 0pt 0px;
    }
    div.sidebar {
        float:right;
        margin-left:-100%;
        width:230px;
    }

Keep up the great work,
-- Alexandre

Hi,

We managed to get an up-to-date version of the web version of the docs running on the server. The address is still the same (http://pydoc.gbrandl.de:3000) and it's also still running on top of wsgiref.

Changes so far:

* comments: each page that is generated from an rst file can have some comments attached to it. Commenting doesn't require registration at the moment.
* antispam with an optional reverse captcha (a captcha for bots: a hidden input field named "homepage" which bots hopefully fill out, dumb as they are) and regular expression filter rules based on MoinMoin's BadContent file.
* administration panel for moderating comments. You can find the admin panel at http://pydoc.gbrandl.de:3000/admin/ (login credentials are testuser:password).
* feeds for comments on a page or the last n comments on the whole site.
* source view is text only (again).

What still works:

* intelligent error pages: if a page does not exist, the URL path is used to conduct a fuzzy keyword search (see below).
* fuzzy keyword search: "os.path.exists" jumps to the entry, "os.paht.exists" shows some possibilities.

What needs to be implemented:

* full text search
* proposing documentation patches

Note that the comment area is really, really dark; that's intentional. This is meant to visually separate comments from the official docs, but if the contrast is deemed too unsettling, another way can be found.

Also, we're experimenting with alternate stylesheets, e.g. placing the sidebar on the right of the main text, or a "traditional" style for those liking the original docs' style.

In any case, we're waiting for your input!

cheers,
Georg and Armin
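The antispam check described in the list above (a hidden "homepage" field plus regular expression rules) might be sketched like this. Only the field name "homepage" comes from the message; the function `looks_like_spam`, the "comment" field and the sample pattern are assumptions, and the real rules come from MoinMoin's BadContent file, which is not reproduced here.

```python
import re


def looks_like_spam(form, bad_patterns=()):
    """Return True if a submitted comment form should be rejected."""
    # The "homepage" input is hidden from human visitors, so any
    # non-empty value suggests that a bot filled out the form.
    if form.get("homepage", "").strip():
        return True
    # Otherwise fall back to the regular expression filter rules.
    body = form.get("comment", "")
    return any(re.search(pattern, body, re.IGNORECASE) for pattern in bad_patterns)


if __name__ == "__main__":
    rules = [r"cheap\s+pills"]  # hypothetical rule, for illustration only
    print(looks_like_spam({"comment": "Nice docs!", "homepage": ""}, rules))
    print(looks_like_spam({"comment": "Hi", "homepage": "http://example.com"}, rules))
```

Checking the hidden field first is cheap and avoids running the pattern list for obvious bot submissions.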

Georg Brandl wrote:
Yes, the comments are a bit too dark. The separation could be done better by moving it below the footer. Or better yet, duplicate the navigation bar between the document page and the comments.

------------------------------------------------
 crumbs navigation
------------------------------------------------
 side |  main page
 bar  |
      |
      |
------------------------------------------------
 crumbs navigation
------------------------------------------------
 User Comment section
------------------------------------------------
 copyright
------------------------------------------------

The user comment section could have its own side bar if that's desirable.

Also, the Python version information needs to be on every page someplace.

Ron
participants (42)
- "Martin v. Löwis"
- A.M. Kuchling
- Aahz
- Alexandre Vassalotti
- Armin Ronacher
- Barry Warsaw
- Bill Janssen
- BJörn Lindqvist
- Bob Ippolito
- Brett Cannon
- Dennis Benzinger
- Dustin J. Mitchell
- Fred L. Drake, Jr.
- Gael Varoquaux
- Georg Brandl
- Greg Ewing
- John Gabriele
- Jon Ribbens
- Josiah Carlson
- Ka-Ping Yee
- Laurent Gautier
- Lea Wiemann
- Lea Wiemann
- Martin Blais
- Michael Foord
- Neal Becker
- Nick Coghlan
- Nick Craig-Wood
- nick@craig-wood.com
- Raymond Hettinger
- Robert Kern
- Ron Adam
- Ronald Oussoren
- Scott Dial
- skip@pobox.com
- Stephen J. Turnbull
- Steve Holden
- Steven Bethard
- Talin
- Terry Reedy
- Titus Brown
- Vinay Sajip