From goodger@users.sourceforge.net  Tue Oct  1 05:25:43 2002
From: goodger@users.sourceforge.net (David Goodger)
Date: Tue, 01 Oct 2002 00:25:43 -0400
Subject: [Doc-SIG] Docstring Standards
In-Reply-To: <638AA0336D7ED411928700D0B7B0D75B02E1F012@aimail.aiinet.com>
Message-ID: <B9BE9C06.29A42%goodger@users.sourceforge.net>

Mahrt, Dallas wrote:
> Background: I am in the process of defining some internal
> programming standards for my company. One aspect we are keenly
> interested in is defining the docstring syntax such that we can
> facilitate documentation generation (similar to Doxygen or JavaDoc
> which we use currently) I have read about the docutils project and
> feel it is probably the best fit (in the long term) for our needs,
> however I have a question.

"In the long term" is important here, because (as you know from our
previous correspondence) Docutils doesn't have docstring extraction,
**yet**.  It will, and hopefully soon, but that depends on the time
and effort of volunteers.  (Not just me, hopefully.  It looks like
Richard Jones is dabbling in the pysource sandbox today, which is
encouraging.)

> 1) Method signatures.
> In Doxygen and JavaDoc, there is an explicit ''@param'' syntax for
> defining the documentation related to a parameter. There are similar
> constructs for ''@exception'' and ''@return''.

What JavaDoc has done is establish a syntax that enables a certain
documentation methodology, or standard *semantics*.  JavaDoc is not just
syntax; it prescribes a methodology.  I began to document some ideas
about semantics, available here:

    http://docutils.sf.net/spec/semantics.html

I haven't explored documentation methodology more because, in my
opinion, it is a completely separate issue from syntax, and it's even
more controversial than syntax.  Nobody wants to be told how to lay
out their documentation, a la JavaDoc.  I think the JavaDoc way is
butt-ugly, but it *is* an established standard for the Java world.
Any standard documentation methodology has to be formal enough to be
useful but remain light enough to be usable.  If the methodology is
too strict, too heavy, or too ugly, many/most will not want to use it.

One thing I've experimented with is expressed in the above document
thus:

    Use field lists or definition lists for "tagged blocks".

By this I mean that field lists can be used similarly to JavaDoc's
@tag syntax.  That's actually one of the motivators behind field
lists.  For example, we could have::

    """
    :Parameters:
        - `lines`: a list of one-line strings without newlines.
        - `until_blank`: Stop collecting at the first blank line if
          true (1).
        - `strip_indent`: Strip common leading indent if true (1,
          default).

    :Return:
        - a list of indented lines with minimum indent removed;
        - the amount of the indent;
        - whether or not the block finished with a blank line or at
          the end of `lines`.
    """

In fact, this is taken straight out of docutils/statemachine.py, in
which I experimented with a simple documentation methodology.  Another
variation I've thought of exploits the Grouch_-compatible "classifier"
element of definition lists.  For example::

    :Parameters:
        `lines` : [string]
            List of one-line strings without newlines.
        `until_blank` : boolean
            Stop collecting at the first blank line if true (1).
        `strip_indent` : boolean
            Strip common leading indent if true (1, default).

.. _Grouch: http://www.mems-exchange.org/software/grouch/

Field lists could even be used in a one-to-one correspondence with
JavaDoc @tags, although I don't know if I'd recommend it.  The entire
question of methodology requires more serious thought than I can
afford at present.  I think a standard methodology would benefit the
Python community, but it would be a hard sell.  A PEP would be the
place to start.
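
To make the one-to-one correspondence concrete, here is a minimal sketch
that rewrites JavaDoc-style tag lines as field list items.  It is an
illustration only, not part of Docutils: the function name
``javadoc_to_fields`` and the target field names (``:param name:``,
``:return:``, ``:exception name:``) are assumptions, not an agreed
convention.

```python
import re

# Only the three tags mentioned in this thread are handled.
TAG_PATTERN = re.compile(r"^(\s*)@(param|return|exception)\b\s*(.*)$")


def javadoc_to_fields(text):
    """Rewrite '@param name desc' style lines as ':param name: desc'.

    Hypothetical helper: the output field names are assumptions.
    """
    converted = []
    for line in text.splitlines():
        match = TAG_PATTERN.match(line)
        if match is None:
            # Ordinary prose lines pass through unchanged.
            converted.append(line)
            continue
        indent, tag, rest = match.groups()
        if tag == "return":
            # '@return' has no subject, only a description.
            converted.append(f"{indent}:return: {rest}".rstrip())
        else:
            # '@param' and '@exception' name their subject first.
            name, _, description = rest.partition(" ")
            field = f"{indent}:{tag} {name}: {description.strip()}"
            converted.append(field.rstrip())
    return "\n".join(converted)
```

Continuation lines of a multi-line tag description are left untouched,
so a real converter would also have to re-indent them under the field.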

> The only thing similar I have found in docutils documentations is
> the use of explicit roles. Ex.
> 
>     def foo(bar):
>         """This is foo.
> 
>         :parameter"`foo` - This is a foo
>         """

I think this should be::

    :parameter:`foo`

I don't think I'd use that syntax (interpreted text with explicit
roles) because it's very verbose and cumbersome.  Interpreted text has
syntax in reStructuredText but it hasn't really been implemented yet,
and may be rethought if something better shows up.  It's there in
anticipation of future need (yes, I know, not the XP way), especially
for Python docstring extraction.

> Is this the *standard* way of documenting parameters?

No.

> If not is there a standard?

No.  There have been attempts though.  Several ports of JavaDoc's @tag
methodology exist in Python, most recently Ed Loper's "epydoc_".
There's Frederic Giacometti's `iPhrase Python documentation
conventions`_.  I'm sure there've been others.

.. _epydoc: http://epydoc.sf.net/
.. _iPhrase Python documentation conventions:
   http://mail.python.org/pipermail/doc-sig/2001-May/001840.html

> If so is there a similar concept for raised exceptions and return
> values? 

Easy enough to do.  See the docstring sample above.

> The only thing I've noticed in practice are English descriptions with
> literal references. Ex.
> 
>     def foo(bar):
>         """This is foo.
> 
>         Passes `foo` which is a foo.
>         """
> 
> This seems to be more difficult to extract the description from the
> identifier for a more tabular representation (like JavaDoc)

Agreed.  However, for most human-readable documentation needs, the
free-form text approach is adequate.  You'd only need a formal
methodology if you want to extract the parameters into a data
dictionary, index, or summary of some kind.
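
As a rough sketch of what such an extraction could look like, the code
below mines the definition-list variation shown earlier (term,
classifier, description) into a small data dictionary.  It uses plain
string handling rather than the Docutils parser; the function names
``collect_block`` and ``parameter_table`` are hypothetical, and the
parsing rules are a simplification that only handles this one layout.

```python
import inspect
import re

# A term line looks like:  `name` : type
TERM = re.compile(r"^`(?P<name>\w+)`\s*:\s*(?P<type>.+)$")


def collect_block(lines, until_blank=0, strip_indent=1):
    """Collect an indented text block (hypothetical example function).

    :Parameters:
        `lines` : [string]
            List of one-line strings without newlines.
        `until_blank` : boolean
            Stop collecting at the first blank line if true (1).
        `strip_indent` : boolean
            Strip common leading indent if true (1, default).
    """


def parameter_table(func):
    """Map each documented parameter to a (type, description) pair."""
    table = {}
    current = None
    in_parameters = False
    for raw in inspect.getdoc(func).splitlines():
        line = raw.strip()
        if line.startswith(":") and line.endswith(":"):
            # A new field such as ':Parameters:' or ':Return:' begins.
            in_parameters = line == ":Parameters:"
            current = None
            continue
        if not in_parameters or not line:
            continue
        match = TERM.match(line)
        if match:
            current = match.group("name")
            table[current] = (match.group("type"), "")
        elif current is not None:
            # Description text belongs to the most recent term.
            kind, text = table[current]
            table[current] = (kind, (text + " " + line).strip())
    return table
```

Calling ``parameter_table(collect_block)`` yields one entry per
parameter, which is the kind of tabular summary that free-form prose
makes hard to produce.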

-- 
David Goodger  <goodger@users.sourceforge.net>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/


From edloper@gradient.cis.upenn.edu  Wed Oct  2 04:40:34 2002
From: edloper@gradient.cis.upenn.edu (Edward Loper)
Date: Tue, 01 Oct 2002 23:40:34 -0400
Subject: [Doc-SIG] list of tools for the doc-sig page
Message-ID: <3D9A6AB2.3060308@gradient.cis.upenn.edu>

I think that the python doc-sig page <http://www.python.org/sigs/ 
doc-sig/> should include a list of the tools that are currently 
available for automatically extracting API documentation (pydoc, 
HappyDoc, docutils, etc).  There are 11 tools that I know of, but there 
might be more.  I wrote up a table summarizing the tools that I know of, 
so it should be possible to just copy/paste the HTML to the doc-sig page 
(who maintains that page?).  Or if people think that my table should be 
changed, or that the information would be better presented as a list of 
descriptions, etc., then we could do that..  The table that I put 
together is (temporarily) available at:

     <http://www.cis.upenn.edu/~edloper/python_api.html>

I'll also include a slightly reduced/summarized text version of the 
table below, in case that's more convenient for some people.

Also, if any of the information I listed is incorrect/incomplete, or if 
I left out any tools that you know of, please let me know.  I think we 
should try to make this list as complete as possible.

-Edward

===========================================================================
Tool     |Markup          |Output    |License         |Status      |Notes
         |Language(s)     |Format(s) |                |            |
=========*================*==========*================*============*=======
Crystal  |StructuredText  |HTML      |unspecified     |unmaintained|D, P
---------+----------------+----------+----------------+------------+-------
doc.py   |doc_string      |HTML      |doc.py license  |unknown     |D, P
---------+----------------+----------+----------------+------------+-------
Docutils |reStructuredText|(none yet)|Python License  |under       |D, P
         |                |          |                |construction|
---------+----------------+----------+----------------+------------+-------
Easydoc  |Javadoc-like    |HTML      |GPL             |stable      |D, P
---------+----------------+----------+----------------+------------+-------
epydoc   |epytext         |HTML      |IBM License     |stable      |D, I
---------+----------------+----------+----------------+------------+-------
gendoc   |StructuredText  |HTML      |gendoc license  |unknown     |D, P
         |                |plaintext |                |            |
         |                |MIF/MML   |                |            |
---------+----------------+----------+----------------+------------+-------
HappyDoc |StructuredTextNG|HTML      |HappyDoc License|stable      |C, D, P
         |StructuredText  |DocBook   |                |            |
         |plaintext       |Dia       |                |            |
         |raw             |          |                |            |
---------+----------------+----------+----------------+------------+-------
pydoc    |plaintext       |HTML      |Python License  |stable      |D, I
         |                |man       |                |            |
---------+----------------+----------+----------------+------------+-------
Pythondoc|StructuredText  |HTML      |Pythodoc License|beta        |D, I
         |                |XML       |                |            |
---------+----------------+----------+----------------+------------+-------
Teud     |plaintext       |HTML      |Teud License    |unknown     |D, I
         |                |XML       |                |            |
---------+----------------+----------+----------------+------------+-------
XIST     |XML             |HTML      |Python License  |unknown     |D, I
===========================================================================

Key:
   Tool: The name of the API documentation generation tool. Link is to
       the tool's homepage.
   Markup Language(s): The markup language(s) that can be used within
       docstrings. Links are to the markup languages' definitions.
   Output Format(s): The type(s) of output that the tool can
       produce. Links are to examples of the output produced by the
       tool for each format.
   License: The license that the tool is distributed under. Link is to
       the license itself.
   Status: The current status of the tool.
   Notes:
     C: Documentation is generated from comments.
     D: Documentation is generated from docstrings.
     P: Documentation is generated by parsing Python files.
     I: Documentation is generated by introspection.
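
The difference between the "P" and "I" notes can be sketched with the
standard library alone.  This is only an illustration of the two
approaches, not how any tool in the table is implemented; the sample
source and the function names are made up for the example.

```python
import ast
import inspect
import types

SOURCE = '''
def strip(lines):
    """Return `lines` with the common indent removed."""
    return lines
'''


def docstrings_by_parsing(source):
    """"P": read docstrings from the syntax tree; nothing is executed."""
    tree = ast.parse(source)
    return {
        node.name: ast.get_docstring(node)
        for node in ast.walk(tree)
        if isinstance(node, ast.FunctionDef)
    }


def docstrings_by_introspection(source):
    """"I": execute the code, then inspect the live function objects."""
    module = types.ModuleType("sample")  # hypothetical module name
    exec(source, module.__dict__)
    return {
        name: inspect.getdoc(obj)
        for name, obj in vars(module).items()
        if inspect.isfunction(obj)
    }
```

Parsing is safe for code that cannot be imported, while introspection
sees whatever the module builds at import time.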


From goodger@users.sourceforge.net  Wed Oct  2 04:39:55 2002
From: goodger@users.sourceforge.net (David Goodger)
Date: Tue, 01 Oct 2002 23:39:55 -0400
Subject: [Doc-SIG] Updates to Docutils
Message-ID: <B9BFE2CB.29BA2%goodger@users.sourceforge.net>

I have just completed the integration of Dethe Elza's refactoring of
the reStructuredText directive API.  A summary is in the module
docstring of docutils/parsers/rst/directives/__init__.py, and complete
details are in the new "Creating reStructuredText Directives" How-To
document (http://docutils.sf.net/spec/howto/rst-directives.html).
Many thanks to Dethe for initiating this refactoring and writing the
initial How-To; it has simplified directive implementation
considerably.

Ramifications of this change:

1. The minimum required Python version is now 2.1 (was 2.0), with
   2.1.3 or 2.2.1 recommended.  The reason for the change is that
   directive functions now employ function attributes, a feature
   introduced in Python 2.1.

2. Any directives that aren't part of the reStructuredText parser
   (e.g. 3rd party patches) will have to be revised, although I'm not
   aware of any.  If anybody has written useful directives, please
   consider contributing them to the Docutils project.
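
For readers unfamiliar with function attributes (the Python 2.1
feature mentioned in item 1), here is a minimal sketch of the general
idea.  It does not reproduce the real directive API: the names
``note_directive``, ``run_directive``, ``required_arguments`` and
``has_content`` are invented for this example; see the How-To for the
actual interface.

```python
def note_directive(arguments):
    """Hypothetical directive function; not the actual Docutils API."""
    return "note: " + " ".join(arguments)


# Function attributes attach metadata to the function object itself.
# The attribute names below are assumptions made for this sketch.
note_directive.required_arguments = 1
note_directive.has_content = False


def run_directive(func, arguments):
    """Check the declared argument count before calling the directive."""
    required = getattr(func, "required_arguments", 0)
    if len(arguments) < required:
        raise ValueError(
            f"{func.__name__} needs at least {required} argument(s)"
        )
    return func(arguments)
```

The caller can validate input by reading the attributes, without a
separate registry describing each directive.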

Three new directives (also courtesy of Dethe) have been added to the
parser:

* "include": Including an external document fragment.

* "raw": Raw data pass-through, such as raw HTML.

* "replace": Text substitutions (only valid inside substitution
  definitions).

See http://docutils.sf.net/spec/rst/directives.html for details of the
new directives.

Get the latest snapshot here:

    http://docutils.sf.net/docutils-snapshot.tgz

-- 
David Goodger  <goodger@users.sourceforge.net>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/


From goodger@users.sourceforge.net  Wed Oct  2 05:48:32 2002
From: goodger@users.sourceforge.net (David Goodger)
Date: Wed, 02 Oct 2002 00:48:32 -0400
Subject: [Doc-SIG] list of tools for the doc-sig page
In-Reply-To: <3D9A6AB2.3060308@gradient.cis.upenn.edu>
Message-ID: <B9BFF2DF.29BAC%goodger@users.sourceforge.net>

Edward Loper wrote:
> I think that the python doc-sig page <http://www.python.org/sigs/
> doc-sig/> should include a list of the tools that are currently
> available for automatically extracting API documentation (pydoc,
> HappyDoc, docutils, etc).

I agree.  Coincidentally, I've been working on updating the Doc-SIG pages.
Fred Drake okayed an initial draft, although I've revised it since and
(unless Fred says not to bother; Fred?) I'll run it by him again.  I've
added Docutils and epydoc, and removed some outdated material.  The "PSA as
a Catalyst" section (status page) is obsolete; Fred suggested it be ripped
out, and I agree.  The "Continuing questions" section should also be either
ripped out or reworked; I think most of the questions are no longer
current/continuing.  Given that, and the amount of duplication, I think the
home page (http://www.python.org/sigs/doc-sig/index.html) and status page
(http://www.python.org/sigs/doc-sig/status.html) ought to be merged.

> I wrote up a table summarizing the tools that I know of,

Looks good.  I think it would be a fine addition to the Doc-SIG page.

> (who maintains that page?).

Nobody has been maintaining it of late.  I get the impression from Fred that
he'd be happy if anyone took over.  I have Python CVS access now, so I could
coordinate at least.

> I'll also include a slightly reduced/summarized text version of the
> table below, in case that's more convenient for some people.

Does epytext handle *that*?  ;-)

> Also, if any of the information I listed is incorrect/incomplete, or if
> I left out any tools that you know of, please let me know.

Some corrections to the Docutils entry: HTML, XML output formats, with
LaTeX, DocBook, and PDF on the way (with a caveat); most of the code is
public domain, with some Python license, some other OSI-approved (details in
http://docutils.sf.net/COPYING.html).  The caveat is that the docstring
extraction part is very much under construction (although it seems that
Richard Jones has been scratching an itch; gotta take a look at what he's up
to).

> I think we should try to make this list as complete as possible.

Sounds good.

-- 
David Goodger  <goodger@users.sourceforge.net>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/


From edloper@gradient.cis.upenn.edu  Wed Oct  2 06:19:04 2002
From: edloper@gradient.cis.upenn.edu (Edward Loper)
Date: Wed, 02 Oct 2002 01:19:04 -0400
Subject: [Doc-SIG] list of tools for the doc-sig page
References: <B9BFF2DF.29BAC%goodger@users.sourceforge.net>
Message-ID: <3D9A81C8.4040803@gradient.cis.upenn.edu>

> I agree.  Coincidentally, I've been working on updating the Doc-SIG pages.
Sounds good.  Any idea when the new page will be up?

> Given that, and the amount of duplication, I think the
> home page (http://www.python.org/sigs/doc-sig/index.html) and status page
> (http://www.python.org/sigs/doc-sig/status.html) ought to be merged.
Agreed.

>>I'll also include a slightly reduced/summarized text version of the
>>table below, in case that's more convenient for some people.
> 
> Does epytext handle *that*?  ;-)
Epytext doesn't do tables at all.  But "links -dump" is a very good 
thing. ;)

> Some corrections to the Docutils entry: HTML, XML output formats, with
> LaTeX, DocBook, and PDF on the way (with a caveat); 
I was trying to list features that are currently available; that's why I 
didn't include docbook etc. for docutils, and didn't add "I" (generates 
docs via introspection) under notes.

Are HTML and XML output available for producing *API documentation*, or 
just for reStructuredText?  If it's available for producing API 
documentation, then you should definitely include it (and I'd be 
interested to see what it looks like).

> most of the code is
> public domain, with some Python license, some other OSI-approved (details in
> http://docutils.sf.net/COPYING.html).  The caveat is that the docstring
> extraction part is very much under construction (although it seems that
> Richard Jones has been scratching an itch; gotta take a look at what he's up
> to).

Since you'll probably be the one adding it to the doc-sig page, feel 
free to make whatever changes seem appropriate.  Since the docutils 
licensing is somewhat complex, you should probably just say "docutils 
license" or something, and link to COPYING.html.  Also, if you're 
feeling motivated, you could extract the gendoc/pydoc/teud licenses from 
their respective packages, add a .html file for each one, and link to 
them (those are the only packages whose licenses aren't directly available).

-Edward


From goodger@users.sourceforge.net  Wed Oct  2 06:30:10 2002
From: goodger@users.sourceforge.net (David Goodger)
Date: Wed, 02 Oct 2002 01:30:10 -0400
Subject: [Doc-SIG] list of tools for the doc-sig page
In-Reply-To: <3D9A81C8.4040803@gradient.cis.upenn.edu>
Message-ID: <B9BFFCA1.29BD3%goodger@users.sourceforge.net>

[David]
>> I agree.  Coincidentally, I've been working on updating the Doc-SIG pages.

[Edward]
> Sounds good.  Any idea when the new page will be up?

"Any day now."  Which day that will be, I can't say.

>> Some corrections to the Docutils entry: HTML, XML output formats, with
>> LaTeX, DocBook, and PDF on the way (with a caveat);
> I was trying to list features that are currently available

If that's the case, then apart from the "pysource" project under development
in the sandbox, Docutils currently has *no* API documentation feature
available at all.  It's under construction.  I'd say the "Status: under
construction" entry covers it nicely.

> Are HTML and XML output available for producing *API documentation*, or
> just for reStructuredText?

Once the API documentation component is there, all existing output formats
should just work.  My point is that the output format is not tied to the
input processing.
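
A toy sketch of that decoupling, assuming nothing about the real
Docutils classes: the input side produces a neutral list of nodes, and
every writer consumes only that list.  All names here (``parse_plain``,
``write_html``, ``write_xml``, ``convert``) are hypothetical, and the
node list stands in for a much richer document tree.

```python
from html import escape


def parse_plain(text):
    """Input side: one paragraph node per blank-line-separated block."""
    blocks = (" ".join(block.split()) for block in text.split("\n\n"))
    return [("paragraph", block) for block in blocks if block]


def write_html(nodes):
    """Output side: render every node as an HTML paragraph."""
    return "\n".join(f"<p>{escape(text)}</p>" for _kind, text in nodes)


def write_xml(nodes):
    """Output side: render every node as an element named after its kind."""
    return "\n".join(
        f"<{kind}>{escape(text)}</{kind}>" for kind, text in nodes
    )


WRITERS = {"html": write_html, "xml": write_xml}


def convert(text, parser, writer_name):
    """Any parser that yields nodes can be paired with any writer."""
    return WRITERS[writer_name](parser(text))
```

A docstring extractor would be one more function on the input side; the
writers would not need to change.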

-- 
David Goodger  <goodger@users.sourceforge.net>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/


From tony@lsl.co.uk  Wed Oct  2 09:16:19 2002
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Wed, 2 Oct 2002 09:16:19 +0100
Subject: Pysource (was RE: [Doc-SIG] Docstring Standards)
In-Reply-To: <B9BE9C06.29A42%goodger@users.sourceforge.net>
Message-ID: <011101c269eb$fc90e320$545aa8c0@lslp862.int.lsl.co.uk>

David Goodger wrote:
> It looks like Richard Jones is dabbling in the pysource sandbox
> today, which is encouraging.

Yeh!

Yes, I see - I hope he feels free to improoooove things properly as he
goes.

Tibs
--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
Give a pedant an inch and they'll take 25.4mm
(once they've established you're talking a post-1959 inch, of course)
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)


From tony@lsl.co.uk  Wed Oct  2 09:33:36 2002
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Wed, 2 Oct 2002 09:33:36 +0100
Subject: [Doc-SIG] Docstring Standards
In-Reply-To: <B9BE9C06.29A42%goodger@users.sourceforge.net>
Message-ID: <011201c269ee$66723990$545aa8c0@lslp862.int.lsl.co.uk>

Hmm. I'm not *entirely* convinced that formal semantics for how to
present argument list documentation, etc., is a Good Thing, even though
I believe I've argued for "data mining" of docstrings in the past.

On the "for" side, the work I'm currently doing on embedding Java in our
product, and automatically wrapping pre-existing C functions as Java
methods, would be a lot harder if we had *not* had a fairly formal
requirement on C function documentation (including documenting if each
argument is in, out or both). Of course, that same requirement also
means a lot of redundant text in the documentation produced.

However, on the "against" side, looking at javadoc guidelines seems (to
me) to show a tendency to ask for over-documentation of parameters,
return values, etc. - rather on the grounds that the "interface" is all
one can guarantee, so one should document it. This seems, in extremis
(and a very easily reached extreme!) to lead to documenting all
parameters regardless of whether it's easy to tell (from their name, or
the "free form" documentation for a method) what they mean. Ditto for
return values and exceptions.

However, in Emacs Lisp I never felt the need to do such things (there,
the tendency to CAPITALISE arguments makes them stand out in normal
text, and that may help). And whilst I suspect I've tended to go for the
more "formal" approach of lists of parameters, etc., in Python, I think
that's mostly a hold-over from my C, and an approach I'm less likely to
follow now, unless it makes textual sense in that particular docstring
[1]_.

I suspect that the *correct* thing to do is to require well-written
documentation, that explains what a thing does, to sufficient detail,
and police that practice in exactly the same way that the code itself is
looked after, and to exactly the same degree of seriousness. Of course,
that *is* harder to mandate in a formal document or coding standard
("write good code" as a coding standard only has the virtue of being
short...)

.. [1] An obvious example in Python would be where arguments *did*
   have particular restrictions (e.g., datatypes) on them, which
   are not obvious from other indicators. But I would assert that
   if this is so, then the failure to write the documentation like
   that would be picked up as "this is not well written".

Tibs

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
Well we're safe now....thank God we're in a bowling alley.
- Big Bob (J.T. Walsh) in "Pleasantville"
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)


From jh@web.de  Wed Oct  2 12:48:45 2002
From: jh@web.de (Juergen Hermann)
Date: Wed, 02 Oct 2002 13:48:45 +0200
Subject: [Doc-SIG] list of tools for the doc-sig page
In-Reply-To: <B9BFF2DF.29BAC%goodger@users.sourceforge.net>
Message-ID: <E17whzh-0001KD-00@smtp.web.de>

On Wed, 02 Oct 2002 00:48:32 -0400, David Goodger wrote:

>> (who maintains that page?).
>
>Nobody has been maintaining it of late.  I get the impression from Fred that
>he'd be happy if anyone took over.  I have Python CVS access now, so I could
>coordinate at least.

Even better would be to add to the sigs page a link into the wiki, just like
with http://www.python.org/cgi-bin/moinmoin/PythonEditors. So then we all can
maintain the content.


Ciao, Jürgen

--
Jürgen Hermann, Developer
WEB.DE AG, http://webde-ag.de/


From walter@livinglogic.de  Wed Oct  2 13:11:37 2002
From: walter@livinglogic.de (Walter Dörwald)
Date: Wed, 02 Oct 2002 14:11:37 +0200
Subject: [Doc-SIG] list of tools for the doc-sig page
References: <3D9A6AB2.3060308@gradient.cis.upenn.edu>
Message-ID: <3D9AE279.9050200@livinglogic.de>

Edward Loper wrote:

> ---------+----------------+----------+----------------+------------+-------
> XIST     |XML             |HTML      |Python License  |unknown     |D, I
> ===========================================================================

The link should probably be http://www.livinglogic.de/Python/xist/
instead of http://www.livinglogic.de/Python/xist/index.html

Plain text is supported as an output format, this is done
by converting to a special HTML version and piping the
result through w3m.

Bye,
    Walter Dörwald


From fdrake@acm.org  Mon Oct  7 19:12:48 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 7 Oct 2002 14:12:48 -0400
Subject: [Doc-SIG] Python 2.2.2 beta 1 docs finalized
Message-ID: <15777.52896.164933.656370@grendel.zope.com>

I've posted the Python 2.2.2b1 docs online; please report any (new)
problems via SourceForge with a priority of at least 6.

I'd like to thank Raymond for putting so much effort into the
documentation for 2.2.2; he's really doing a great job!

Thanks!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From mwh@python.net  Thu Oct 10 20:12:37 2002
From: mwh@python.net (Michael Hudson)
Date: 10 Oct 2002 20:12:37 +0100
Subject: [Doc-SIG] simple use of docutils
Message-ID: <2mit0aglay.fsf@starship.python.net>

I have an application where I want to programmatically use docutils.

So, I have a chunk of text (or a file, both are equally easy) in rst
format that I want to turn into html (again, either a file, or as
text).

This doesn't seem to be totally straightforward.

I have code like this:

    from docutils.core import Publisher
    from docutils.io import FileInput

    pub = Publisher(source=FileInput(None, source_path=filename),
                    destination=htname)
    pub.set_options()
    pub.set_reader('standalone', None, 'restructuredtext')
    pub.set_writer('html')
    pub.publish()

But that's not enough -- I have to cook up mythical 'option' objects
from somewhere.

Clearly, I haven't really read the docs.

Is there an easy way of doing what I want?

There should be.

Cheers,
M.

-- 
  Sufficiently advanced political correctness is indistinguishable
  from irony.                                           -- Erik Naggum


From goodger@users.sourceforge.net  Fri Oct 11 04:08:41 2002
From: goodger@users.sourceforge.net (David Goodger)
Date: Thu, 10 Oct 2002 23:08:41 -0400
Subject: [Doc-SIG] simple use of docutils
In-Reply-To: <2mit0aglay.fsf@starship.python.net>
Message-ID: <B9CBB8F8.2A348%goodger@users.sourceforge.net>

Michael Hudson wrote:
> I have an application where I want to programmatically use docutils.
> 
> So, I have a chunk of text (or a file, both are equally easy) in rst
> format that I want to turn into html (again, either a file, or as
> text).
> 
> This doesn't seem to be totally straightforward.

Questions and constructive suggestions are always appreciated.

> Clearly, I haven't really read the docs.

That's understandable, since there are none yet for what you're doing.
But you must have read the source to get this far.  Look at the
docstrings of the classes you're using and their __init__ methods for
details.  I've employed "attribute docstrings" liberally, in
anticipation of future tool support.

> I have code like this:
> 
>     from docutils.core import Publisher
>     from docutils.io import FileInput
> 
>     pub = Publisher(source=FileInput(None, source_path=filename),
>                     destination=htname)

What is "htname"?  The "destination" parameter has to be a
docutils.io.Output instance, such as FileOutput or StringOutput.

>     pub.set_options()
>     pub.set_reader('standalone', None, 'restructuredtext')
>     pub.set_writer('html')
>     pub.publish()
> 
> But that's not enough -- I have to cook up mythical 'option' objects
> from somewhere.

Just replace "pub.set_options()" with::

    options = pub.set_options()

"Publisher.set_options()" sets *and* returns the option values object.
You'll need to pass the options object to the I/O instantiators.  This
code should work (assuming you want a string return value)::

    from docutils.core import Publisher
    from docutils.io import FileInput, StringOutput

    pub = Publisher()
    options = pub.set_options()
    pub.source = FileInput(options, source_path=filename)
    pub.destination = StringOutput(options)
    pub.set_reader('standalone', None, 'restructuredtext')
    pub.set_writer('html')
    output = pub.publish()

> Is there an easy way of doing what I want?

The "docutils.core.publish()" convenience function is an easy way to
get file-to-file processing.  The to-do list has an entry for a
string-to-string processing convenience function, which I can whip up
quickly if it will help.  But I can't make one for every possible
combination; that's up to the developer.

> There should be.

The standard disclaimer applies: if something isn't there, it's either
because nobody has needed it yet or because nobody has had time to add
it yet.  Patches are welcome!

-- 
David Goodger  <goodger@users.sourceforge.net>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/


From mwh@python.net  Fri Oct 11 11:23:05 2002
From: mwh@python.net (Michael Hudson)
Date: 11 Oct 2002 11:23:05 +0100
Subject: [Doc-SIG] simple use of docutils
In-Reply-To: David Goodger's message of "Thu, 10 Oct 2002 23:08:41 -0400"
References: <B9CBB8F8.2A348%goodger@users.sourceforge.net>
Message-ID: <2m3crd46ly.fsf@starship.python.net>

David Goodger <goodger@users.sourceforge.net> writes:

> Michael Hudson wrote:
> > I have an application where I want to programmatically use docutils.
> > 
> > So, I have a chunk of text (or a file, both are equally easy) in rst
> > format that I want to turn into html (again, either a file, or as
> > text).
> > 
> > This doesn't seem to be totally straightforward.
> 
> Questions and constructive suggestions are always appreciated.

First off, let me apologise for being a bit peeved when writing last
night's mail.  It wasn't all docutils' fault :)

I realise this is a work in progress, etc.  I hope I'll be able to
supply some patches as well as just grousing.

> > Clearly, I haven't really read the docs.
> 
> That's understandable, since there are none yet for what you're doing.
> But you must have read the source to get this far.  Look at the
> docstrings of the classes you're using and their __init__ methods for
> details.  I've employed "attribute docstrings" liberally, in
> anticipation of future tool support.

I'm afraid the docstrings slightly suffer from the problem of only
making sense if you already understand what's going on.

I guess I'm missing the "big picture", and docstrings aren't really
the place to get that.

> > I have code like this:
> > 
> >     from docutils.core import Publisher
> >     from docutils.io import FileInput
> > 
> >     pub = Publisher(source=FileInput(None, source_path=filename),
> >                     destination=htname)
> 
> What is "htname"?  The "destination" parameter has to be a
> docutils.io.Output instance, such as FileOutput or StringOutput.

Yeah, I realised that I'd need to fiddle that sooner or later, but I
hadn't got that far...

But, here's a real point: WHY?

WHY can't I just pass a filename or a file-like object in as
destination?  Having to wrap things up in a layer of library specific
classes rubs me up the wrong way.
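
To show the kind of interface meant here, a minimal sketch of a helper
that accepts either a filename or an open file-like object.  This is
not how the Docutils I/O classes work; ``open_destination`` and
``write_output`` are hypothetical names, and UTF-8 is an assumed
encoding.

```python
from contextlib import contextmanager


@contextmanager
def open_destination(destination):
    """Yield a writable text stream for a path or a file-like object."""
    if hasattr(destination, "write"):
        # The caller owns an existing stream, so leave it open afterwards.
        yield destination
    else:
        # Anything else is treated as a path and closed when done.
        with open(destination, "w", encoding="utf-8") as stream:
            yield stream


def write_output(text, destination):
    """Write `text` to a filename or to an already open stream."""
    with open_destination(destination) as stream:
        stream.write(text)
```

The wrapper classes do carry extra state such as options and encoding,
which a duck-typed helper like this would have to take some other way.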

> >     pub.set_options()
> >     pub.set_reader('standalone', None, 'restructuredtext')
> >     pub.set_writer('html')
> >     pub.publish()
> > 
> > But that's not enough -- I have to cook up mythical 'option' objects
> > from somewhere.
> 
> Just replace "pub.set_options()" with::
> 
>     options = pub.set_options()

Ah!  OK.

> "Publisher.set_options()" sets *and* returns the option values object.

This is not an intuitive interface, to my mind.

> You'll need to pass the options object to the I/O instantiators.  This
> code should work (assuming you want a string return value)::
> 
>     from docutils.core import Publisher
>     from docutils.io import FileInput, StringOutput
> 
>     pub = Publisher()
>     options = pub.set_options()
>     pub.source = FileInput(options, source_path=filename)
>     pub.destination = StringOutput(options)
>     pub.set_reader('standalone', None, 'restructuredtext')
>     pub.set_writer('html')
>     output = pub.publish()

Thanks, I'll give it a try momentarily.

> > Is there an easy way of doing what I want?
> 
> The "docutils.core.publish()" convenience function is an easy way to
> get file-to-file processing.  The to-do list has an entry for a
> string-to-string processing convenience function, which I can whip up
> quickly if it will help.  But I can't make one for every possible
> combination; that's up to the developer.

Of course.

I know I'm going to have to get a bit more cosy with the library at
some point: I don't really want to go .rst file -> .html file, I want
to go chunk of restructured text -> chunk of html, and I'll want to
know what the heading of the chunk of restructured text actually was.

> > There should be.

[aside: this should be read as "there should be, at some point", not
"there should be, already"]

> The standard disclaimer applies: if something isn't there, it's either
> because nobody has needed it yet or because nobody has had time to add
> it yet.  Patches are welcome!

I have a couple of train journeys this weekend; I'll see what I can
come up with.

Cheers,
M.

-- 
  I'm a keen cyclist and I stop at red lights.  Those who don't need
  hitting with a great big slapping machine.
                                           -- Colin Davidson, cam.misc


From goodger@users.sourceforge.net  Sat Oct 12 01:48:36 2002
From: goodger@users.sourceforge.net (David Goodger)
Date: Fri, 11 Oct 2002 20:48:36 -0400
Subject: [Doc-SIG] simple use of docutils
In-Reply-To: <2m3crd46ly.fsf@starship.python.net>
Message-ID: <B9CCE9A3.2A41B%goodger@users.sourceforge.net>

Michael Hudson wrote:
> First off, let me apologise for being a bit peeved when writing last
> night's mail.  It wasn't all docutils' fault :)

Not to worry, no offense taken. :-)

> I realise this is a work in progress, etc.

Indeed!  I'm currently puzzling out a re-design of the transform
mechanism, since the original/current design proved that it wasn't up
to snuff.  Also, there are many small things that crop up.  For
example, I'm considering renaming "options" to "settings".  (Effective
naming is very important to me. It helps me keep the design in my
head.)

Please consider the entire codebase and design to be extremely
experimental, subject to and welcoming of change.  Improvements are
accepted in any form from any source.  I try to distance myself from
the existing code and design, to objectively judge it against
alternatives (not always easy).

If you'd like to join the project, just let me know.  That goes for
anybody who's interested in helping with Docutils.

> I hope I'll be able to supply some patches as well as just grousing.

That would be most welcome (on both counts ;-).

> I'm afraid the docstrings slightly suffer from the problem of only
> making sense if you already understand what's going on.

I think it's hard for the developer (me) to objectively document their
own code, since they *do* already understand it.  Contributions in
this regard (questions, suggestions, actual docs and/or docstrings)
would also be welcome!

> I guess I'm missing the "big picture", and docstrings aren't really
> the place to get that.

The only "big picture" can be found in PEPs 256 & 258, especially the
latter.  They're kept up to date.

> WHY can't I just pass a filename or a file-like object in as
> destination?

I'm trying to provide a uniform interface, no matter the input source
(string, single file, multiple files in directories, Python module,
Python package) and output destination (string, single file, multiple
files), not all of which are implemented yet.  The I/O classes store
attributes of their data stores, such as paths and encodings; the I/O
classes handle the text decoding & encoding.

Having said that, if there's a simpler way then I'm all ears.

> Having to wrap things up in a layer of library specific classes rubs
> me up the wrong way.

Does the above justify it to you now?  If not, I'm open to
suggestions.  Although I think the I/O classes are a decent solution,
I'm not 100% sure they don't smell bad.  Sometimes it's hard to tell
until you've seen a better solution.

>> Just replace "pub.set_options()" with::
>> 
>>     options = pub.set_options()
> 
> Ah!  OK.
> 
>> "Publisher.set_options()" sets *and* returns the option values
>> object.
> 
> This is not an intuitive interface, to my mind.

Agreed.  Returning the value was an afterthought.  How about this
instead? ::

    pub.set_options()
    options = pub.options

>> You'll need to pass the options object to the I/O instantiators.
>> This code should work (assuming you want a string return value)::
>> 
>>     from docutils.core import Publisher
>>     from docutils.io import FileInput, StringOutput
>> 
>>     pub = Publisher()
>>     options = pub.set_options()
>>     pub.source = FileInput(options, source_path=filename)
>>     pub.destination = StringOutput(options)
>>     pub.set_reader('standalone', None, 'restructuredtext')
>>     pub.set_writer('html')
>>     output = pub.publish()
> 
> Thanks, I'll give it a try momentarily.

That code won't work.  As I wrote to Aahz, the "pub.set_reader" and
"pub.set_writer" calls have to come *before* "pub.set_options".  My
mistake, sorry.

> I know I'm going to have to get a bit more cosy with the library at
> some point:

Out of curiosity, what will you be using Docutils for?

> I don't really want to go .rst file -> .html file, I want to go
> chunk of restructured text -> chunk of html, and I'll want to know
> what the heading of the chunk of restructured text actually was.

I don't understand what you mean by "the heading" here.  Can you
explain?

> I have a couple of train journeys this weekend; I'll see what I can
> come up with.

Cool.

-- 
David Goodger  <goodger@users.sourceforge.net>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/


From mwh@python.net  Mon Oct 14 12:17:04 2002
From: mwh@python.net (Michael Hudson)
Date: 14 Oct 2002 12:17:04 +0100
Subject: [Doc-SIG] simple use of docutils
In-Reply-To: David Goodger's message of "Fri, 11 Oct 2002 20:48:36 -0400"
References: <B9CCE9A3.2A41B%goodger@users.sourceforge.net>
Message-ID: <2mu1jpnubz.fsf@starship.python.net>

David Goodger <goodger@users.sourceforge.net> writes:

> Michael Hudson wrote:
> > First off, let me apologise for being a bit peeved when writing last
> > night's mail.  It wasn't all docutils' fault :)
> 
> Not to worry, no offense taken. :-)

Good.

> > I realise this is a work in progress, etc.
> 
> Indeed!  I'm currently puzzling out a re-design of the transform
> mechanism, since the original/current design proved that it wasn't up
> to snuff.  Also, there are many small things that crop up.  For
> example, I'm considering renaming "options" to "settings".  (Effective
> naming is very important to me. It helps me keep the design in my
> head.)
> 
> Please consider the entire codebase and design to be extremely
> experimental, subject to and welcoming of change.  Improvements are
> accepted in any form from any source.  I try to distance myself from
> the existing code and design, to objectively judge it against
> alternatives (not always easy).
> 
> If you'd like to join the project, just let me know.  That goes for
> anybody who's interested in helping with Docutils.

Maybe.  I'm not really at that point (yet?).

> > I hope I'll be able to supply some patches as well as just grousing.
> 
> That would be most welcome (on both counts ;-).
> 
> > I'm afraid the docstrings slightly suffer from the problem of only
> > making sense if you already understand what's going on.
> 
> I think it's hard for the developer (me) to objectively document their
> own code, since they *do* already understand it.  Contributions in
> this regard (questions, suggestions, actual docs and/or docstrings)
> would also be welcome!

OK.

> > I guess I'm missing the "big picture", and docstrings aren't really
> > the place to get that.
> 
> The only "big picture" can be found in PEPs 256 & 258, especially the
> latter.  They're kept up to date.

PEP 258 seems to be what I needed.

> > WHY can't I just pass a filename or a file-like object in as
> > destination?
> 
> I'm trying to provide a uniform interface, no matter the input source
> (string, single file, multiple files in directories, Python module,
> Python package) and output destination (string, single file, multiple
> files), not all of which are implemented yet.  The I/O classes store
> attributes of their data stores, such as paths and encodings; the I/O
> classes handle the text decoding & encoding.

I object (faintly :) to your invention of new classes for this.  Why
not use what's already there, i.e. the file interface?

If you want strings, use a StringIO.  The codecs.Stream{Reader,Writer}
classes handle encodings.  Etc.
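
A minimal sketch of this suggestion in present-day Python, using only
the standard library (where StringIO now lives in the io module).
codecs.getwriter() returns the StreamWriter class for an encoding, which
can wrap any byte stream.  The render() function is a hypothetical
stand-in for a processor that only needs an object with write(); it is
not part of Docutils.

```python
import codecs
import io


def render(text, destination):
    """Hypothetical processor: needs only a file-like object with write()."""
    destination.write(text.upper())


# String destination: an in-memory text buffer.
string_dest = io.StringIO()
render("caf\u00e9", string_dest)
as_string = string_dest.getvalue()

# Encoded destination: a StreamWriter wrapped around a byte stream.
byte_stream = io.BytesIO()
utf8_writer = codecs.getwriter("utf-8")(byte_stream)
render("caf\u00e9", utf8_writer)
as_bytes = byte_stream.getvalue()
```

The caller chooses the destination and the encoding; the processor
itself never needs to know which one it was given.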

> Having said that, if there's a simpler way then I'm all ears.

See above, maybe.

> > Having to wrap things up in a layer of library specific classes rubs
> > me up the wrong way.
> 
> Does the above justify it to you now?  If not, I'm open to
> suggestions.  Although I think the I/O classes are a decent solution,
> I'm not 100% sure they don't smell bad.  Sometimes it's hard to tell
> until you've seen a better solution.

It would be nice if you didn't *have* to learn a new set of classes to
use docutils.  The mantra "easy things should be easy, difficult
things should be possible" is one I'm quite attached to.

> >> Just replace "pub.set_options()" with::
> >> 
> >>     options = pub.set_options()
> > 
> > Ah!  OK.
> > 
> >> "Publisher.set_options()" sets *and* returns the option values
> >> object.
> > 
> > This is not an intuitive interface, to my mind.
> 
> Agreed.  Returning the value was an afterthought.  How about this
> instead? ::
> 
>     pub.set_options()
>     options = pub.options

That would be better, I think.  Seems a bit more regular.

> >> You'll need to pass the options object to the I/O instantiators.
> >> This code should work (assuming you want a string return value)::
> >> 
> >>     from docutils.core import Publisher
> >>     from docutils.io import FileInput, StringOutput
> >> 
> >>     pub = Publisher()
> >>     options = pub.set_options()
> >>     pub.source = FileInput(options, source_path=filename)
> >>     pub.destination = StringOutput(options)
> >>     pub.set_reader('standalone', None, 'restructuredtext')
> >>     pub.set_writer('html')
> >>     output = pub.publish()
> > 
> > Thanks, I'll give it a try momentarily.
> 
> That code won't work.  As I wrote to Aahz, the "pub.set_reader" and
> "pub.set_writer" calls have to come *before* "pub.set_options".  My
> mistake, sorry.

I've gotten this to work, in the end.  The code's on my laptop though,
so I can't post it just yet.

> > I know I'm going to have to get a bit more cosy with the library at
> > some point:
> 
> Out of curiosity, what will you be using Docutils for?

I was thinking of writing a blog tool.  I'm not sure another blog tool
is what the world most desperately needs right now, but all the ones I
can find annoy me in non-trivial ways (basically by being too
complicated -- I am *not* going to install MySQL just for a blog!).

> > I don't really want to go .rst file -> .html file, I want to go
> > chunk of restructured text -> chunk of html, and I'll want to know
> > what the heading of the chunk of restructured text actually was.
> 
> I don't understand what you mean by "the heading" here.  Can you
> explain?

Well, I'm assuming that each blog entry will start with a heading
line.  That's all.  It seems docutils already knows about this sort of
thing, as it ends up in the <title> tag...

> > I have a couple of train journeys this weekend; I'll see what I can
> > come up with.
> 
> Cool.

Unfortunately <wink> most of my train journeys were spent playing ev
nova...

Cheers,
M.

-- 
  I have a cat, so I know that when she digs her very sharp claws into
  my chest or stomach it's really a sign of affection, but I don't see
  any reason for programming languages to show affection with pain.
                                        -- Erik Naggum, comp.lang.lisp


From goodger@users.sourceforge.net  Wed Oct 16 00:04:44 2002
From: goodger@users.sourceforge.net (David Goodger)
Date: Tue, 15 Oct 2002 19:04:44 -0400
Subject: [Doc-SIG] simple use of docutils
In-Reply-To: <2mu1jpnubz.fsf@starship.python.net>
Message-ID: <B9D2174A.2A61E%goodger@users.sourceforge.net>

Michael Hudson wrote:
>>> WHY can't I just pass a filename or a file-like object in as
>>> destination?
>> 
>> I'm trying to provide a uniform interface, no matter the input
>> source (string, single file, multiple files in directories, Python
>> module, Python package) and output destination (string, single
>> file, multiple files), not all of which are implemented yet.  The
>> I/O classes store attributes of their data stores, such as paths
>> and encodings; the I/O classes handle the text decoding & encoding.
> 
> I object (faintly :) to your invention of new classes for this.  Why
> not use what's already there, i.e. the file interface?
> 
> If you want strings, use a StringIO.  The codecs.Stream{Reader,Writer}
> classes handle encodings.  Etc.
...
> It would be nice if you didn't *have* to learn a new set of classes
> to use docutils.  The mantra "easy things should be easy, difficult
> things should be possible" is one I'm quite attached to.

I can see your point, and I agree with what you say in the abstract.
The I/O classes may be a case of adding functionality before it's
required.  However, I'm currently thinking about adding *more*
functionality to these classes.  When you combine all the features
Docutils needs from its I/O, I'm not convinced that doing it piecemeal
is better:

* Multiple input sources: single files, directory trees, Python
  packages, strings.

* Multiple output destinations: single files, directory trees,
  strings.

* Transforms (& maybe command-line options) associated with the
  source/destination type.  For example, a "split a monolithic
  document tree into multiple doctrees" transform for the directory
  tree output destination.  This is something we're currently
  discussing on the docutils-develop list.

* Encoding support.

The I/O classes are really just implementation details.  Perhaps they
wouldn't be objectionable if better convenience functions existed?
``docutils.core.publish()`` provides a dirt-simple interface for
file-to-file command-line processing (including stdin-to-stdout).
Would a ``publish_string`` convenience function (providing
string-to-string programmatic processing) appease you?  Given that,
you could do your own I/O.

Of course, I'm open to convincing arguments.  Or better yet,
convincing code. :-)

-- 
David Goodger  <goodger@users.sourceforge.net>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/


From Paul.Moore@atosorigin.com  Wed Oct 16 09:11:47 2002
From: Paul.Moore@atosorigin.com (Moore, Paul)
Date: Wed, 16 Oct 2002 09:11:47 +0100
Subject: [Doc-SIG] simple use of docutils
Message-ID: <16E1010E4581B049ABC51D4975CEDB885E2D18@UKDCX001.uk.int.atosorigin.com>

From: David Goodger [mailto:goodger@users.sourceforge.net]
> When you combine all the features Docutils needs from its I/O,
> I'm not convinced that doing it piecemeal is better:
>
> * Multiple input sources: single files, directory trees, Python
>   packages, strings.
>
> * Multiple output destinations: single files, directory trees,
>   strings.
>
> * Transforms (& maybe command-line options) associated with the
>   source/destination type.  For example, a "split a monolithic
>   document tree into multiple doctrees" transform for the directory
>   tree output destination.  This is something we're currently
>   discussing on the docutils-develop list.
>
> * Encoding support.

Actually, I can imagine applications where I'd like all these features
in general file I/O. Instead of writing docutils-specific classes, a
better approach would be to write file-object wrappers.

Example::

    from fileutils import tee, transform

    f = open("myoutput", "w")
    f = tee(open("outputcopy", "w"), f)
    f = transform(my_function, f)

    # Now just write to f...

You could write some great pipeline-style code in there...
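
A rough sketch of what such wrappers might look like, assuming only
that a destination needs a write() method.  The Tee and Transform
classes here are hypothetical illustrations; no "fileutils" package is
known to provide them.

```python
import io


class Tee:
    """Write-only file-like object that duplicates writes to several files."""

    def __init__(self, *files):
        self.files = files

    def write(self, data):
        for f in self.files:
            f.write(data)
        return len(data)


class Transform:
    """Write-only file-like object that applies func to data before writing."""

    def __init__(self, func, target):
        self.func = func
        self.target = target

    def write(self, data):
        return self.target.write(self.func(data))


# In-memory buffers stand in for the two output files.
primary = io.StringIO()
copy = io.StringIO()
f = Transform(str.upper, Tee(copy, primary))
f.write("hello\n")
```

Because each wrapper is itself file-like, wrappers nest in any order,
which is what makes the pipeline style possible.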

I think this is somewhere where restricting the code to docutils is a
mistake. (For an example of a similar situation, distutils has a
fancy_getopt() which is almost unknown outside of that package - this is
in spite of the fact that it is just a getopt utility, with nothing
distutils-specific about it...)

If there are good utilities that docutils needs, but which are not
docutils-specific, let's package them independently so that the next
person doesn't reinvent the wheel.

Paul.


From goodger@users.sourceforge.net  Thu Oct 17 02:43:56 2002
From: goodger@users.sourceforge.net (David Goodger)
Date: Wed, 16 Oct 2002 21:43:56 -0400
Subject: [Doc-SIG] fileutils (was Re: simple use of docutils)
In-Reply-To: <16E1010E4581B049ABC51D4975CEDB885E2D18@UKDCX001.uk.int.atosorigin.com>
Message-ID: <B9D38E1B.2A833%goodger@users.sourceforge.net>

Moore, Paul wrote:
> Actually, I can imagine applications where I'd like all these
> features in general file I/O. Instead of writing docutils-specific
> classes, a better approach would be to write file-object wrappers.
...
> You could write some great pipeline-style code in there...
> 
> I think this is somewhere where restricting the code to docutils is
> a mistake.

The code is currently only used by Docutils, but it's certainly not
*restricted* in any way.  The code is as open as open can be, just
waiting for a champion to run with it.  Will you be that champion? :-)
Is a "fileutils" project in the cards?

> If there are good utilities that docutils needs, but which are not
> docutils-specific, let's package them independently so that the next
> person doesn't reinvent the wheel.

Beware over-generalization though.  Can these file-wrapper utilities
satisfy all (or the great majority) of uses?  Or will developers have
to roll their own anyhow?  I suspect the latter, because requirements
vary so widely.  And it's so easy to roll your own in Python.

I'm sure there are other parts of Docutils that could be extracted and
repurposed for general use.  The statemachine.py module was intended
in this way (in fact, it was a rewrite of an older module I'd written
for general text filtering, so I *know* it's independently useful).
The test/package_unittest.py module's "loadTestModules" function
extends the standard unittest.py to load and run directories full of
test modules; it should be offered back to Python in general.  There
are probably other examples, but I'm too close to the code to notice.

Any volunteers?

> (For an example of a similar situation, distutils has a
> fancy_getopt() which is almost unknown outside of that package -
> this is in spite of the fact that it is just a getopt utility, with
> nothing distutils-specific about it...)

Note that fancy_getopt() has been superseded by Optik
(http://optik.sf.net/), which will be joining the standard library for
Python 2.3, probably as OptionParser.py.  Note also that Docutils is
already using it in the optik.py module (which will be replaced by the
official module once it exists; Optik is currently a package).

-- 
David Goodger  <goodger@users.sourceforge.net>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/


From mwh@python.net  Thu Oct 17 11:30:17 2002
From: mwh@python.net (Michael Hudson)
Date: 17 Oct 2002 11:30:17 +0100
Subject: [Doc-SIG] simple use of docutils
In-Reply-To: David Goodger's message of "Tue, 15 Oct 2002 19:04:44 -0400"
References: <B9D2174A.2A61E%goodger@users.sourceforge.net>
Message-ID: <2m4rbl1hom.fsf@starship.python.net>

David Goodger <goodger@users.sourceforge.net> writes:

> Michael Hudson wrote:
> >>> WHY can't I just pass a filename or a file-like object in as
> >>> destination?
> >> 
> >> I'm trying to provide a uniform interface, no matter the input
> >> source (string, single file, multiple files in directories, Python
> >> module, Python package) and output destination (string, single
> >> file, multiple files), not all of which are implemented yet.  The
> >> I/O classes store attributes of their data stores, such as paths
> >> and encodings; the I/O classes handle the text decoding & encoding.
> > 
> > I object (faintly :) to your invention of new classes for this.  Why
> > not use what's already there, i.e. the file interface?
> > 
> > If you want strings, use a StringIO.  The codecs.Stream{Reader,Writer}
> > classes handle encodings.  Etc.
> ...
> > It would be nice if you didn't *have* to learn a new set of classes
> > to use docutils.  The mantra "easy things should be easy, difficult
> > things should be possible" is one I'm quite attached to.
> 
> I can see your point, and I agree with what you say in the abstract.
> The I/O classes may be a case of adding functionality before it's
> required.  However, I'm currently thinking about adding *more*
> functionality to these classes.  When you combine all the features
> Docutils needs from its I/O, I'm not convinced that doing it piecemeal
> is better:
> 
> * Multiple input sources: single files, directory trees, Python
>   packages, strings.

What's the interface going to be for these?  Here's a suggestion: a
file-like object or something you can iterate over to get file-like
objects.  That covers all the above rather easily.
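
One way that interface could be sketched, assuming a source is either
a single file-like object or an iterable of them.  The names
iter_sources() and read_all() are hypothetical and not part of
Docutils.

```python
import io


def iter_sources(source):
    """Yield file-like objects from one file-like object or an iterable of them."""
    # Check for read() first: a file object is itself iterable (over its
    # lines), so iterating it directly would yield strings, not files.
    if hasattr(source, "read"):
        yield source
    else:
        yield from source


def read_all(source):
    """Read every source in turn and return the texts as a list."""
    return [f.read() for f in iter_sources(source)]


single = read_all(io.StringIO("one"))
several = read_all([io.StringIO("a"), io.StringIO("b")])
```

A directory tree or a Python package would then just be another
iterable that opens and yields one file at a time.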

> * Multiple output destinations: single files, directory trees,
>   strings.

single files & strings are easily handled by file-like objects.  What
interface do you suggest for outputting to multiple files?

> * Transforms (& maybe command-line options) associated with the
>   source/destination type.  For example, a "split a monolithic
>   document tree into multiple doctrees" transform for the directory
>   tree output destination.  This is something we're currently
>   discussing on the docutils-develop list.

Well, I don't know what this means, so I can't really comment.

Sigh, another mailing list to join...

> * Encoding support.

This is what codecs.Stream{Reader,Writer} are for.

> The I/O classes are really just implementation details.  Perhaps they
> wouldn't be objectionable if better convenience functions existed?

Perhaps.

> ``docutils.core.publish()`` provides a dirt-simple interface for
> file-to-file command-line processing (including stdin-to-stdout).
> Would a ``publish_string`` convenience function (providing
> string-to-string programmatic processing) appease you?  Given that,
> you could do your own I/O.

True.

> Of course, I'm open to convincing arguments.  Or better yet,
> convincing code. :-)

Still no progress on this front, I'm afraid.

Cheers,
M.

-- 
  US elections
  For those of you fearing that the rest of the world might be 
  making fun of the US because of this: Rest assured, we are.
         -- http://www.advogato.org/person/jameson/diary.html?start=12


From neal@metaslash.com  Fri Oct 18 01:28:32 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Thu, 17 Oct 2002 20:28:32 -0400
Subject: [Doc-SIG] broken \ref links
Message-ID: <20021018002832.GA21485@epoch.metaslash.com>

The second oldest bug on SF is:  http://python.org/sf/217195

\ref links are broken when there are multiple \refs on the same line.
The problem seems to be in Doc/tools/node2label.pl around lines 47-57.

I really don't know perl.  I'm afraid to learn, :-) otherwise I'd
suggest a fix.  If someone has suggestions though, I will try them.

Neal


From neal@metaslash.com  Fri Oct 18 01:44:40 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Thu, 17 Oct 2002 20:44:40 -0400
Subject: [Doc-SIG] More broken \ref links
Message-ID: <20021018004440.GB21485@epoch.metaslash.com>

I guessed by replacing chop() with chomp() and this seemed to work.
chop() removes the last char while chomp() removes whitespace.
Is that correct?  Is removing whitespace what was desired?

clueless-ly y'rs,
Neal


From fdrake@acm.org  Fri Oct 18 02:03:10 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 17 Oct 2002 21:03:10 -0400
Subject: [Doc-SIG] More broken \ref links
In-Reply-To: <20021018004440.GB21485@epoch.metaslash.com>
References: <20021018004440.GB21485@epoch.metaslash.com>
Message-ID: <15791.24014.651944.165446@grendel.zope.com>

Neal Norwitz writes:
 > I guessed by replacing chop() with chomp() and this seemed to work.
 > chop() removes the last char while chomp() removes whitespace.
 > Is that correct?  Is removing whitespace what was desired?

Interesting!  I don't think we're trying to remove arbitrary
whitespace there, so switching to chomp() may be just the ticket.
(Jeremy wrote the original version of that script, and only a real
Perl programmer would be able to decipher it now.)

If you don't see any ill effects, go ahead and check in the change,
and I'll try and take a closer look at the result tomorrow -- no more
time tonight; sorry.

Thanks for taking some time to look at the doc issues!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From trentm@ActiveState.com  Fri Oct 18 03:08:04 2002
From: trentm@ActiveState.com (Trent Mick)
Date: Thu, 17 Oct 2002 19:08:04 -0700
Subject: [Doc-SIG] broken \ref links
In-Reply-To: <20021018002832.GA21485@epoch.metaslash.com>; from neal@metaslash.com on Thu, Oct 17, 2002 at 08:28:32PM -0400
References: <20021018002832.GA21485@epoch.metaslash.com>
Message-ID: <20021017190804.A9473@ActiveState.com>

[Neal Norwitz wrote]
> The second oldest bug on SF is:  http://python.org/sf/217195
> 
> \ref links are broken when there are multiple \refs on the same line.
> The problem seems to be in Doc/tools/node2label.pl around lines 47-57.
> 
> I really don't know perl.  I'm afraid to learn, :-) otherwise I'd
> suggest a fix.  If someone has suggestions though, I will try them.

... I might eventually have gotten there, but then I saw that Neal found
    the problem (use chomp() instead of chop()). I can verify that that
    is the problem. Read on if you want to see why.


I'll give a quick try by documenting the code in question:

    while (<>) {

# This while loop runs once for each input line (where the input files
# are '*.html' as called by Doc/tools/mkhowto).

      # don't want to do one s/// per line per node
      # so look for lines with hrefs, then do s/// on nodes present
      if (/(HREF|href)=[\"\']node\d+\.html[\#\"\']/) {

# The current line ($_) being processed has one or more HREF="..."
# strings in it. The line mentioned in the bug (from my Python 2.2 doc
# build) is:
#  '<A HREF="node87.html#try">7.4</A> and <tt class="keyword">raise</tt> statement in section <A href="node77.html#raise">6.9</A>.\n'
#

        @parts = split(/(HREF|href)\=[\"\']/);

# to use Python list syntax:
# parts = ['<A ', 'HREF',
#          'node87.html#try">7.4</A> and <tt class="keyword">raise</tt> statement in '
#          'section <A ', 'href', 'node77.html#raise">6.9</A>.\n']

        shift @parts;

# parts = ['HREF',
#          'node87.html#try">7.4</A> and <tt class="keyword">raise</tt> statement in '
#          'section <A ', 'href', 'node77.html#raise">6.9</A>.\n']

        for $node (@parts) {

# One pass for each element ($node) of parts.

          $node =~ s/[\#\"\'].*$//g;

# After this:
#   node = 'HREF'
#   node = 'node87.html'
#   node = 'href'
#   node = 'node77.html\n'

          chop($node);    # Neal was right, the bug is here. (See WRONG
                          # below)

# Just want the foo.html part (strip newlines and anything from " or '
# or # on).
#   node = 'HRE'
#   node = 'node87.htm'   <---- WRONG
#   node = 'hre'
#   node = 'node77.html'

          if (defined($nodes{$node})) {
            $label = $nodes{$node};

# This checks whether 'node' is in the nodes dictionary, which is built
# from labels.pl. In my build it looks like:
#   nodes = {
#       'node87.html' : 'try',
#       'node77.html' : 'raise',
#       ...
#   }
# Because "node87.html" was mangled by chop(), this lookup fails.

            if (s/(HREF|href)=([\"\'])$node([\#\"\'])/href=$2$label.html$3/g) {
              s/(HREF|href)=([\"\'])$label.html/href=$2$label.html/g;
              $newnames{$node} = "$label.html";
            }
          }
        }
      }
      print;
    }
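
For readers who prefer Python to Perl, the same steps can be replayed
as a rough sketch.  The chop() and chomp() helpers below are
hypothetical stand-ins that imitate the Perl builtins (chomp with the
default record separator); the input line is the one quoted above.

```python
import re

line = ('<A HREF="node87.html#try">7.4</A> and <tt class="keyword">raise</tt> '
        'statement in section <A href="node77.html#raise">6.9</A>.\n')


def chop(s):
    """Like Perl's chop(): always drop the last character."""
    return s[:-1]


def chomp(s):
    """Like Perl's chomp() with the default $/: drop one trailing newline."""
    return s[:-1] if s.endswith("\n") else s


# split on HREF=" (keeping the captured HREF/href), then "shift" off
# the text before the first match.
parts = re.split(r"""(HREF|href)=["']""", line)[1:]

# Strip everything from the first #, " or ' onwards.
stripped = [re.sub(r"""[#"'].*$""", "", p) for p in parts]

chopped = [chop(s) for s in stripped]    # mangles names with no newline
chomped = [chomp(s) for s in stripped]   # only removes the newline
```

Only the last element ends in a newline, so chop() damages every
earlier one, while chomp() leaves them intact.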


Cheers,
Trent


-- 
Trent Mick
TrentM@ActiveState.com


From neal@metaslash.com  Fri Oct 18 03:14:02 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Thu, 17 Oct 2002 22:14:02 -0400
Subject: [Doc-SIG] More broken \ref links
In-Reply-To: <15791.24014.651944.165446@grendel.zope.com>
References: <20021018004440.GB21485@epoch.metaslash.com>
 <15791.24014.651944.165446@grendel.zope.com>
Message-ID: <20021018021402.GC21485@epoch.metaslash.com>

On Thu, Oct 17, 2002 at 09:03:10PM -0400, Fred L. Drake, Jr. wrote:
> 
> Neal Norwitz writes:
>  > I guessed by replacing chop() with chomp() and this seemed to work.
>  > chop() removes the last char while chomp() removes whitespace.
>  > Is that correct?  Is removing whitespace what was desired?
> 
> Interesting!  I don't think we're trying to remove arbitrary
> whitespace there, so switching to chomp() may be just the ticket.
> (Jeremy wrote the original version of that script, and only a real
> Perl programmer would be able to decipher it now.)
> 
> If you don't see any ill effects, go ahead and check in the change,
> and I'll try and take a closer look at the result tomorrow -- no more
> time tonight; sorry.

I made the docs before changing node2label, saved the output of lib and ref,
then re-made after the change.  The only differences appeared to be correct.

Note:  I had a problem building the tutorial both before and after the change:

*** Session transcript and error messages are in Doc/html/tut/tut.how.

        (texinputs/boilerplate.tex) (tut.aux)
        Runaway argument?
        {\contentsline {chapter}{\numberline {9}Cl
        ! File ended while scanning use of \@writefile

> Thanks for taking some time to look at the doc issues!

Somebody ought to help you from time to time. :-)

If this change fixes the problem, can you close the bug?
Also we need to backport the change to 2.2.

Neal


From goodger@users.sourceforge.net  Fri Oct 18 06:22:24 2002
From: goodger@users.sourceforge.net (David Goodger)
Date: Fri, 18 Oct 2002 01:22:24 -0400
Subject: [Doc-SIG] Updates to Docutils
Message-ID: <B9D512CF.2A92C%goodger@users.sourceforge.net>

* Improved vertical whitespace issues with HTML output.  This affects
  first and last elements in containers, and removes unnecessary top
  and bottom margins (resp.).  For example, a literal block at the top
  of a table cell won't have a ridiculous top margin.

* Changes to API details that affect client code:

  - Renamed transform method "transform" to "apply".

  - Renamed "options" to "settings" for runtime settings (as set by
    command-line options).  Sometimes "option" (singular) became
    "settings" (plural).  Some variations below:

    - document.options -> document.settings (stored in other objects
      as well)
    - option_spec -> settings_spec (not directives though)
    - OptionSpec -> SettingsSpec
    - cmdline_options -> settings_spec
    - relative_path_options -> relative_path_settings
    - option_default_overrides -> settings_default_overrides
    - core.Publisher.set_options -> core.Publisher.get_settings

  - Renamed core.publish() to core.publish_cmdline(), and added
    placeholders for new publish_string() and publish_file()
    convenience functions.

  Please make corresponding changes to client code.  I'll be happy to
  help if it's non-trivial.  I've updated many of the modules in the
  sandbox, but I may not have changed everything (or I may have
  changed too much).  Individual authors, please test the updated
  code.  Thanks.

  Gunnar, I wasn't able to update DocFactory.  It seemed to be using
  "options" in different ways, doing its own config file parsing, and
  I didn't know which "options" to change to "settings".

* Some bug fixes.

-- 
David Goodger  <goodger@users.sourceforge.net>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/


From fdrake@acm.org  Fri Oct 18 16:59:02 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 18 Oct 2002 11:59:02 -0400
Subject: [Doc-SIG] broken \ref links
In-Reply-To: <20021017190804.A9473@ActiveState.com>
References: <20021018002832.GA21485@epoch.metaslash.com>
 <20021017190804.A9473@ActiveState.com>
Message-ID: <15792.12230.36749.530237@grendel.zope.com>

Trent Mick writes:
 > I'll give a quick try by documenting the code in question:

Thanks, this really helps!

 > # One pass for each element ($node) of parts.
 > 
 >           $node =~ s/[\#\"\'].*$//g;
 > 
 > # After this:
 > #   node = 'HREF'
 > #   node = 'node87.html'
 > #   node = 'href'
 > #   node = 'node77.html\n'
 > 
 >           chop($node);    # Neal was right, the bug is here. (See WRONG
 >                           # below)
 > 
 > # Just want the foo.html part (strip newlines and anything from " or '
 > # or # on.
 > #   node = 'HRE'
 > #   node = 'node87.htm'   <---- WRONG
 > #   node = 'hre'
 > #   node = 'node77.html'

This makes me think the first transform after the split is wrong;
should we just change that and drop the chomp() altogether?  So the
result would be:

    ...
    @parts = split(/(HREF|href)\=[\"\']/);
    shift @parts;
    for $node (@parts) {
      $node =~ s/[\#\"\'].*\n?//g;
      if (defined($nodes{$node})) {
	$label = $nodes{$node};
	if (s/(HREF|href)=([\"\'])$node([\#\"\'])/href=$2$label.html$3/g) {
	  s/(HREF|href)=([\"\'])$label.html/href=$2$label.html/g;
	  $newnames{$node} = "$label.html";
	}
      }
    }
    ...

I'll give this a try.
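For readers who do not read Perl, the three variants under discussion can be modelled roughly in Python.  This is only an illustrative sketch: the function names and the sample strings are made up here, and it does not reproduce the rest of node2label.pl.

```python
import re

# From the first '#', '"' or "'" to the end of the line (newline is left alone).
ORIGINAL_RE = re.compile(r"""[#"'].*$""")
# Same, but also swallow the newline when one follows.
PROPOSED_RE = re.compile(r"""[#"'].*\n?""")


def original(node):
    """Original code: substitution, then chop() drops the last character."""
    node = ORIGINAL_RE.sub("", node)
    return node[:-1]


def chomp_fix(node):
    """Same substitution, then chomp() drops one trailing newline only."""
    node = ORIGINAL_RE.sub("", node)
    return node[:-1] if node.endswith("\n") else node


def regex_fix(node):
    """Single substitution that also consumes the newline; no chop or chomp."""
    return PROPOSED_RE.sub("", node)


# Hypothetical inputs: one without and one with a trailing newline.
samples = ['node87.html">next</a>', 'node77.html#l2h-1">\n']
for s in samples:
    print(repr(original(s)), repr(chomp_fix(s)), repr(regex_fix(s)))
```

Only `original()` damages the name that has no trailing newline; the other two variants agree on both samples.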


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From fdrake@acm.org  Fri Oct 18 17:02:36 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 18 Oct 2002 12:02:36 -0400
Subject: [Doc-SIG] More broken \ref links
In-Reply-To: <20021018021402.GC21485@epoch.metaslash.com>
References: <20021018004440.GB21485@epoch.metaslash.com>
 <15791.24014.651944.165446@grendel.zope.com>
 <20021018021402.GC21485@epoch.metaslash.com>
Message-ID: <15792.12444.286320.744394@grendel.zope.com>

Neal Norwitz writes:
 > I made the docs before changing node2label, saved the output of lib and ref,
 > then re-made after the change.  The only differences appeared to be correct.

Looks good; I've done the backport and closed the related issues.
I'll bet you didn't know you'd fixed two of them before you saw the
checkins!  ;-)

 > Note:  I had a problem building the tutorial both before and after the change:
 > 
 > *** Session transcript and error messages are in Doc/html/tut/tut.how.
 > 
 >         (texinputs/boilerplate.tex) (tut.aux)
 >         Runaway argument?
 >         {\contentsline {chapter}{\numberline {9}Cl
 >         ! File ended while scanning use of \@writefile

I can't reproduce this.  What software are you using?  (Version
numbers and packages used for the LaTeX installation may be
particularly interesting; if you can email the entire tut.how file,
I'll take a look at that.)

 > Somebody ought to help you from time to time. :-)

I certainly appreciate it when it happens!  Thanks!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From neal@metaslash.com  Fri Oct 18 17:11:49 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Fri, 18 Oct 2002 12:11:49 -0400
Subject: [Doc-SIG] More broken \ref links
In-Reply-To: <15792.12444.286320.744394@grendel.zope.com>
References: <20021018004440.GB21485@epoch.metaslash.com>
 <15791.24014.651944.165446@grendel.zope.com>
 <20021018021402.GC21485@epoch.metaslash.com>
 <15792.12444.286320.744394@grendel.zope.com>
Message-ID: <20021018161149.GE21485@epoch.metaslash.com>

--Boundary_(ID_kei8NiJ6qc8PSlTiFowWvg)
Content-type: text/plain; charset=us-ascii
Content-transfer-encoding: 7BIT
Content-disposition: inline

On Fri, Oct 18, 2002 at 12:02:36PM -0400, Fred L. Drake, Jr. wrote:
> 
> Neal Norwitz writes:
>  > I made the docs before changing node2label, saved the output of lib and ref,
>  > then re-made after the change.  The only differences appeared to be correct.
> 
> Looks good; I've done the backport and closed the related issues.
> I'll bet you didn't know you'd fixed two of them before you saw the
> checkins!  ;-)

There's a bet you would win! :-)

>  > Note:  I had a problem building the tutorial both before and after the change:
>  > 
>  > *** Session transcript and error messages are in Doc/html/tut/tut.how.
>  > 
>  >         (texinputs/boilerplate.tex) (tut.aux)
>  >         Runaway argument?
>  >         {\contentsline {chapter}{\numberline {9}Cl
>  >         ! File ended while scanning use of \@writefile
> 
> I can't reproduce this.  What software are you using?  (Version
> numbers and packages used for the LaTeX installation may be
> particularly interesting; if you can email the entire tut.how file,
> I'll take a look at that.)

RedHat 7.2

[neal@epoch util]$ latex2html --version
This is LaTeX2HTML Version 2K.1beta (1.47)

[neal@epoch util]$ rpm -qa | grep -i latex
tetex-latex-1.0.7-30

tut.how is small, so I've attached it here.

Neal

--Boundary_(ID_kei8NiJ6qc8PSlTiFowWvg)
Content-type: text/plain; charset=us-ascii; NAME=tut.how
Content-transfer-encoding: 7BIT
Content-disposition: attachment; filename=tut.how

+++ TEXINPUTS=/home/neal/build/python/dist/src/Doc/tut:/home/neal/build/python/dist/src/Doc/paper-letter:/home/neal/build/python/dist/src/Doc/texinputs:
+++ latex tut
This is TeX, Version 3.14159 (Web2C 7.3.1)
(/home/neal/build/python/dist/src/Doc/tut/tut.tex
LaTeX2e <2000/06/01>
Babel <v3.7h> and hyphenation patterns for american, french, german, ngerman, i
talian, nohyphenation, loaded.
(/home/neal/build/python/dist/src/Doc/texinputs/manual.cls
Document Class: manual 1998/03/03 Document class (Python manual)
(/home/neal/build/python/dist/src/Doc/texinputs/pypaper.sty
(/usr/share/texmf/tex/latex/psnfss/times.sty)
Using Times instead of Computer Modern.
) (/usr/share/texmf/tex/latex/base/report.cls
Document Class: report 2000/05/19 v1.4b Standard LaTeX document class
(/usr/share/texmf/tex/latex/base/size10.clo))
(/home/neal/build/python/dist/src/Doc/texinputs/fancyhdr.sty)
Using fancier footers than usual.
(/home/neal/build/python/dist/src/Doc/texinputs/fncychap.sty)
Using fancy chapter headings.
(/home/neal/build/python/dist/src/Doc/texinputs/python.sty
(/usr/share/texmf/tex/latex/tools/longtable.sty)
(/usr/share/texmf/tex/latex/tools/verbatim.sty)
(/usr/share/texmf/tex/latex/base/alltt.sty)))
(/usr/share/texmf/tex/latex/base/fontenc.sty
(/usr/share/texmf/tex/latex/base/t1enc.def))
(/home/neal/build/python/dist/src/Doc/texinputs/boilerplate.tex) (tut.aux)
Runaway argument?
{\contentsline {chapter}{\numberline {9}Cl 
! File ended while scanning use of \@writefile.
<inserted text> 
                \par 
l.14 \begin{document}
                     
? 
! Emergency stop.
<inserted text> 
                \par 
l.14 \begin{document}
                     
No pages of output.
Transcript written on tut.log.
*** Session transcript and error messages are in /home/neal/build/python/dist/src/Doc/html/tut/tut.how.
*** Exited with status 1.

--Boundary_(ID_kei8NiJ6qc8PSlTiFowWvg)--


From fdrake@acm.org  Fri Oct 18 17:14:55 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 18 Oct 2002 12:14:55 -0400
Subject: [Doc-SIG] broken \ref links
In-Reply-To: <15792.12230.36749.530237@grendel.zope.com>
References: <20021018002832.GA21485@epoch.metaslash.com>
 <20021017190804.A9473@ActiveState.com>
 <15792.12230.36749.530237@grendel.zope.com>
Message-ID: <15792.13183.805268.882266@grendel.zope.com>

Fred L. Drake, Jr. writes:
 > This makes me think the first transform after the split is wrong;
 > should we just change that and drop the chomp() altogether?  So the
...
 > I'll give this a try.

This seems to work fine.  All the links Neal fixed still work, and
webchecker is still happy with the results.

Trent, is there any reason to prefer one solution over the other?
(Like one being faster?)

Thanks!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From fdrake@acm.org  Fri Oct 18 17:47:34 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 18 Oct 2002 12:47:34 -0400
Subject: [Doc-SIG] broken \ref links
In-Reply-To: <20021018094213.A16434@ActiveState.com>
References: <20021018002832.GA21485@epoch.metaslash.com>
 <20021017190804.A9473@ActiveState.com>
 <15792.12230.36749.530237@grendel.zope.com>
 <15792.13183.805268.882266@grendel.zope.com>
 <20021018094213.A16434@ActiveState.com>
Message-ID: <15792.15142.321079.227516@grendel.zope.com>

Trent Mick writes:
 > Speed is not really an issue here. The node2label.pl runs are *much*
 > shorter than the associated latex2html runs. I would be inclined to just

Speed is an issue when I'm building a documentation release for the
third time and I'm getting tired of waiting.  Every microsecond
counts!  ;-)

 > use the s/chop/chomp/ fix because there is less new and potentially
 > surprising about it. However, I can't think of anything that would break
 > either fix.

Ok, so I'll leave it alone.  Thanks!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From trentm@ActiveState.com  Fri Oct 18 17:42:13 2002
From: trentm@ActiveState.com (Trent Mick)
Date: Fri, 18 Oct 2002 09:42:13 -0700
Subject: [Doc-SIG] broken \ref links
In-Reply-To: <15792.13183.805268.882266@grendel.zope.com>; from fdrake@acm.org on Fri, Oct 18, 2002 at 12:14:55PM -0400
References: <20021018002832.GA21485@epoch.metaslash.com> <20021017190804.A9473@ActiveState.com> <15792.12230.36749.530237@grendel.zope.com> <15792.13183.805268.882266@grendel.zope.com>
Message-ID: <20021018094213.A16434@ActiveState.com>

[Fred L . Drake wrote]
> 
> Fred L. Drake, Jr. writes:
>  > This makes me think the first transform after the split is wrong;
>  > should we just change that and drop the chomp() altogether?  So the
> ...
>  > I'll give this a try.
> 
> This seems to work fine.  All the links Neal fixed still work, and
> webchecker is still happy with the results.
> 
> Trent, is there any reason to prefer one solution over the other?
> (Like one being faster?)

Speed is not really an issue here. The node2label.pl runs are *much*
shorter than the associated latex2html runs. I would be inclined to just
use the s/chop/chomp/ fix because there is less new and potentially
surprising about it. However, I can't think of anything that would break
either fix.


Trent

-- 
Trent Mick
TrentM@ActiveState.com


From goodger@users.sourceforge.net  Sat Oct 19 00:53:17 2002
From: goodger@users.sourceforge.net (David Goodger)
Date: Fri, 18 Oct 2002 19:53:17 -0400
Subject: [Doc-SIG] simple use of docutils
In-Reply-To: <2m4rbl1hom.fsf@starship.python.net>
Message-ID: <B9D6172C.2A9ED%goodger@users.sourceforge.net>

Michael Hudson wrote:
>> * Multiple input sources: single files, directory trees, Python
>>   packages, strings.
> 
> What's the interface going to be for these?  Here's a suggestion: a
> file-like object or something you can iterate over to get file-like
> objects.  That covers all the above rather easily.

I'm not sure what the interface will be; haven't got there yet.
Suggestion noted.
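As a rough illustration of the suggestion (not an interface Docutils has committed to), a single generator can normalise strings, open streams, single files, and directory trees into a sequence of file-like objects.  The name `iter_sources` and the ".txt" filter are assumptions made for this sketch.

```python
import io
import os


def iter_sources(source):
    """Yield (name, file-like object) pairs for a stream, path, or string."""
    if hasattr(source, "read"):
        # Already a file-like object: pass it through unchanged.
        yield getattr(source, "name", "<stream>"), source
    elif os.path.isdir(source):
        # Directory tree: walk it and open every ".txt" file (assumed suffix).
        for dirpath, _dirnames, filenames in os.walk(source):
            for filename in sorted(filenames):
                if filename.endswith(".txt"):
                    path = os.path.join(dirpath, filename)
                    with open(path, encoding="utf-8") as f:
                        yield path, f
    elif os.path.isfile(source):
        with open(source, encoding="utf-8") as f:
            yield source, f
    else:
        # Anything else is treated as literal document text.
        yield "<string>", io.StringIO(source)


for name, stream in iter_sources("Title\n=====\n\nBody text.\n"):
    print(name, len(stream.read()))
```

Files opened by the generator are closed as soon as the consumer asks for the next item, so each stream has to be read before moving on.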

>> * Multiple output destinations: single files, directory trees,
>>   strings.
> 
> single files & strings are easily handled by file-like objects.
> What interface do you suggest for outputting to multiple files?

Don't know yet. :-)

>> * Transforms (& maybe command-line options) associated with the
>>   source/destination type.  For example, a "split a monolithic
>>   document tree into multiple doctrees" transform for the directory
>>   tree output destination.  This is something we're currently
>>   discussing on the docutils-develop list.
> 
> Well, I don't know what this means, so I can't really comment.

Overview at http://docutils.sf.net/spec/pep-0258.html#transforms

> Sigh, another mailing list to join...

:-)  Sign up for the docutils-develop@lists.sf.net mailing list at
http://lists.sourceforge.net/lists/listinfo/docutils-develop .

>> * Encoding support.
> 
> This is what codecs.Stream{Reader,Writer} are for.

I've had enough trouble grokking Unicode encodings.  The docs for the
codecs module are rather opaque and beyond my patience at the moment.
Do you know of a gentle introduction?

Having gone through the pain of figuring it all out, I'd rather
relieve the client code (and thus, Docutils developers) from the
responsibility of handling Unicode decoding & encoding (unless it/they
want to, of course!).  So it's handled by the I/O classes.  If it's
better to do the handling with codecs.StreamReader/.StreamWriter,
that's an implementation detail.  I'm comfortable with the current
set-up; if you're not: patches welcome!
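For comparison, this is roughly what handling the decoding and encoding with the codecs stream classes looks like.  It is a minimal sketch using only the standard library; the helper names `read_text` and `write_text` are invented for the example and are not part of Docutils.

```python
import codecs
import io


def read_text(byte_stream, encoding="utf-8"):
    """Decode a binary file-like object and return its contents as str."""
    reader = codecs.getreader(encoding)(byte_stream)
    return reader.read()


def write_text(text, byte_stream, encoding="utf-8"):
    """Encode str and write it to a binary file-like object."""
    writer = codecs.getwriter(encoding)(byte_stream)
    writer.write(text)


# Read Latin-1 bytes, write them back out as UTF-8.
source = io.BytesIO("Über".encode("latin-1"))
text = read_text(source, "latin-1")
destination = io.BytesIO()
write_text(text, destination, "utf-8")
print(repr(text), destination.getvalue())
```

Wrapping the stream once at the I/O boundary keeps the rest of the code working with decoded text only, which is the same division of labour the I/O classes aim for.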

>> The I/O classes are really just implementation details.  Perhaps they
>> wouldn't be objectionable if better convenience functions existed?
> 
> Perhaps.
> 
>> ``docutils.core.publish()`` provides a dirt-simple interface for
>> file-to-file command-line processing (including stdin-to-stdout).
>> Would a ``publish_string`` convenience function (providing
>> string-to-string programmatic processing) appease you?  Given that,
>> you could do your own I/O.
> 
> True.

I've added convenience functions to docutils.core, complete with
documentation: "publish_file()" for file-like object I/O (example in
tools/buildhtml.py) and "publish_string()" for string I/O (example in
tools/pep2html.py).  Please take a look.  I've also renamed the old
"publish()" function to "publish_cmdline()" to emphasize its
relationship to command-line front-end tools.

-- 
David Goodger  <goodger@users.sourceforge.net>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/


From bert@chello.at  Sat Oct 19 16:47:35 2002
From: bert@chello.at (engelbert gruber)
Date: Sat, 19 Oct 2002 17:47:35 +0200 (CEST)
Subject: [Doc-SIG] More broken \ref links
In-Reply-To: <20021018004440.GB21485@epoch.metaslash.com>
Message-ID: <Pine.LNX.4.33.0210191742260.2707-100000@chello213047234135.telekabel.at>

On Thu, 17 Oct 2002, Neal Norwitz wrote:

> I guessed by replacing chop() with chomp() and this seemed to work.
> chop() removes the last char while chomp() removes whitespace.
> Is that correct?  Is removing whitespace what was desired?

chop removes the last character of a string and returns it.
  it is more efficient than s/// because it does not scan.

chomp removes any trailing string that corresponds to the current
   $INPUT_RECORD_SEPARATOR, cr lf or both.

when in doubt about removing dos,unix or mac eol i use something like
   $ln =~ s/[\x0a\x0d]*$//;
but this fails the save every microsecond requirement, as does chomp.
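The three behaviours described above can be approximated in Python as follows.  This is a sketch for illustration only: it assumes the record separator is a single "\n", and the function names simply mirror the Perl ones.

```python
def chop(s):
    """Like Perl's chop: remove the last character, whatever it is."""
    return s[:-1]


def chomp(s, separator="\n"):
    """Like Perl's chomp: remove one trailing record separator, if present."""
    if separator and s.endswith(separator):
        return s[:-len(separator)]
    return s


def strip_eol(s):
    """Like the substitution above: remove any run of trailing CR/LF."""
    return s.rstrip("\r\n")


for line in ["node87.html", "node77.html\n", "line\r\n"]:
    print(repr(chop(line)), repr(chomp(line)), repr(strip_eol(line)))
```

Only `strip_eol()` copes with all three end-of-line conventions, and only `chop()` damages a string that has no line ending at all.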


--
engelbert gruber
email: engelbert.gruber@ssg.co.at


From fdrake@acm.org  Mon Oct 21 18:38:21 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 21 Oct 2002 13:38:21 -0400
Subject: [Doc-SIG] More broken \ref links
In-Reply-To: <20021018161149.GE21485@epoch.metaslash.com>
References: <20021018004440.GB21485@epoch.metaslash.com>
 <15791.24014.651944.165446@grendel.zope.com>
 <20021018021402.GC21485@epoch.metaslash.com>
 <15792.12444.286320.744394@grendel.zope.com>
 <20021018161149.GE21485@epoch.metaslash.com>
Message-ID: <15796.15245.95363.453320@grendel.zope.com>

Neal Norwitz writes:
 > (/home/neal/build/python/dist/src/Doc/texinputs/boilerplate.tex) (tut.aux)
 > Runaway argument?
 > {\contentsline {chapter}{\numberline {9}Cl 
 > ! File ended while scanning use of \@writefile.
 > <inserted text> 
 >                 \par 
 > l.14 \begin{document}
 >                      
 > ? 
 > ! Emergency stop.
 > <inserted text> 
 >                 \par 
 > l.14 \begin{document}
 >                      
 > No pages of output.
 > Transcript written on tut.log.
 > *** Session transcript and error messages are in /home/neal/build/python/dist/src/Doc/html/tut/tut.how.
 > *** Exited with status 1.

Very strange; did you have a LaTeX run that you interrupted with ^C or
similar?  If there's a tut.toc that ends with that "Cl" shown in the
message (which I think could be truncated in presentation), then
there's definitely a damaged intermediate file.  I get a slightly
different error (though very similar) if I create a tut.toc that
exhibits that symptom directly.

If there is a tut.toc file present, try removing it and running
mkhowto again.
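A quick way to spot this kind of damage is to look for lines whose braces do not balance, since a line cut off in the middle of an entry leaves a group open.  The check below is a heuristic sketch, not part of mkhowto: it ignores TeX comments and entries that legitimately span several lines, and the first sample line is invented for the example.

```python
def unbalanced_lines(text):
    """Return (line number, line) pairs whose braces do not balance."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        depth = 0
        escaped = False
        for ch in line:
            if escaped:
                escaped = False      # character after a backslash is literal
            elif ch == "\\":
                escaped = True
            elif ch == "{":
                depth += 1
            elif ch == "}":
                depth -= 1
        if depth != 0:
            problems.append((number, line))
    return problems


sample = "\n".join([
    r"\contentsline {chapter}{\numberline {1}Intro}{1}",   # made-up good entry
    r"\contentsline {chapter}{\numberline {9}Cl",          # as in the error log
])
print(unbalanced_lines(sample))
```

If a generated file is flagged, removing it and rebuilding is the fix suggested above.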


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From neal@metaslash.com  Mon Oct 21 23:23:02 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Mon, 21 Oct 2002 18:23:02 -0400
Subject: [Doc-SIG] More broken \ref links
In-Reply-To: <15796.15245.95363.453320@grendel.zope.com>
References: <20021018004440.GB21485@epoch.metaslash.com>
 <15791.24014.651944.165446@grendel.zope.com>
 <20021018021402.GC21485@epoch.metaslash.com>
 <15792.12444.286320.744394@grendel.zope.com>
 <20021018161149.GE21485@epoch.metaslash.com>
 <15796.15245.95363.453320@grendel.zope.com>
Message-ID: <20021021222302.GI21485@epoch.metaslash.com>

On Mon, Oct 21, 2002 at 01:38:21PM -0400, Fred L. Drake, Jr. wrote:
> 
> Neal Norwitz writes:
>  > (/home/neal/build/python/dist/src/Doc/texinputs/boilerplate.tex) (tut.aux)
>  > Runaway argument?
>  > {\contentsline {chapter}{\numberline {9}Cl 
>  > ! File ended while scanning use of \@writefile.
>  > <inserted text> 
>  >                 \par 
>  > l.14 \begin{document}
>  >                      
>  > ? 
>  > ! Emergency stop.
>  > <inserted text> 
>  >                 \par 
>  > l.14 \begin{document}
>  >                      
>  > No pages of output.
>  > Transcript written on tut.log.
>  > *** Session transcript and error messages are in /home/neal/build/python/dist/src/Doc/html/tut/tut.how.
>  > *** Exited with status 1.
> 
> Very strange; did you have a LaTeX run that you interrupted with ^C or
> similar?  If there's a tut.toc that ends with that "Cl" shown in the
> message (which I think could be truncated in presentation), then
> there's definitely a damaged intermediate file.  I get a slightly
> different error (though very similar) if I create a tut.toc that
> exhibits that symptom directly.

This appears to be the case.

> If there is a tut.toc file present, try removing it and running
> mkhowto again.

I did a 'make clean ; make' and everything looks fine now.

Thanks!

Neal


From fdrake@acm.org  Tue Oct 22 16:32:27 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 22 Oct 2002 11:32:27 -0400
Subject: [Doc-SIG] More broken \ref links
In-Reply-To: <20021021222302.GI21485@epoch.metaslash.com>
References: <20021018004440.GB21485@epoch.metaslash.com>
 <15791.24014.651944.165446@grendel.zope.com>
 <20021018021402.GC21485@epoch.metaslash.com>
 <15792.12444.286320.744394@grendel.zope.com>
 <20021018161149.GE21485@epoch.metaslash.com>
 <15796.15245.95363.453320@grendel.zope.com>
 <20021021222302.GI21485@epoch.metaslash.com>
Message-ID: <15797.28555.100218.154116@grendel.zope.com>

Neal Norwitz writes:
 > This appears to be the case.
...
 > I did a 'make clean ; make' and everything looks fine now.

Glad I could help!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From mwh@python.net  Wed Oct 23 13:09:54 2002
From: mwh@python.net (Michael Hudson)
Date: 23 Oct 2002 13:09:54 +0100
Subject: [Doc-SIG] simple use of docutils
In-Reply-To: David Goodger's message of "Fri, 18 Oct 2002 19:53:17 -0400"
References: <B9D6172C.2A9ED%goodger@users.sourceforge.net>
Message-ID: <2mptu1icfh.fsf@starship.python.net>

David Goodger <goodger@users.sourceforge.net> writes:

> ...

I (still) haven't had time to fiddle with the library front end, but
now I've had a chance to play with the backend, I'm impressed.  I
managed to adapt the html4css1 writer in what I thought would be
complicated ways very easily.  Nice one!

I might package up the code I have -- maybe an example? -- but at the
moment it's embarrassing.

Cheers,
M.

-- 
  MARVIN:  What a depressingly stupid machine.
                    -- The Hitch-Hikers Guide to the Galaxy, Episode 7


From aahz@pythoncraft.com  Wed Oct 23 13:59:36 2002
From: aahz@pythoncraft.com (Aahz)
Date: Wed, 23 Oct 2002 08:59:36 -0400
Subject: [Doc-SIG] simple use of docutils
In-Reply-To: <2mptu1icfh.fsf@starship.python.net>
References: <B9D6172C.2A9ED%goodger@users.sourceforge.net> <2mptu1icfh.fsf@starship.python.net>
Message-ID: <20021023125936.GA13309@panix.com>

On Wed, Oct 23, 2002, Michael Hudson wrote:
>
> I might package up the code I have -- maybe an example? -- but at the
> moment it's embarrassing.

Just go ahead and stick it in the sandbox after David gives you commit
privileges -- I did, and my code is worse than embarrassing.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/


From goodger@users.sourceforge.net  Thu Oct 24 02:46:01 2002
From: goodger@users.sourceforge.net (David Goodger)
Date: Wed, 23 Oct 2002 21:46:01 -0400
Subject: [Doc-SIG] Docutils transforms overhauled
Message-ID: <B9DCC917.2AD04%goodger@users.sourceforge.net>

I just checked in a major change to the Docutils transform handling
mechanism.  There's a new component, called "Transformer", which
centralizes and simplifies the transform handling that used to be
distributed between Reader and Writer objects.  I brought PEP 258 up
to date with this change and other recent changes.  Transformer
details can be found here:

    http://docutils.sf.net/spec/pep-0258.html#transformer
    http://docutils.sf.net/spec/transforms.html

The change simplifies the project model considerably::

                     +---------------------------+
                     |        Docutils:          |
                     | docutils.core.Publisher,  |
                     | docutils.core.publish_*() |
                     +---------------------------+
                      /            |            \
                     /             |             \
            1,3,5   /        6     |              \ 7
           +--------+       +-------------+       +--------+
           | READER | ----> | TRANSFORMER | ====> | WRITER |
           +--------+       +-------------+       +--------+
            /     \\                                  |
           /       \\                                 |
     2    /      4  \\                             8  |
    +-------+   +--------+                        +--------+
    | INPUT |   | PARSER |                        | OUTPUT |
    +-------+   +--------+                        +--------+

This change is internal and shouldn't have an impact on front ends or
client code.  If it breaks any code, please let me know.
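The idea of centralising transform handling can be pictured with a toy model.  Everything below is hypothetical and written only for illustration: the class names, the numeric priorities, and the use of a list of strings as a stand-in for a document tree are assumptions, not the Docutils source.  The one detail taken from the recent API changes is that a transform's method is called "apply".

```python
class Transform:
    """Base class for this sketch; real transforms work on a document tree."""
    priority = 500  # assumed rule: lower numbers run first

    def apply(self, document):
        raise NotImplementedError


class StripBlankLines(Transform):
    priority = 100

    def apply(self, document):
        document[:] = [line for line in document if line.strip()]


class NumberLines(Transform):
    priority = 900

    def apply(self, document):
        document[:] = [f"{i}. {line}" for i, line in enumerate(document, 1)]


class Transformer:
    """Collects transforms from every component and applies them in one place."""

    def __init__(self):
        self.transforms = []

    def add_transforms(self, transform_classes):
        self.transforms.extend(transform_classes)

    def apply_transforms(self, document):
        for cls in sorted(self.transforms, key=lambda c: c.priority):
            cls().apply(document)
        return document


transformer = Transformer()
transformer.add_transforms([NumberLines])      # e.g. requested by a writer
transformer.add_transforms([StripBlankLines])  # e.g. requested by a reader
result = transformer.apply_transforms(["alpha", "", "beta"])
print(result)
```

Because ordering is decided in one place, the order in which components register their transforms does not matter.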

The latest snapshot is always available from:

    http://docutils.sf.net/docutils-snapshot.tgz

-- 
David Goodger  <goodger@users.sourceforge.net>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/


From goodger@users.sourceforge.net  Thu Oct 24 02:50:50 2002
From: goodger@users.sourceforge.net (David Goodger)
Date: Wed, 23 Oct 2002 21:50:50 -0400
Subject: [Doc-SIG] simple use of docutils
In-Reply-To: <2mptu1icfh.fsf@starship.python.net>
Message-ID: <B9DCCA39.2AD06%goodger@users.sourceforge.net>

Michael Hudson wrote:
> I (still) haven't had time to fiddle with the library front end, but
> now I've had a chance to play with the backend, I'm impressed.  I
> managed to adapt the html4css1 writer in what I thought would be
> complicated ways very easily.  Nice one!

Glad you like it. ;-)

> I might package up the code I have -- maybe an example? -- but at the
> moment it's embarrassing.

That's what the sandbox is for!

-- 
David Goodger  <goodger@users.sourceforge.net>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/


From alt@artisan.com  Wed Oct 30 05:34:48 2002
From: alt@artisan.com (Albert Ting)
Date: Tue, 29 Oct 2002 21:34:48 -0800
Subject: [Doc-SIG] controlling sections levels
Message-ID: <15807.28536.556186.26709@lassen.artisan.com>

How does one control the number of html levels to output?  html.py defaults
to 6, but if I use the core.publish_string() function, I'm only getting 3.

Also, is there a way to auto number the titles/chapters/sections?  

In any case, this is a nice tool, I'm finding it more useful than the
Zope:StructuredText 

Thanks,
Albert Ting


From grubert@users.sourceforge.net  Wed Oct 30 08:38:53 2002
From: grubert@users.sourceforge.net (grubert@users.sourceforge.net)
Date: Wed, 30 Oct 2002 09:38:53 +0100 (CET)
Subject: [Doc-SIG] controlling sections levels
In-Reply-To: <15807.28536.556186.26709@lassen.artisan.com>
Message-ID: <Pine.LNX.4.33.0210300937580.20758-100000@b52.b.ssg.co.at>

On Tue, 29 Oct 2002, Albert Ting wrote:

>
> How does one control the number of html levels to output?  html.py defaults
> to 6, but if I use the core.publish_string() function, I'm only getting 3.
>
> Also, is there a way to auto number the titles/chapters/sections?

use

.. section-numbering::

>
> In any case, this is a nice tool, I'm finding it more useful than the
> Zope:StructuredText

-- 
 BINGO: Über diese Frage müsste man genau nachdenken.
 --- Engelbert Gruber -------+
 SSG Fintl,Gruber,Lassnig   /
 A6410 Telfs Untermarkt 9  /
 Tel. ++43-5262-64727 ----+


From goodger@users.sourceforge.net  Wed Oct 30 13:19:13 2002
From: goodger@users.sourceforge.net (David Goodger)
Date: Wed, 30 Oct 2002 08:19:13 -0500
Subject: [Doc-SIG] controlling sections levels
In-Reply-To: <15807.28536.556186.26709@lassen.artisan.com>
Message-ID: <B9E54680.2AFA1%goodger@users.sourceforge.net>

Albert Ting wrote:
> How does one control the number of html levels to output?  html.py defaults
> to 6, but if I use the core.publish_string() function, I'm only getting 3.

It depends on how many levels of sections you have in your document.  You'll
need 6 section levels, plus document title (and maybe subtitle) to get all
the way to H6.  See http://docutils.sf.net/FAQ.html#html-writer .

> Also, is there a way to auto number the titles/chapters/sections?

See 
http://docutils.sf.net/spec/rst/directives.html#automatic-section-numbering

-- 
David Goodger  <goodger@users.sourceforge.net>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/


From alt@artisan.com  Wed Oct 30 20:04:34 2002
From: alt@artisan.com (Albert Ting)
Date: Wed, 30 Oct 2002 12:04:34 -0800
Subject: [Doc-SIG] controlling sections levels
References: <15807.28536.556186.26709@lassen.artisan.com>
 <B9E54680.2AFA1%goodger@users.sourceforge.net>
Message-ID: <15808.15186.324167.572838@lassen.artisan.com>

> See 
> http://docutils.sf.net/spec/rst/directives.html#automatic-section-numbering

Thanks for the tidbits, what a great idea regarding these directives.
I also understand how to get the headers to work.  

But is there a way to specify the command line options in the directives as
well?  I'd like to turn off the stylesheets (or specify my own).  But since
I'm using core.publish_string(), I don't know how to set the command line
switches?

I'm currently pre-processing a file into a reStructuredText format, then
call core.publish_string().  There probably is a better way, by writing my
own reader, but not sure how Reader class is used.

Thanks,
Albert


From alt@artisan.com  Thu Oct 31 00:19:08 2002
From: alt@artisan.com (Albert Ting)
Date: Wed, 30 Oct 2002 16:19:08 -0800
Subject: [Doc-SIG] controlling sections levels
References: <15807.28536.556186.26709@lassen.artisan.com>
 <B9E54680.2AFA1%goodger@users.sourceforge.net>
 <15808.15186.324167.572838@lassen.artisan.com>
Message-ID: <15808.30460.921636.235689@lassen.artisan.com>

> But is there a way to specify the command line options in the directives as
> well?  I'd like to turn off the stylesheets (or specify my own).  But since
> I'm using core.publish_string(), I don't know how to set the command line
> switches?

Never mind.  I figured out I can specify my own values via the
settings_overrides param.
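A minimal sketch of that approach follows, assuming a reasonably recent Docutils is installed.  The setting names used here ("output_encoding", "embed_stylesheet") are those of current releases and may differ from the 2002 snapshot; the import is guarded only so that the snippet does nothing where Docutils is missing.

```python
try:
    from docutils.core import publish_string
except ImportError:  # Docutils not installed; the sketch then does nothing
    publish_string = None

SOURCE = """\
Demo
====

Some *emphasised* text.
"""


def render(source):
    """Return an HTML page as str, or None when Docutils is unavailable."""
    if publish_string is None:
        return None
    overrides = {
        "output_encoding": "unicode",   # return str instead of encoded bytes
        "embed_stylesheet": False,      # link to the stylesheet, do not inline it
    }
    return publish_string(source, writer_name="html",
                          settings_overrides=overrides)


html = render(SOURCE)
print(html is None or "Demo" in html)
```

Any runtime setting that a command-line switch would set can be passed the same way, keyed by its setting name.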

Thanks,
Albert


From goodger@users.sourceforge.net  Thu Oct 31 02:52:15 2002
From: goodger@users.sourceforge.net (David Goodger)
Date: Wed, 30 Oct 2002 21:52:15 -0500
Subject: [Doc-SIG] controlling sections levels
In-Reply-To: <15808.15186.324167.572838@lassen.artisan.com>
Message-ID: <B9E6050E.2AFBC%goodger@users.sourceforge.net>

Albert Ting wrote:
> I'm currently pre-processing a file into a reStructuredText format,
> then call core.publish_string().  There probably is a better way, by
> writing my own reader, but not sure how Reader class is used.

If you're not trying to do anything fancy, core.publish_string()
should be fine.  It's when you need to do special processing that you
need a custom Reader; see
http://docutils.sf.net/spec/pep-0258.html#readers for an overview.
There have been some discussions of Reader classes on the
docutils-develop list lately
(http://lists.sf.net/lists/listinfo/docutils-develop).  If you
describe your goals, I can advise you if you need a Reader.

And FYI, all the Docutils runtime settings are listed in
http://docutils.sf.net/docs/tools.html#configuration-file-entries .

-- 
David Goodger  <goodger@users.sourceforge.net>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/


From edloper@gradient.cis.upenn.edu  Thu Oct 31 03:01:38 2002
From: edloper@gradient.cis.upenn.edu (Edward Loper)
Date: Wed, 30 Oct 2002 22:01:38 -0500
Subject: [Doc-SIG] epydoc 1.1 release
Message-ID: <3DC09D12.9060804@gradient.cis.upenn.edu>

I just released version 1.1 of epydoc.  Epydoc is a tool for generating
API documentation for Python modules, based on their docstrings.

     <http://epydoc.sourceforge.net/>

A lightweight markup language called epytext can be used to format
docstrings, and to add information about specific fields, such as
parameters and instance variables.

For some examples of the documentation generated by epydoc, see:

   - The API documentation for epydoc.
     <http://epydoc.sourceforge.net/api/>

   - The API documentation for the Python 2.2 standard library.
     <http://epydoc.sourceforge.net/stdlib/>

   - The API documentation for NLTK, the natural language toolkit.
     <http://nltk.sourceforge.net/ref/>

New features added since 1.0 include:
   - A frames-based table of contents
   - Documentation for builtin objects
   - Documentation for types
   - Improved navigation bars
   - Improved warning messages
   - Better documentation for variables
   - An identifier index
   - An improved graphical interface
   - Man pages

A complete list of new features and the change log are available at:

   <http://sourceforge.net/project/shownotes.php?release_id=119576>

-Edward


From guido@python.org  Thu Oct 31 03:02:41 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 30 Oct 2002 22:02:41 -0500
Subject: [Doc-SIG] epydoc 1.1 release
In-Reply-To: Your message of "Wed, 30 Oct 2002 22:01:38 EST."
 <3DC09D12.9060804@gradient.cis.upenn.edu>
References: <3DC09D12.9060804@gradient.cis.upenn.edu>
Message-ID: <200210310302.g9V32fK22735@pcp02138704pcs.reston01.va.comcast.net>

> I just released version 1.1 of epydoc.  Epydoc is a tool for generating
> API documentation for Python modules, based on their docstrings.
> 
>      <http://epydoc.sourceforge.net/>

Would you mind comparing epydoc to the standard pydoc.py?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From edloper@gradient.cis.upenn.edu  Thu Oct 31 04:21:28 2002
From: edloper@gradient.cis.upenn.edu (Edward Loper)
Date: Wed, 30 Oct 2002 23:21:28 -0500
Subject: [Doc-SIG] epydoc 1.1 release
References: <3DC09D12.9060804@gradient.cis.upenn.edu> <200210310302.g9V32fK22735@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3DC0AFC8.4030901@gradient.cis.upenn.edu>

Guido van Rossum wrote:
> Would you mind comparing epydoc to the standard pydoc.py?

I compared epydoc to a number of other projects (pydoc is #1) at:

     <http://epydoc.sourceforge.net/relatedprojects.html>

But the short answer is that I see 3 main differences between epydoc
and pydoc:

   1. Epydoc produces html output that looks more professional and
      is easier to read and navigate.
   2. Epydoc supports the epytext markup language, which can be used to
      format docstrings, and to add information about specific fields,
      such as parameters and instance variables.  (But note that use of
      epytext is not required.)
   3. Pydoc provides excellent command-line (man-page style) and
      interpreter (pydoc.help) interfaces.

Pydoc is a great tool, and I would *not* advocate replacing it with
epydoc.  (Although I certainly wouldn't object to epydoc getting added
to the standard library, if enough people like it.  :)  Alternatively,
I wouldn't object to pydoc and epydoc getting merged, but it would be
quite an undertaking, because they're fairly different under the
hood.)

To be more specific, these are some of the important differences I see
between epydoc and pydoc:

- The output produced by epydoc is easier to read and navigate (in
   my opinion, anyway).
       - Epydoc produces a frames-based table of contents.
       - Epydoc provides a "show/hide private" toggle button.
       - Epydoc creates a "trees" page with class & module
         hierarchies.
       - Epydoc creates an "index" page with term & identifier
         indices.
       - Epydoc includes a "help" page.
       - Epydoc includes a "breadcrumbs list" in the navigation bar,
         with pointers to the containing classes/modules/packages.
       - The navigation bar includes links to the "top" page and
         to the project's homepage.
       - Epydoc documents each class on its own page.
       - Epydoc uses external css stylesheets to allow for more
         customizable output.
       - Functions, methods, and variables are described with
         a shorter summary table and a longer details list.
       - Epydoc parses builtin function signatures.
       - Variable details includes variable type, optional description,
         and colorized value.
       - Lists of known subclasses, base class trees, etc.
       - Classes are divided into normal classes and exceptions.
       - Pydoc's layout wastes a lot of horizontal space.

- Epydoc supports use of the epytext markup language.
       - Epytext can be used to document parameters, variables, etc.
       - Inline markup can be used to mark italics, bold, monospace,
         documentation links, URLs, index terms, etc.
       - Epytext can be used to create lists, sections, and literal
         blocks.
       - Epytext colorizes doctest blocks.

- Epydoc can be used to check documentation completeness.

- Epydoc has a graphical interface (e.g. for windows users).

- Epydoc is fairly robust (e.g., it can document Zope 3).  I haven't
   actually tested the robustness of pydoc, though.

- Epydoc inherits documentation for undocumented methods whose
   signatures match the base class method.  (This can be disabled by
   adding a blank docstring to the undocumented method).

- Some advantages of pydoc are:
       - It provides links to the source code for each module.
       - It can be used from the command-line to view manpage-like
         docs.
       - It can be used from within python (pydoc.help)
       - It automatically creates intra-documentation links (you
         might see this as a positive or a negative, since it
         sometimes creates links where there shouldn't be links;
         epydoc is more conservative, and will only create links
         if you tell it to, with epytext markup).
       - It currently has better support for python 2.2-style
         types (with wrapper_descriptors, etc.).
       - It does some processing of comments.  Epydoc just uses
         docstrings.
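
To make the docstring-inheritance rule in the list above concrete, here
is a rough sketch of the idea.  It is not epydoc's code: the function
name effective_doc and the example classes are invented, and it uses
inspect.signature from current Python to compare signatures.

```python
import inspect


def effective_doc(cls, name):
    """Return the docstring to display for the method cls.name.

    A method with no docstring inherits one from the nearest base class
    method with the same signature.  A blank docstring ("") counts as
    documented, which switches inheritance off.
    """
    method = cls.__dict__[name]
    if method.__doc__ is not None:
        return method.__doc__
    for base in cls.__mro__[1:]:
        inherited = base.__dict__.get(name)
        if not inspect.isfunction(inherited) or not inherited.__doc__:
            continue
        if inspect.signature(inherited) == inspect.signature(method):
            return inherited.__doc__
    return None


class Shape:
    def area(self, scale=1):
        """Return the area, multiplied by scale."""

    def label(self):
        """Return a display label."""


class Square(Shape):
    def area(self, scale=1):      # same signature: inherits the docstring
        return 0

    def label(self, upper):       # different signature: stays undocumented
        return ""


if __name__ == "__main__":
    print(effective_doc(Square, "area"))
    print(effective_doc(Square, "label"))
```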

I think that the best way to see the differences between their output is 
to navigate around the docs produced by each tool for the same code. 
The docs for the Python 2.2 standard library, from each tool, are at:

    pydoc: <http://web.pydoc.org/2.2/>
   epydoc: <http://epydoc.sourceforge.net/stdlib/>

Also, you might take a look at some of the docstrings written using 
epytext (e.g., see the source code for epydoc itself), and the 
documentation produced by those docstrings; or you could just look at "A 
Brief Introduction to Epytext" for a quick example:

     <http://epydoc.sourceforge.net/epytextintro.html>

-Edward


From python-doc@zesty.ca  Thu Oct 31 05:48:31 2002
From: python-doc@zesty.ca (Ka-Ping Yee)
Date: Wed, 30 Oct 2002 21:48:31 -0800 (PST)
Subject: [Doc-SIG] Re: epydoc 1.1 release
In-Reply-To: <3DC0AFC8.4030901@gradient.cis.upenn.edu>
Message-ID: <Pine.LNX.4.44.0210302116200.1037-100000@ziggy>

Edward Loper wrote:
> I think that the best way to see the differences between their output is
> to navigate around the docs produced by each tool for the same code.
> The docs for the Python 2.2 standard library, from each tool, are at:
>
>     pydoc: <http://web.pydoc.org/2.2/>
>    epydoc: <http://epydoc.sourceforge.net/stdlib/>

Nice work, Edward!  The output from epydoc is very beautiful.  The
pages produced by pydoc could be improved, though i think the major
differences between them come from a difference in design intent.
Put another way, epydoc and pydoc try to satisfy different constraints.

Here are pydoc's constraints so you can see what it was trying to achieve:

    (a) It tries to stick to "one module -> one file".
    (b) It tries not to present the same information twice.
    It tries to minimize dependencies...
        (c) on auxiliary files
        (d) on browsers
        (e) on code formatting

You may or may not agree that these constraints were good choices;
perhaps they seem extreme to you.  (There's a general philosophy of
minimalism at work here: i wanted the viewer to be able to see a lot
as quickly as possible.  It's also nice to be able to update the
doc file for a single module when you edit it, without having to
regenerate everything.)  epydoc relaxes some of these constraints,
and capitalizes on them to provide more functionality.
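
As an illustration of constraint (a), "one module -> one file", the
sketch below builds a single self-contained HTML page for one module
using the formatter object that pydoc exposes as pydoc.html.  The
helper name render_module_page is made up, and the calls are those of
the pydoc shipped with current Python, which may differ from the 2.2
version.

```python
import importlib
import pydoc


def render_module_page(module_name):
    """Return one complete HTML page documenting a single module."""
    module = importlib.import_module(module_name)
    # Body: pydoc's HTML rendering of the module and its contents.
    body = pydoc.html.document(module, module.__name__)
    # Wrap the body in a full page; the title is e.g. "module colorsys".
    return pydoc.html.page(pydoc.describe(module), body)


if __name__ == "__main__":
    page = render_module_page("colorsys")
    with open("colorsys.html", "w", encoding="utf-8") as out:
        out.write(page)
```

Because the page depends on nothing but the module itself, it can be
regenerated alone whenever that module is edited.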

So, to revisit Edward's list of differences:

>       1. Epydoc produces a frames-based table of contents.
>       2. Epydoc provides a "show/hide private" toggle button.
>       3. Epydoc creates a "trees" page with class & module
>          hierarchies.
>       4. Epydoc creates an "index" page with term & identifier
>          indices.
>       5. Epydoc includes a "help" page.
>       6. Epydoc includes a "breadcrumbs list" in the navigation bar,
>          with pointers to the containing classes/modules/packages.
>       7. The navigation bar includes links to the "top" page and
>          to the project's homepage.
>       8. Epydoc documents each class on its own page.
>       9. Epydoc uses external css stylesheets to allow for more
>          customizable output.
>      10. Functions, methods, and variables are described with
>          a shorter summary table and a longer details list.
>      11. Epydoc parses builtin function signatures.
>      12. Variable details includes variable type, optional description,
>          and colorized value.
>      13. Lists of known subclasses, base class trees, etc.
>      14. Classes are divided into normal classes and exceptions.
>      15. Pydoc's layout wastes a lot of horizontal space.
[...]
>      16. Epydoc supports use of the epytext markup language.

Some of these can be explained in terms of the differing constraints;
others are just deficiencies (missing features in pydoc).

Why doesn't pydoc do...

     1? Because of (d).
     2? Because of (a).
     3? Instead of one big page of trees, pydoc has little class
        trees on each module's page.
     4? Missing feature.
     5? Missing feature / didn't think it would be necessary.
     6? pydoc does do this (breadcrumb links are in the header bar).
     7? pydoc does do this (index link is in the header bar).
     8? Because of (a).
     9? Because of (c).
    10? Because of (b).
    11? Missing feature / didn't know there was an established convention.
    12? (e): didn't want to impose a standard for describing variables.
        In most cases the type is redundant (the type is evident from
        the repr) and pydoc tends to be minimal about the use of space.
    13? Missing feature.
    14? Missing feature.
    15? The bars on the left were intentionally placed there to provide
        context (as you scroll down a long page, it may not be visible
        what section you're in).  You could say they're too fat though.
    16? Because of (e).


-- ?!ng


From edloper@gradient.cis.upenn.edu  Thu Oct 31 06:48:49 2002
From: edloper@gradient.cis.upenn.edu (Edward Loper)
Date: Thu, 31 Oct 2002 01:48:49 -0500
Subject: [Doc-SIG] Re: epydoc 1.1 release
References: <Pine.LNX.4.44.0210302116200.1037-100000@ziggy>
Message-ID: <3DC0D251.2090809@gradient.cis.upenn.edu>

Ka-Ping Yee wrote:
> Here are pydoc's constraints so you can see what it was trying to achieve:
> 
>     (a) It tries to stick to "one module -> one file".
>     (b) It tries not to present the same information twice.
>     It tries to minimize dependencies...
>         (c) on auxiliary files
>         (d) on browsers
>         (e) on code formatting

Just for comparison, some of epydoc's constraints were:

   - pretty/easy-to-navigate html output.
   - support for documenting "fields" (parameters, variables, etc).
   - a markup language that is very simple and clean, and has no hidden
     "gotcha" cases.
   - a markup language that is powerful enough for most people's needs
     when writing API docs.  (Well, at least for *my* needs :) )
   - robustness.
   - minimized dependencies on browsers (note that epydoc output
     looks quite good under text browsers like links, and old versions
     of netscape/ie).
   - maximized information density (though perhaps not as strongly
     maximized as it is for pydoc).

> You may or may not agree that these constraints were good choices;
> perhaps they seem extreme to you.  (There's a general philosophy of
> minimalism at work here: i wanted the viewer to be able to see a lot
> as quickly as possible.  

I can appreciate minimalism, and I stand by my statement that pydoc is a 
great tool that fills a very useful niche.  I use it for its 
manpage-style output all the time.

 >>      1. Epydoc produces a frames-based table of contents.
 >      1? Because of (d). [d=no dependency on browsers]

But note that the use of frames is totally optional for the viewer.

>>      3. Epydoc creates a "trees" page with class & module
>>         hierarchies.
 >      3? Instead of one big page of trees, pydoc has little class
 >         trees on each module's page.

This makes it harder to see how classes defined in different modules 
relate to each other.

 >>       2. Epydoc provides a "show/hide private" toggle button.
 >      2? Because of (a). [a=one module/file]
 >>      8. Epydoc documents each class on its own page.
 >      8? Because of (a). [a=one module/file]

What's the reasoning behind the one module/file criterion?  I decided to 
put each class and method on its own page, because they seemed to be 
about the right sized conceptual "chunk."  Also, this means that the 
"nesting" of objects on any given page is just 1-deep (modules->vars, 
modules->classes, classes->methods, classes->vars, etc.), whereas one 
module/file gives 2-deep nesting (modules->classes->methods, etc).

 >>      9. Epydoc uses external css stylesheets to allow for more
 >>         customizable output.
 >      9? Because of (c). [c=no dependence on auxiliary files]

The stylesheet can be safely ignored, and the pages still come out 
looking pretty nice.  Is the reasoning behind this that you want to be 
able to grab a single html file by itself, and copy it somewhere?  This 
suggests that one difference between pydoc and epydoc is that I think of 
the set of docs created by epydoc as a single coherent whole (that 
shouldn't ever really be split up), whereas it seems like you think of 
the docs created by pydoc as a set of related but independent files.

 >>     10. Functions, methods, and variables are described with
 >>         a shorter summary table and a longer details list.
 >     10? Because of (b). [b=no repetition of information]

That seems pretty reasonable, but if the docstrings get long, it can 
make it hard to scan through and quickly see what a module/class provides.

 >>     11. Epydoc parses builtin function signatures.
 >     11? Missing feature / didn't know there was an established
 >         convention.

I seem to remember seeing a convention written in the python style guide 
somewhere that builtin functions should start with a 1-line signature 
(since the signature can't be divined via inspection).  This convention 
is certainly followed by __builtin__, sys, os, os.path, etc.

Feel free to rip out my algorithm and adapt it to your own code.  It's 
in epydoc.objdoc.FuncDoc._init_builtin_signature, on line 1313 of 
epydoc/objdoc.py.  It currently handles just about everything except for 
"zip" which I argue doesn't quite follow the normal conventions:

     zip(seq1 [, seq2 [...]]) -> [(seq1[0], seq2[0] ...), (...)]

(I think my algorithm would recognize it with another comma either 
before or after the inner "[".)
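
For readers who just want the flavour of the convention, here is a much
simpler sketch than the algorithm mentioned above.  It only splits the
first docstring line into name, argument text and return text, and does
not try to interpret the bracketed optional arguments.  The function
name parse_signature_line and the regular expression are invented for
this example and are not epydoc's code.

```python
import re

# name(arguments) optionally followed by "-> return description"
_SIGNATURE = re.compile(
    r"^\s*([A-Za-z_][\w.]*)\s*\((.*)\)\s*(?:->\s*(.*?))?\s*$"
)


def parse_signature_line(docstring):
    """Split a leading one-line signature into (name, args, returns).

    Returns None when the first line does not look like a signature.
    The returns element is None when there is no "->" part.
    """
    lines = (docstring or "").strip().splitlines()
    if not lines:
        return None
    match = _SIGNATURE.match(lines[0])
    if match is None:
        return None
    return match.group(1), match.group(2).strip(), match.group(3)


if __name__ == "__main__":
    print(parse_signature_line(
        "getattr(object, name[, default]) -> value\n\nGet a named attribute."
    ))
    print(parse_signature_line(
        "zip(seq1 [, seq2 [...]]) -> [(seq1[0], seq2[0] ...), (...)]"
    ))
```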

On the subject of ripping code from epydoc, there's other code that you 
might want to rip for inspect.py.  E.g., see 
epydoc.uid._find_builtin_obj_module and 
epydoc.uid._find_function_module, which are more robust than the 
corresponding functionality provided by inspect.py.

>>     12. Variable details includes variable type, optional description,
>>         and colorized value.
 >     12? (e): didn't want to impose a standard for describing
 >         variables. [e=minimize dependencies on code formatting]

I think this falls under the category of things you can do with fields, 
if you want to allow fields (which pydoc doesn't, for reasonable reasons).

 >         In most cases the type is redundant (the type is evident from
 >         the repr)

I think that type info can be pretty useful in some cases -- it's not 
always apparent from the repr.  Also, this lets me provide a link to the 
type, when it's a class.

 >         and pydoc tends to be minimal about the use of
 >         space.

I find epydoc's representation of variables much easier to read 
(multiline strings, colorized regexps, etc), but there's certainly no 
question that pydoc's representation is more compact. :)

>>     15. Pydoc's layout wastes a lot of horizontal space.
 >     15? The bars on the left were intentionally placed there to
 >         provide context (as you scroll down a long page, it may not
 >         be visible what section you're in).  You could say they're
 >         too fat though.

Yeah, I think they're too fat.  And when viewing the docs in text 
browsers, they're just dead space.

 >>     16. Epydoc supports use of the epytext markup language.
 >     16? Because of (e). [e=no dependency on code formatting]

I think that this is a significant difference in goals for the two 
projects.  But as I said on my related projects page, I think this may 
be one of the reasons that pydoc was able to become widely accepted.  Of 
course, epydoc will treat all docstrings as plaintext if you tell it to. 
  (Well, to be precise, if you use "--docformat plaintext", then the 
format for docstrings will default to plaintext, unless overridden on a 
per-module basis by the __docformat__ variable.)
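
The precedence described in the parenthetical can be sketched in a few
lines.  This is an illustration only, not epydoc's implementation: the
function name effective_docformat is hypothetical, and keeping only the
first word of values such as "plaintext en" is an assumption borrowed
from the PEP 258 style of __docformat__.

```python
import types


def effective_docformat(module, default="epytext"):
    """Pick the docstring format for a module.

    A module-level __docformat__ wins; otherwise the tool-wide default
    (what a "--docformat" switch would set) is used.
    """
    declared = getattr(module, "__docformat__", None)
    if not declared:
        return default
    # Assumption: ignore anything after the format name.
    return declared.split()[0].lower()


if __name__ == "__main__":
    plain = types.ModuleType("plain_module")
    plain.__docformat__ = "plaintext"
    unmarked = types.ModuleType("unmarked_module")
    print(effective_docformat(plain))
    print(effective_docformat(unmarked, default="plaintext"))
```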

-Edward


From python-doc@zesty.ca  Thu Oct 31 07:15:45 2002
From: python-doc@zesty.ca (Ka-Ping Yee)
Date: Wed, 30 Oct 2002 23:15:45 -0800 (PST)
Subject: [Doc-SIG] Re: epydoc 1.1 release
In-Reply-To: <3DC0D251.2090809@gradient.cis.upenn.edu>
Message-ID: <Pine.LNX.4.44.0210302307050.1037-100000@ziggy>

On Thu, 31 Oct 2002, Edward Loper wrote:
>    - a markup language that is very simple and clean, and has no hidden
>      "gotcha" cases.

I like the simplicity of epytext.

>  >>      1. Epydoc produces a frames-based table of contents.
>  >      1? Because of (d). [d=no dependency on browsers]
>
> But note that the use of frames is totally optional for the viewer.

Oh, i didn't realize that.  Well done.

> What's the reasoning behind the one module/file criterion?  I decided to
> put each class and method on its own page, because they seemed to be
> about the right sized conceptual "chunk."

I guess it just made sense to me at the time not to have too many files.
Navigating with the scroll bar is faster than loading a new page.  It
seemed convenient to have module-level functions and small utility
classes kept together with the classes that use them.  But i see good
arguments both ways; in the end it's just a judgement call.

> The stylesheet can be safely ignored, and the pages still come out
> looking pretty nice.  Is the reasoning behind this that you want to be
> able to grab a single html file by itself, and copy it somewhere?  This
> suggests that one difference between pydoc and epydoc is that I think of
> the set of docs created by epydoc as a single coherent whole (that
> shouldn't ever really be split up), whereas it seems like you think of
> the docs created by pydoc as a set of related but independent files.

Yeah, exactly.  I didn't want to deal with tracking dependencies among
the files to figure out what to update when a module was changed, and
it seemed wasteful to redo everything.  If i were to write pydoc today,
i'd probably use a stylesheet, though.  CSS support has improved a lot.

>  >>     10. Functions, methods, and variables are described with
>  >>         a shorter summary table and a longer details list.
>  >     10? Because of (b). [b=no repetition of information]
>
> That seems pretty reasonable, but if the docstrings get long, it can
> make it hard to scan through and quickly see what a module/class provides.

Yes, the summary tables are quite nice.


-- ?!ng


From alt@artisan.com  Thu Oct 31 08:31:00 2002
From: alt@artisan.com (Albert Ting)
Date: Thu, 31 Oct 2002 00:31:00 -0800
Subject: [Doc-SIG] controlling sections levels
References: <15808.15186.324167.572838@lassen.artisan.com>
 <B9E6050E.2AFBC%goodger@users.sourceforge.net>
Message-ID: <15808.59972.215198.336254@lassen.artisan.com>

David Goodger writes:
> From: David Goodger <goodger@users.sourceforge.net>
> To: Albert Ting <alt@artisan.com>, <doc-sig@python.org>
> Subject: Re: [Doc-SIG] controlling sections levels
> Date: Wed, 30 Oct 2002 21:52:15 -0500
> 
> Albert Ting wrote:
> > I'm currently pre-processing a file into a reStructuredText format,
> > then call core.publish_string().  There probably is a better way, by
> > writing my own reader, but not sure how Reader class is used.
> 
> If you're not trying to do anything fancy, core.publish_string()
> should be fine.  It's when you need to do special processing that you
> need a custom Reader; see
> http://docutils.sf.net/spec/pep-0258.html#readers for an overview.
> There have been some discussions of Reader classes on the
> docutils-develop list lately
> (http://lists.sf.net/lists/listinfo/docutils-develop).  If you
> describe your goals, I can advise you if you need a Reader.
> 
> And FYI, all the Docutils runtime settings are listed in
> http://docutils.sf.net/docs/tools.html#configuration-file-entries .
> 

What I did was write a Q&D cgi script that pre-processes an emacs-outline style
text file into the reStructuredText format.  
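
A rough idea of what such a pre-processing step could look like is
sketched below.  It is not the actual script: the function name
outline_to_rst, the choice of underline characters per level, and the
assumption that headings are lines of the form "* Title", "** Title"
are all made up for illustration.

```python
# One underline character per outline depth; deeper levels reuse the last.
ADORNMENTS = "=-~^"


def outline_to_rst(text):
    """Turn emacs-outline headings into reStructuredText section titles."""
    out = []
    for line in text.splitlines():
        stripped = line.lstrip("*")
        depth = len(line) - len(stripped)
        if depth and stripped.startswith(" "):
            title = stripped.strip()
            char = ADORNMENTS[min(depth, len(ADORNMENTS)) - 1]
            # Blank lines around the title keep the sections separated.
            out.extend(["", title, char * len(title), ""])
        else:
            out.append(line)
    return "\n".join(out).strip() + "\n"


if __name__ == "__main__":
    print(outline_to_rst("* Intro\nSome text.\n** Details\nMore text.\n"))
```

The result can then be handed to core.publish_string() as before.  Note
that this sketch would also treat a reStructuredText bullet item such as
"* item" as a heading, so it only suits input that is pure outline text.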


From mal@lemburg.com  Thu Oct 31 08:59:10 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 31 Oct 2002 09:59:10 +0100
Subject: [Doc-SIG] epydoc 1.1 release
References: <3DC09D12.9060804@gradient.cis.upenn.edu>	<200210310302.g9V32fK22735@pcp02138704pcs.reston01.va.comcast.net> <3DC0AFC8.4030901@gradient.cis.upenn.edu>
Message-ID: <3DC0F0DE.808@lemburg.com>

Edward Loper wrote:
> - Some advantages of pydoc are:
>       - It provides links to the source code for each module.
>       - It can be used from the command-line to view manpage-like
>         docs.
>       - It can be used from within python (pydoc.help)
>       - It automatically creates intra-documentation links (you
>         might see this as a positive or a negative, since it
>         sometimes creates links where there shouldn't be links;
>         epydoc is more conservative, and will only create links
>         if you tell it to, with epytext markup).
>       - It currently has better support for python 2.2-style
>         types (with wrapper_descriptors, etc.).
>       - It does some processing of comments.  Epydoc just uses
>         docstrings.

I like the output of epydoc a lot (except maybe for the dim
colors ;-). Wouldn't it be possible to add most of the above
in form of options to epydoc ?

What I don't understand about epydoc is why it uses a syntax
that's almost JavaDoc-style, but not all the way ?

Think of it this way: Java programmers are usually very aware
of JavaDoc style comments, so switching to epydoc for Python
programming would probably cause them more trouble due to the
subtle differences than someone who has never worked in this
context before.

Anyway, just a suggestion. Is the doc-string parser pluggable ?

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/


From edloper@gradient.cis.upenn.edu  Thu Oct 31 09:30:56 2002
From: edloper@gradient.cis.upenn.edu (Edward Loper)
Date: Thu, 31 Oct 2002 04:30:56 -0500
Subject: [Doc-SIG] epydoc 1.1 release
References: <3DC09D12.9060804@gradient.cis.upenn.edu>	<200210310302.g9V32fK22735@pcp02138704pcs.reston01.va.comcast.net> <3DC0AFC8.4030901@gradient.cis.upenn.edu> <3DC0F0DE.808@lemburg.com>
Message-ID: <3DC0F850.7010300@gradient.cis.upenn.edu>

M.-A. Lemburg wrote:
> Edward Loper wrote:
> 
>> - Some advantages of pydoc are:
 >>     [...]
> 
> I like the output of epydoc a lot (except maybe for the dim
> colors ;-). Wouldn't it be possible to add most of the above
> in form of options to epydoc ?

For some of these, they would be added (as defaults, not options) if I 
had time to code them.  And there's plenty more on the epydoc todo list 
(see the comment at the bottom of epydoc/__init__.py).

Others don't really go with epydoc's design philosophy.  In particular, 
I doubt epydoc will ever automatically (implicitly) create intra-doc 
links.  This can sometimes make mistakes, and puts links all over the 
place.  I would rather have the user explicitly create links.  I'm also 
unlikely to add support for processing python comments.  And I doubt 
I'll add manpage-style and interpreter (pydoc.help) usage, because pydoc 
already does such a good job at it, and is already part of the standard 
library.

> What I don't understand about epydoc is why it uses a syntax
> that's almost JavaDoc-style, but not all the way ?

Actually, the only real similarity between epytext and javadoc comments 
is that the @field's look roughly similar.  E.g., note that you have to 
use explicit <p>'s in javadoc to mark paragraph boundaries; and you have 
to explicitly use <ul><li></ul> for lists, etc.

I find javadoc's markup conceptually ugly.  The idea of allowing 
unrestricted html code in your docstring really bothers me.  And it 
makes the docstrings very difficult to read when you're looking at the 
source code.  That said, it might be good to add support for 
javadoc-style docstrings, just because it would reduce the learning 
curve for java programmers.  It wouldn't be that technically difficult 
to do; javadoc docstrings are basically just raw html plus @field's. 
And epydoc's docstring processing is pretty compartmentalized.  But I 
only have limited time to spend on epydoc, and that's not a feature that 
I feel very motivated to add.

If someone else wants to add it, I'd certainly accept a patch.  What 
would probably be involved is:

   - Write epydoc/javadoc.py to parse javadoc-style comments.  It
     would probably produce an xml document with a <javadoc> node
     that contains a <rawhtml> node followed by a <fieldlist> node
     similar to epytext's.  Of course, if you wanted to handle
     javadoc's syntax for intradocumentation links, etc, you would
     need to do a little more work.
   - Patch ObjDoc.__parse_docstring in epydoc/objdoc.py to recognize
     'javadoc' as a value for __docformat__.
   - Patch HTML_Formatter._dom_to_html_helper in epydoc/html.py to
     handle <rawhtml> elements.
   - (Optionally) add all of the fields that javadoc implements that
     epydoc does not (e.g., @since and @deprecated).

Then you could just use "--docformat javadoc" to set the default 
docstring format to javadoc, or add "__docformat__='javadoc'" to each 
module that uses javadoc-style docstrings.
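
The first step in the list above, splitting a javadoc-style docstring
into raw HTML plus a list of @fields, might start out something like the
sketch below.  Everything here is hypothetical (the name split_javadoc,
the return shape, the handling of continuation lines); it does not
produce the XML tree described above and ignores javadoc's inline link
syntax.

```python
import re

# A field starts on a line whose first non-blank text is "@tag".
_FIELD_START = re.compile(r"^\s*@(\w+)\s*(.*)$")


def split_javadoc(docstring):
    """Split a docstring into (raw_html, [(tag, text), ...])."""
    body_lines = []
    fields = []
    for line in docstring.strip().splitlines():
        match = _FIELD_START.match(line)
        if match:
            fields.append([match.group(1), match.group(2).strip()])
        elif fields:
            # Continuation line of the most recent field.
            fields[-1][1] = (fields[-1][1] + " " + line.strip()).strip()
        else:
            body_lines.append(line.strip())
    raw_html = "\n".join(body_lines).strip()
    return raw_html, [tuple(field) for field in fields]


if __name__ == "__main__":
    doc = """
    Return the sum of two numbers.
    <p>Both operands must be numeric.

    @param a first operand
    @param b second
        operand
    @return the sum
    """
    print(split_javadoc(doc))
```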

> Think of it this way: Java programmers are usually very aware
> of JavaDoc style comments, so switching to epydoc for Python
> programming would probably cause them more trouble due to the
> subtle differences than someone who has never worked in this
> context before.

I agree that this would reduce the learning curve for java programmers. 
  And it might help make things more consistent for API docs of jython 
programs.  But as I said, I think that javadoc comments are ugly. :)

-Edward


From willg@bluesock.org  Thu Oct 31 14:38:04 2002
From: willg@bluesock.org (will)
Date: Thu, 31 Oct 2002 08:38:04 -0600 (CST)
Subject: [Doc-SIG] epydoc 1.1 release
In-Reply-To: <3DC0F0DE.808@lemburg.com>
Message-ID: <Pine.LNX.4.44.0210310832280.19382-100000@www.bluesock.org>

On Thu, 31 Oct 2002, M.-A. Lemburg wrote:
> 
> Think of it this way: Java programmers are usually very aware
> of JavaDoc style comments, so switching to epydoc for Python
> programming would probably cause them more trouble due to the
> subtle differences than someone who has never worked in this
> context before.

From the trenches, I've been doing Java development professionally since
late 98.  I've also been doing Python development since early 99 or so.  

It took me 2 days to overhaul my Python project's API documentation to use
Epytext with the @param things and I don't have any problem in keeping
Javadoc and Epydoc tags separate in my mind.  There are only a handful I
use in each group, so it's pretty easy to keep straight.  They could
easily fit on a post-it note that could be stuck to your monitor.

This isn't to say that there doesn't exist a group of folks who will get 
confused between the two, but I didn't.

/will