From goodger@users.sourceforge.net  Wed Sep  5 04:29:48 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Tue, 04 Sep 2001 23:29:48 -0400
Subject: [Doc-SIG] reStructuredText tables are done
Message-ID: <B7BB146B.1719B%goodger@users.sourceforge.net>

At long last, the reStructuredText parser understands tables. Once I figured
out how to get it to "see" the individual cells, it was pretty
straightforward. (Basically, we have a queue of upper-left corners, starting
with (0,0). We trace out one rectangular cell, remember it, and add its
upper-right and lower-left corners to the queue of potential upper-left
corners of further cells. Process the queue in top-to-bottom order, keeping
track of how much of each text column has been seen. Elementary. ;-)
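
A rough sketch of the corner-queue idea described above, not the parser's
actual code. It assumes a grid table drawn with "+", "-" and "|", with every
line padded to the same width. The names find_cells, trace_cell and
closes_box are made up for illustration.

```python
def closes_box(grid, top, left, bottom, right):
    """Check the bottom and left edges of a candidate box."""
    if grid[bottom][left] != "+":
        return False
    if any(ch not in "+-" for ch in grid[bottom][left:right + 1]):
        return False
    return all(grid[row][left] in "+|" for row in range(top, bottom + 1))


def trace_cell(grid, top, left):
    """Trace the cell whose upper-left corner is (top, left).

    Returns (bottom, right), or None if no cell starts here.
    """
    if grid[top][left] != "+":
        return None
    for right in range(left + 1, len(grid[top])):
        ch = grid[top][right]
        if ch == "+":
            # Candidate upper-right corner: walk down the right edge.
            for bottom in range(top + 1, len(grid)):
                edge = grid[bottom][right]
                if edge == "+" and closes_box(grid, top, left, bottom, right):
                    return bottom, right
                if edge not in "+|":
                    break
        elif ch != "-":
            return None
    return None


def find_cells(lines):
    """Return a list of (top, left, bottom, right, text) tuples."""
    grid = list(lines)
    last_row, last_col = len(grid) - 1, len(grid[0]) - 1
    seen = [-1] * len(grid[0])   # last row already covered, per text column
    corners = [(0, 0)]           # queue of potential upper-left corners
    cells = []
    while corners:
        top, left = corners.pop(0)
        if top == last_row or left == last_col or top <= seen[left]:
            continue
        found = trace_cell(grid, top, left)
        if found is None:
            continue
        bottom, right = found
        for col in range(left, right):
            seen[col] = bottom - 1
        rows = (row[left + 1:right].strip() for row in grid[top + 1:bottom])
        cells.append((top, left, bottom, right, " ".join(r for r in rows if r)))
        # The new cell's upper-right and lower-left corners may start cells.
        corners.extend([(top, right), (bottom, left)])
        corners.sort()           # process in top-to-bottom order
    return cells


table = [
    "+----+----+",
    "| a  | b  |",
    "+----+----+",
    "| c       |",
    "+---------+",
]
for cell in find_cells(table):
    print(cell)
```

The "seen" list plays the role of "how much of each text column has been
seen": a queued corner that falls inside an already traced cell is skipped.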

CVS has the files, and tonight's snapshots will have them too:

- reStructuredText code & spec:
  http://structuredtext.sourceforge.net/rst-snapshot.tgz
- DPS code & spec (required for the above):
  http://docstring.sourceforge.net/dps-snapshot.tgz

In other recent developments, Garth Kidd's test refactoring has been checked
in. No more 6000-line test file. (Instead, scads of smaller test modules.)

Except for a few details (such as an API for directives), all parser
constructs are complete. It's ready for serious testing; please pound on it
mercilessly.

I received a bug report (along with support files & patch suggestions; great
stuff!) from Remi Bertholet, which I'll check out shortly. Thanks Remi!

Still lots of work to do, documenting the parser, cleaning parts of it up,
working out the DPS APIs, figuring out what Tony's been up to, updating the
DPS PEPs, etc. etc. Any and all contributions gratefully accepted!
 
-- 
David Goodger    goodger@users.sourceforge.net    Open-source projects:
 - Python Docstring Processing System: http://docstring.sourceforge.net
 - reStructuredText: http://structuredtext.sourceforge.net
 - The Go Tools Project: http://gotools.sourceforge.net


From tony@lsl.co.uk  Wed Sep  5 10:00:25 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Wed, 5 Sep 2001 10:00:25 +0100
Subject: [Doc-SIG] reStructuredText tables are done
In-Reply-To: <B7BB146B.1719B%goodger@users.sourceforge.net>
Message-ID: <005e01c135e9$33cd01d0$f05aa8c0@lslp7o.int.lsl.co.uk>

David Goodger wrote (wrt tables):
> Elementary. ;-)

Yes, like I believe you. Congrats - that's a seriously neat bit of work
to have done.

> Except for a few details (such as an API for directives), all parser
> constructs are complete.

Yeah!

> figuring out what Tony's been up to,

Nothing *too* complex, luckily (certainly not compared to parsing
tables).

Another task to add to the list (albeit at a low level) is updating the
DTDs (but I'd appreciate waiting until I've done more of my work first,
as that's a significant contribution to the Python specific DTD), and
considering XMLSchema / TREX (or maybe RELAX NG) representations as
well.

Tibs (garish web pages `R` us)

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
"Bounce with the bunny. Strut with the duck.
 Spin with the chickens now - CLUCK CLUCK CLUCK!"
BARNYARD DANCE! by Sandra Boynton
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)


From fdrake@acm.org  Thu Sep  6 20:34:38 2001
From: fdrake@acm.org (Fred L. Drake)
Date: Thu,  6 Sep 2001 15:34:38 -0400 (EDT)
Subject: [Doc-SIG] [development doc updates]
Message-ID: <20010906193438.C242828845@cj42289-a.reston1.va.home.com>

The development version of the documentation has been updated:

    http://python.sourceforge.net/devel-docs/

Documentation for 2.2 alpha 3.


From goodger@users.sourceforge.net  Sat Sep  8 02:52:12 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Fri, 07 Sep 2001 21:52:12 -0400
Subject: [Doc-SIG] Scoping for implicit link targets? (was Re: Query)
In-Reply-To: <31199EAEAA1FD511969800A0C9AA979C1CA2FB@landis-gyr.landis-gyr.com>
Message-ID: <B7BEF20B.17333%goodger@users.sourceforge.net>

I repeat the original message in its entirety for the benefit of the Doc-SIG
members. How about joining us, Rémi?

Bertholet Rémi wrote:
> Hello,
>
> I have a question.
>
> If I write the follow text :
>
>    Title 1
>    =======
>
>    Introduction
>    ------------
>
>    Title 2
>    =======
>
>    Introduction
>    ------------
>
> In this case I have an error "Duplicate implicit link name: "introduction" "

Actually, it's just a level-0 ("Information") system warning, which is not
intended to be reported, except under some kind of "strict" mode. Level-0
warnings are intended to be removed from the final output. Unless you
explicitly ask to see them, they won't bother you.

> The title "introduction" appear twice but under title. Why the link name is
> not composed with all level of sections names. Example "title1_introduction"
> and "title2_introduction", in this case the name will not be duplicated ?.
>
> Small proposal:
> By default, the links names definitions are only composed by the title name
> (or under title name) (compatibility with existing text), in this case your
> tools search the right link name. If the tools detect an ambiguity for a
> link, only in this case, the user must add the root title in his link.

When the idea of "implicit link targets" came up, I considered some kind of
scoping scheme like this, but decided it was too complex and error-prone.
Most of the time, a title's implicit link will not be referenced, especially
not for repeating titles like this. When an implicit link name duplicates
another link name (implicit or explicit), all duplicate implicit link names
are disabled. If you *do* reference a disabled link name (example: ``See the
Introduction_``), there will be no target, and this will generate a stronger
(probably level-2 or "error") warning, which *will* show up in the output or
in the processing. The level-2 warning will be generated by a later stage of
the DPS processing, after the parser is finished its work.

If you need a hyperlink reference in such a case, you must declare the link
target explicitly, with something like ``.. Title 2 Introduction:`` before
the title you want to reference, and ``See the `Title 2 Introduction`_`` for
the reference.

I think requiring explicit targets is a small price to pay, compared to the
complexity of the alternative.
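
The rule described above can be sketched roughly as follows. This is an
illustration only, not the DPS code: the function names, the tuple layout,
and the name normalization (lowercase, collapsed whitespace) are all
assumptions, as is the handling of several explicit targets sharing a name.

```python
from collections import defaultdict


def normalize(name):
    # Assumption: link names compare case-insensitively, whitespace collapsed.
    return " ".join(name.lower().split())


def usable_targets(targets):
    """Decide which link names can still be referenced.

    `targets` is an iterable of (name, kind, node_id) tuples, where kind is
    "implicit" (from a section title) or "explicit" (declared by the author).
    Returns (usable, notes): a dict of name -> node_id, and a list of
    informational messages.
    """
    by_name = defaultdict(list)
    for name, kind, node_id in targets:
        by_name[normalize(name)].append((kind, node_id))

    usable, notes = {}, []
    for name, entries in by_name.items():
        if len(entries) == 1:
            usable[name] = entries[0][1]
            continue
        # Duplicated name: every implicit target with this name is disabled.
        if any(kind == "implicit" for kind, _ in entries):
            notes.append('Duplicate implicit link name: "%s"' % name)
        explicit = [node_id for kind, node_id in entries if kind == "explicit"]
        if len(explicit) == 1:   # assumption: a lone explicit target survives
            usable[name] = explicit[0]
    return usable, notes


sections = [
    ("Title 1", "implicit", "s1"),
    ("Introduction", "implicit", "s2"),
    ("Title 2", "implicit", "s3"),
    ("Title 2 Introduction", "explicit", "t1"),
    ("Introduction", "implicit", "s4"),
]
usable, notes = usable_targets(sections)
print(sorted(usable))
print(notes)
```

A reference to "Introduction" finds no entry in `usable`, which is where a
later processing stage would raise the stronger warning; the explicitly
declared "Title 2 Introduction" still resolves.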

> Best regard
>
> Rémi BERTHOLET ;-)

Je vous remercie encore pour votre contribution.
(Although I am a native of Montreal, my French is very rusty so that's all
I'm going to attempt! ;-)



From goodger@users.sourceforge.net  Sat Sep  8 03:49:44 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Fri, 07 Sep 2001 22:49:44 -0400
Subject: [Doc-SIG] DPS DTDs
In-Reply-To: <001601c131fe$80f7eb30$f05aa8c0@lslp7o.int.lsl.co.uk>
Message-ID: <B7BEFF87.17409%goodger@users.sourceforge.net>

Tony J Ibbs (Tibs) wrote in "RE: [Doc-SIG] DPS and DOM trees":
> OK. Eventually we need to have direct documentation in there on how it
> all hangs together - the DTD is not enough

True. It needs semantic/usage documentation. I tried to be quite verbose and
explicit with tag names (<paragraph>, <bullet_list>, etc.), but that's not
enough. Ask ten SGML/XML experts what a "paragraph" is and you'll probably
get at least 3.14159265 different answers.

> (indeed, is it still meant to be correct?).

Yes, I do try to keep the DTDs up-to-date with the internal data structure.

Tony J Ibbs (Tibs) wrote in "RE: [Doc-SIG] reStructuredText tables are
done":
> Another task to add to the list (albeit at a low level) is updating the
> DTDs

How do you mean?

> (but I'd appreciate waiting until I've done more of my work first,
> as that's a significant contribution to the Python specific DTD)

Could you explain the contribution to the DTD? (I haven't read your pydps
modules yet, so maybe I'd best be quiet.)



From tony@lsl.co.uk  Mon Sep 10 10:35:29 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Mon, 10 Sep 2001 10:35:29 +0100
Subject: [Doc-SIG] DPS DTDs
In-Reply-To: <B7BEFF87.17409%goodger@users.sourceforge.net>
Message-ID: <00a501c139db$edbe7e70$f05aa8c0@lslp7o.int.lsl.co.uk>

David Goodger wrote (in reply to me):
> > (but I'd appreciate waiting until I've done more of my work first,
> > as that's a significant contribution to the Python specific DTD)
>
> Could you explain the contribution to the DTD? (I haven't
> read your pydps modules yet, so maybe I'd best be quiet.)

Heh, asking questions is a Good Thing!

OK - pydps has three phases (broadly speaking):

1. Acquire a parse tree from a package/module and shove
   it into Jeremy's `compiler` tree structure.

2. Take the information from that tree and create a
   DPS nodes tree therefrom.

3. Take the information from *that* tree and produce
   HTML from it (currently in a rather naff manner for
   speed and to prove the principle).

Obviously the structure of tree 1 is fixed (by the `compiler` module).
The structure of tree 2 is fixed *for the docstrings*, so that bit is
easy. The structure for the "Python" parts of the tree is not fixed -
that is, you wrote a DTD for it, but I'm afraid I've strayed from it
rather (erm, yes, well), and am building a new ad-hoc structure instead,
that seems to fit three broad criteria (isn't "3" such a good number!):

i. It vaguely builds outward from the DPS node tree structure.

ii. Its XML representation seems to me to be vaguely reasonable
    (to the extent that I understand such things) - this isn't
    a *requirement* of the DPS system, but has obviously
    informed your design of the DPS node tree as well.

iii. It's not too hard to generate HTML (and, of course, in the
     back of my mind, LaTeX, reST, and any other odd formats one
     might want).

(i) is the vaguest of these, and you'll (eventually) have to be the
ultimate judge on that. (ii) is based to some extent on my experiences
in the GML world, and mainly comes down to [a] not being scared to have
nested elements and [b] trying to decide when an attribute is sensible
instead of an element. (iii) is mostly to do with when things are a list
element containing "atomic" elements. They all sort of play in the same
direction.

What this means is that the schema is *not* written down anywhere except
implicitly in the code (and in my head), and at some point I need to
write the appropriate XML schema and generate a DTD from it.

As a simple example, here is something that shows what I was working on
last night. Given the Python::

    a = 'b'
    class Fred:
        """A *silly* demonstration."""
        def __init__(self, b=1, c='jim', d=None, f={'a':1,a:1},
                     g=[x for x in [1,2,3] if x > 2]):
            self.list = g

we can produce the XML tree (using the normal methods of doing such from
a DPS nodes tree)::

  <?xml version="1.0" ?>
  <document>
    <py_module filename="U:\reST\pydps\testsimp.py"
               fullname="testsimp" name="testsimp">
      <py_namelist>
        <py_name name="a"/>
      </py_namelist>
      <py_class fullname="testsimp.Fred" name="Fred">
        <py_docstring>
          <paragraph>
            A
            <emphasis>
              silly
            </emphasis>
             demonstration.
          </paragraph>
        </py_docstring>
        <py_method fullname="testsimp.Fred.__init__"
                   name="__init__">
          <py_namelist>
            <py_name name="list"/>
          </py_namelist>
          <py_param_list>
            <py_param>
              self
            </py_param>
            <py_param>
              b=1
            </py_param>
            <py_param>
              c='jim'
            </py_param>
            <py_param>
              d=None
            </py_param>
            <py_param>
              f={'a':1,a:1}
            </py_param>
            <py_param>
              g=[x for x in [1,2,3] if x &gt; 2]
            </py_param>
          </py_param_list>
        </py_method>
      </py_class>
    </py_module>
  </document>

Typically verbose (this *is* XML), but I think it makes sense. As you
might guess, this weekend has been spent working on representation of
the RHS of assignments (I made the mistake of trying to represent
``restructuredtext/states.py``, which has a Getattr node in one of the
argument lists). That work isn't finished yet (it copes with list
comprehensions, but not with, for instance, multiplication!), but it's
actually a good way of getting a better understanding of how the
compiler module works.
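
For readers who want to experiment with the same idea today, here is a small
sketch of phases 1 and 2 in one step. It is not pydps: the `compiler` module
was removed in Python 3, so the standard `ast` module stands in for it, and
`ast.unparse` needs Python 3.9 or later. The element names py_module,
py_class, py_method, py_docstring, py_param_list and py_param come from the
XML above; py_function is a made-up name for module-level functions, and the
docstring is stored as raw text rather than parsed into paragraph nodes.

```python
import ast
import xml.etree.ElementTree as ET

SOURCE = '''
a = 'b'
class Fred:
    """A *silly* demonstration."""
    def __init__(self, b=1, c='jim'):
        self.list = [b, c]
'''


def _params(func):
    """Yield each positional parameter as text, with its default if any."""
    args = func.args.args
    defaults = func.args.defaults
    padded = [None] * (len(args) - len(defaults)) + list(defaults)
    for arg, default in zip(args, padded):
        if default is None:
            yield arg.arg
        else:
            yield "%s=%s" % (arg.arg, ast.unparse(default))


def _walk(node, parent, prefix):
    """Add elements for the docstring, classes and functions under `node`."""
    doc = ast.get_docstring(node)
    if doc:
        ET.SubElement(parent, "py_docstring").text = doc
    for child in node.body:
        if isinstance(child, ast.ClassDef):
            tag = "py_class"
        elif isinstance(child, ast.FunctionDef):
            # "py_function" is hypothetical; the example only shows methods.
            tag = "py_method" if isinstance(node, ast.ClassDef) else "py_function"
        else:
            continue
        fullname = "%s.%s" % (prefix, child.name)
        elem = ET.SubElement(parent, tag, name=child.name, fullname=fullname)
        if isinstance(child, ast.FunctionDef):
            plist = ET.SubElement(elem, "py_param_list")
            for text in _params(child):
                ET.SubElement(plist, "py_param").text = text
        _walk(child, elem, fullname)


def build_tree(source, module_name):
    """Parse Python source and return a <document> element describing it."""
    document = ET.Element("document")
    module = ET.SubElement(document, "py_module",
                           name=module_name, fullname=module_name)
    _walk(ast.parse(source), module, module_name)
    return document


print(ET.tostring(build_tree(SOURCE, "testsimp"), encoding="unicode"))
```

The name lists (py_namelist) and the more awkward right-hand sides are left
out to keep the sketch short.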

Tibs

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
You said "run as root" and "securely" in the same sentence relating to
CGI. You're funny! -- Ignacio Vazquez-Abrams, on the Python list
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)


From Paul.Moore@atosorigin.com  Mon Sep 10 10:53:01 2001
From: Paul.Moore@atosorigin.com (Moore, Paul)
Date: Mon, 10 Sep 2001 10:53:01 +0100
Subject: [Doc-SIG] Producing output
Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5AFF6@UKRUX002.rundc.uk.origin-it.com>

This is probably documented somewhere, but I couldn't find it, so I wonder
if I'm asking the wrong questions...

I understand how to write re-structured text documents, or Python docstrings
in that format. So far, so good - thanks to those who put so much work into
the documentation (and particularly the quick reference card).

So I want to use the format. I have two requirements, and I'm not sure how
to proceed:

1. Given a text file in RST format, how do I produce nicely formatted
   output? Ideally, this question should reduce to the form "what
   command line should I type", but I'm willing to accept that this
   could be over-ambitious :-) Note that "nicely formatted" to me
   generally means something like PDF, or something (eg, (La-)TeX),
   which can be transformed into PDF. HTML is *not* what I have in
   mind (although if HTML is the only current output format available,
   I'd be willing to look at the output formatter, to see if I can
   help write the formatter for something better).

2. Given a Python module with RST docstrings, the same question.

Maybe the answer in both cases is "you can't - we haven't got that far yet".
Which is fine, but I'd say that getting some form of output formatter is a
priority, so that you can start getting feedback from the unwashed masses
like me, who are only looking for something they can use... So the question
then becomes, how soon is such a thing likely to exist, so I can start
helping instead of just posting nuisance E-Mails :-)

Paul.


From tony@lsl.co.uk  Mon Sep 10 14:37:25 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Mon, 10 Sep 2001 14:37:25 +0100
Subject: [Doc-SIG] Producing output
In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5AFF6@UKRUX002.rundc.uk.origin-it.com>
Message-ID: <00aa01c139fd$b9fb0730$f05aa8c0@lslp7o.int.lsl.co.uk>

Moore, Paul (erm, Paul Moore) wrote:
> This is probably documented somewhere, but I couldn't find
> it, so I wonder if I'm asking the wrong questions...

No, it's not documented anywhere yet...

> 1. Given a text file in RST format, how do I produce nicely formatted
>    output? ... Note that "nicely formatted" to me
>    generally means something like PDF, or something (eg, (La-)TeX),
>    which can be transformed into PDF. HTML is *not* what I have in
>    mind...

Does XML meet your requirements?

`restructuredtext/tools/quicktest.py` will emit XML, and there are DTDs
around (i.e., somewhere in the source tree) describing it.

I'm not up-to-date enough with the TeX world to know if there are XML to
TeX translators. Does ReportLab's PDF engine take XML as input?

But also see below...

> 2. Given a Python module with RST docstrings, the same question.
>
> Maybe the answer in both cases is "you can't - we haven't got
> that far yet".

Indeed, that is the case.

> Which is fine, but I'd say that getting some form of output
> formatter is a priority, so that you can start getting
> feedback from the unwashed masses like me, who are only
> looking for something they can use...

Oh, we know, and we also want to be able to use the stuff! But time is
only linear...

At the moment, David is concentrating (I believe) on getting all of the
DPS/reST engines working, to do what is in the specs. At the rate he's
going, I don't know how much longer that will take him. Garth Kidd has
done some work as well, I believe, but again on innards.

Outside that effort (and unfortunately not in CVS), I've been working on
some code that parses Python files, extracting useful information (using
Tools/compiler) and producing an "extended" DPS nodes tree that includes
Python source information as well as any docstrings. This is seriously
unfinished (although also almost useful) and currently has two output
alternatives: XML ('cos it's totally trivial) and HTML (being worked on).
I've chosen HTML because (a) it's a quick fix (LaTeX or PDF would have a
longer generate/consider/change cycle), (b) it's early on the list of
wishes (*lots* of people can "read" HTML) and (c) it's viewable on all
the machines I work on. pydps (for so it is called) can be found at:

	http://www.tibsnjoan.co.uk/reST/pydps.tgz

which is updated every day or so. It doesn't have a decent user
interface (that's niggling at me enough it may change relatively soon),
and still needs a scad more work just to present all of the information
it should, but it's growing rapidly (well, sort of rapidly).

In many ways, the HTML output phase is the least important - the
derivation of a DPS node tree for Python, and the production of *an*
example of an output formatter, is more important (than the specific
format).

Personally (check with David for a more informed opinion!) I'd say that
it would not hurt to have an alternative effort looking at producing
LaTeX or PDF output from a DPS document (I'd advise leaving the
full-blown Python thing until what I'm working on is a bit more mature).
Working from the DPS node tree is quite fun, and we should be able to
stand two independent efforts! I have some ideas on the *form* that an
output parser should take (I like a Writer class that has a __call__
method that does the work), but that's just *my* idea, and I haven't had
feedback from David about what he thinks of this, or how he sees
different formatters integrating into DPS in detail (I suspect he hasn't
*got* detailed ideas yet).

(NB: the *way* that my HTML writer works is *not* something I'd
recommend - it is exceedingly dumb)
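
To make the "Writer class with a __call__ method" idea concrete, here is a
minimal sketch. It is not taken from pydps or DPS: the Node class is a
hypothetical stand-in for a DPS tree node, LatexWriter and its write_*
methods are invented names, and the LaTeX produced is deliberately tiny.

```python
class Node:
    """Hypothetical stand-in for a DPS tree node: a tag plus children."""

    def __init__(self, tag, *children):
        self.tag = tag
        self.children = children   # child Nodes or plain strings


class LatexWriter:
    """Calling the writer on a tree returns the formatted text."""

    SPECIALS = "&%$#_{}"

    def __call__(self, node):
        if isinstance(node, str):
            return self.escape(node)
        # Format the children first, then let a per-tag method wrap them.
        body = "".join(self(child) for child in node.children)
        handler = getattr(self, "write_" + node.tag, None)
        return handler(body) if handler else body

    def escape(self, text):
        """Backslash-escape the simple LaTeX special characters."""
        return "".join("\\" + ch if ch in self.SPECIALS else ch for ch in text)

    def write_paragraph(self, body):
        return body + "\n\n"

    def write_emphasis(self, body):
        return "\\emph{" + body + "}"


tree = Node("document",
            Node("paragraph", "A ", Node("emphasis", "silly"), " demonstration."))
write = LatexWriter()
print(write(tree))
```

Because the writer is just a callable, another format only needs another
class with the same calling convention; tags without a write_* method fall
through with their children's text unchanged.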

It may be worth your joining the DPS and restructured text development
lists (and CVS checking lists) - see

	http://sourceforge.net/mail/?group_id=26626
	http://sourceforge.net/mail/?group_id=7050

(which are linked to off the DPS and reStructuredText home pages).

Tibs
--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
Give a pedant an inch and they'll take 25.4mm
(once they've established you're talking a post-1959 inch, of course)
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)


From goodger@users.sourceforge.net  Tue Sep 11 05:32:22 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Tue, 11 Sep 2001 00:32:22 -0400
Subject: [Doc-SIG] Producing output
In-Reply-To: <00aa01c139fd$b9fb0730$f05aa8c0@lslp7o.int.lsl.co.uk>
Message-ID: <B7C30C14.175B0%goodger@users.sourceforge.net>

Tony J Ibbs (Tibs) wrote:
> I'm not up-to-date enough with the TeX world to know if there are
> XML to TeX translators. Does ReportLabs PDF engine take XML as
> input?

Of course, such a translator would need style sheets in order to
understand the DPS XML output. (Forgive me if this is obvious; I
think it isn't obvious to many people). XML is not a single data
structure, it is a syntax for describing data structures. XML is
like the Latin alphabet; necessary to understand written English,
French, Spanish, Python, etc., but not sufficient.

[Paul Moore]
> > Maybe the answer in both cases is "you can't - we haven't got
> > that far yet".
> 
> Indeed, that is the case.
> 
> > Which is fine, but I'd say that getting some form of output
> > formatter is a priority, so that you can start getting
> > feedback from the unwashed masses like me, who are only
> > looking for something they can use...
> 
> Oh, we know, and we also want to be able to use the stuff! But time
> is only linear...

Any and all help is appreciated!

> At the moment, David is concentrating (I believe) on getting all of
> the DPS/reST engines working, to do what is in the specs.

Correct.

> At the rate he's going, I don't know how much longer that will take
> him.

Is that an expression of confidence or of despair? ;->

Seriously, though, the parser code is almost finished (95%). The
parser/DPS interface is also almost finished, at least until there's
another parser besides reStructuredText (at which time its needs
will be accommodated). Tibs is working on Python docstring mode
code, from docstring extraction to final output, but I haven't
examined it properly yet. It should prove a good source of code and
inspiration.

> In many ways, the HTML output phase is the least important - the
> derivation of a DPS node tree for Python, and the production of *an*
> example of an output formatter, is more important (than the specific
> format).

Indeed, that's the motivation for the entire DPS project. Each component is
independent of the others, as much as possible. Once the interface between
components is established, the addition of input or output formats, styles,
or modes, becomes easy.

> Personally (check with David for a more informed opinion!) I'd say
> that it would not hurt to have an alternative effort looking at
> producing LaTeX or PDF output from a DPS document

Definitely. The more the merrier, and the better the end product will
be. Each output format has its own requirements from its input, and
without multiple implementations we can't generalize.

> I have some ideas on the *form* that an output parser should take
> ... [but] I haven't had feedback from David about what he thinks of
> this,

Yes, apologies. The parser itch is close to being completely scratched
(code anyway), after which I'll turn my attention to other itches.
I'll scratch the parser internal docs itch gradually, especially
once the code has proven itself mature.

> or how he sees different formatters integrating into DPS in
> detail (I suspect he hasn't *got* detailed ideas yet).

Your suspicions are once again well founded. We'll take what you've
written, and I'll work on another output format myself, and we'll
determine the API through what works in practice.



From goodger@users.sourceforge.net  Tue Sep 11 05:32:43 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Tue, 11 Sep 2001 00:32:43 -0400
Subject: [Doc-SIG] DPS DTDs
In-Reply-To: <00a501c139db$edbe7e70$f05aa8c0@lslp7o.int.lsl.co.uk>
Message-ID: <B7C30C2A.175B0%goodger@users.sourceforge.net>

Tony J Ibbs (Tibs) wrote:
> Heh, asking questions is a Good Thing!

Yes, the answers often turn into good docs. They certainly help us to
understand each other.

> The structure for the "Python" parts of the tree
> is not fixed - that is, you wrote a DTD for it, but I'm afraid I've
> strayed from it rather (erm, yes, well), and am building a new
> ad-hoc structure instead

Looking at your example, I think I see what you've done. You've combined the
parse tree with the docstring tree, into one unified structure. Correct?

I think this is fine, for internal use, but I don't see the need to add all
the 'py_*' elements to the DTD. I was envisioning a parse tree data
structure carrying docstring 'leaves', each of which gets parsed into a DPS
nodes tree and interpreted according to the namespace context of its part of
the parse tree (e.g., interpreted text "`b`" in Fred.__init__'s docstring
gets resolved to a 'parameter' node, because that's what 'b' is in
Fred.__init__'s namespace). I think that the parse tree data should remain
distinct from the docstring data. The parse tree deserves and requires a
data structure of its own (and this one doesn't need to be anywhere near
XML).

The DTD is intended as the document tree, to be built up out of smaller
trees grafted together using the knowledge gleaned from the parse tree. The
two types of tree represent fundamentally different information. Forcing the
parse tree to share a schema with the document tree is hypergeneralization.

In dps/spec/ppdi.dtd you'll see the "Additional Structural Elements" provide
the major *documentation* structures necessary for describing Python
packages, modules, classes, methods, functions, and their attributes.
They're not intended to hold the parse tree.

I see the parse tree data distilling down to a tree of local namespace
dictionaries, mapping names to objects in the parse tree. Combining the
namespace dicts correctly results in the overall namespaces of each object
in the parse tree.
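
One way to picture the namespace combination, as a sketch only: each object
gets a local dictionary, and lookups search the innermost dictionary first.
The dictionaries and the "kind" labels below are invented for the Fred
example, and a simple nearest-first chain is an assumption about what
"combining correctly" would mean.

```python
from collections import ChainMap

# Local namespaces for the Fred example; the values are illustrative labels.
module_ns = {"a": "variable", "Fred": "class"}
class_ns = {"__init__": "method"}
init_ns = {"self": "parameter", "b": "parameter", "c": "parameter"}


def resolve(name, *scopes):
    """Return what `name` refers to, searching the innermost scope first."""
    return ChainMap(*scopes).get(name)


# Interpreted text `b` inside Fred.__init__'s docstring:
print(resolve("b", init_ns, class_ns, module_ns))      # parameter
print(resolve("Fred", init_ns, class_ns, module_ns))   # class
```

This is looser than Python's own scoping rules (a method body does not see
class-level names directly), which may be acceptable for documentation
lookups but would need deciding.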

> (I made the mistake of trying to represent
> ``restructuredtext/states.py``, which has a Getattr node in one of the
> argument lists)

I don't follow. Can you point the way?



From Paul.Moore@atosorigin.com  Tue Sep 11 09:32:09 2001
From: Paul.Moore@atosorigin.com (Moore, Paul)
Date: Tue, 11 Sep 2001 09:32:09 +0100
Subject: [Doc-SIG] Producing output
Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5AFF8@UKRUX002.rundc.uk.origin-it.com>

From: David Goodger [mailto:goodger@users.sourceforge.net]
> Any and all help is appreciated!

OK, looks like I accidentally volunteered myself :-)

> > Personally (check with David for a more informed opinion!) I'd say
> > that it would not hurt to have an alternative effort looking at
> > producing LaTeX or PDF output from a DPS document
> 
> Definitely. The more the merrier, and the better the end product will
> be. Each output format has its own requirements from its input, and
> without multiple implementations we can't generalize.

OK. As my interest is with the output side of things, I guess I should work
on that. So I need to grab the necessary bits of code to take a RST document
and generate the intermediate bits, and look at writing an output backend
onto that.

Sounds fair enough.

I grabbed the latest daily snapshot of the RST and DPS projects. Looks like
that is enough to start with. Is there anything else I need? I just picked
up Tibs' pydps stuff as well. I can't work out (yet) if that's also
relevant.

> > I have some ideas on the *form* that an output parser should take
> > ... [but] I haven't had feedback from David about what he thinks of
> > this,
> 
> Yes, apologies. The parser itch is close to being completely scratched
> (code anyway), after which I'll turn my attention to other itches.
> I'll scratch the parser internal docs itch gradually, especially
> once the code has proven itself mature.

I'll do some scratching of the output itch, but my biggest itch at the
moment is for documentation of the internal phases (parser, intermediate
representation, output?) and the data structures used to communicate between
them. Unfortunately, this feels like one of those itches in the small of
your back - really annoying, but needs someone else to scratch it for you
:-)

I'll go code-diving and see what I come up with, though.

> > or how he sees different formatters integrating into DPS in
> > detail (I suspect he hasn't *got* detailed ideas yet).
> 
> Your suspicions are once again well founded. We'll take what you've
> written, and I'll work on another output format myself, and we'll
> determine the API through what works in practice.

As I say, I'll look at something, too. Probably (La)TeX, as I can get PDF
from that. But I'm low on tuits, so it may take a while before I get
anything useful...

Paul.


From Paul.Moore@atosorigin.com  Tue Sep 11 09:33:26 2001
From: Paul.Moore@atosorigin.com (Moore, Paul)
Date: Tue, 11 Sep 2001 09:33:26 +0100
Subject: [Doc-SIG] Producing output
Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5AFF9@UKRUX002.rundc.uk.origin-it.com>

From: Tony J Ibbs (Tibs) [mailto:tony@lsl.co.uk]
> It may be worth your joining the DPS and restructured text development
> lists (and CVS checking lists) - see
> 
> 	http://sourceforge.net/mail/?group_id=26626
> 	http://sourceforge.net/mail/?group_id=7050
> 
> (which are linked to off the DPS and reStructuredText home pages).

Oh, goody - more mailing lists. Are there downloadable archives (mbox
format, or the like)? The SourceForge Geocrawler archives are hopeless for
browsing to get a feel for what's been going on...

Paul.


From tony@lsl.co.uk  Tue Sep 11 10:07:33 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Tue, 11 Sep 2001 10:07:33 +0100
Subject: [Doc-SIG] Producing output
In-Reply-To: <B7C30C14.175B0%goodger@users.sourceforge.net>
Message-ID: <00af01c13aa1$313bc1b0$f05aa8c0@lslp7o.int.lsl.co.uk>

David Goodger wrote (partly in response to me):
> Of course, such a translator [1]_ would need style sheets in order
> to understand the DPS XML output. (Forgive me if this is obvious; I
> think it isn't obvious to many people). XML is not a single data
> structure, it is a syntax for describing data structures. XML is
> like the Latin alphabet; necessary to understand written English,
> French, Spanish, Python, etc., but not sufficient.

.. [1] [La]TeX, PDF, etc.

or, in my field, the UK national geographic transfer format - people
often ask if we can read/write "NTF" when they *actually* are asking a
much too broad question - they need to specify a format that *uses* NTF.

> > At the rate he's going, I don't know how much longer that will take
> > him.
>
> Is that an expression of confidence or of despair? ;->

Oh, amazement and confidence, don't worry!

> Seriously, though, the parser code is almost finished (95%).

That's what I suspected (well, not the exact percentage value!).

> parser/DPS interface is also almost finished, at least until there's
> another parser besides reStructuredText (at which time its needs
> will be accommodated).

Although I'm not sure how much demand there would be for such a thing...

> Tibs is working on Python docstring mode
> code, from docstring extraction to final output, but I haven't
> examined it properly yet. It should prove a good source of code and
> inspiration.

That was my feeling - it's a prototype, from which the final will come
(possibly from more refactoring).

> > or how he sees different formatters integrating into DPS in
> > detail (I suspect he hasn't *got* detailed ideas yet).
>
> Your suspicions are once again well founded. We'll take what you've
> written, and I'll work on another output format myself, and we'll
> determine the API through what works in practice.

ooh - which one? reST itself, maybe?

Tibs



From tony@lsl.co.uk  Tue Sep 11 10:09:24 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Tue, 11 Sep 2001 10:09:24 +0100
Subject: [Doc-SIG] Producing output
In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5AFF9@UKRUX002.rundc.uk.origin-it.com>
Message-ID: <00b001c13aa1$73a84dc0$f05aa8c0@lslp7o.int.lsl.co.uk>

> Oh, goody - more mailing lists.

Strangely enough, just what I said when David pointed me at them!

However the traffic level isn't too great, and the checkin messages
*are* worth getting (David's not bad at the summary text, and that
normally lets me know if there's something I care about).

> Are there downloadable archives (mbox format, or the like)?

I'll leave David to answer that...

Tibs
--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
"Bounce with the bunny. Strut with the duck.
 Spin with the chickens now - CLUCK CLUCK CLUCK!"
BARNYARD DANCE! by Sandra Boynton
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)


From tony@lsl.co.uk  Tue Sep 11 10:28:18 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Tue, 11 Sep 2001 10:28:18 +0100
Subject: [Doc-SIG] Producing output
In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5AFF8@UKRUX002.rundc.uk.origin-it.com>
Message-ID: <00b101c13aa4$173ee550$f05aa8c0@lslp7o.int.lsl.co.uk>

Paul Moore wrote:
> OK, looks like I accidentally volunteered myself :-)

Welcome aboard (from someone hanging off the side on a raft at the
moment!).

> I grabbed the latest daily snapshot of the RST and DPS
> projects. Looks like that is enough to start with.

Probably. ``restructuredtext/tools/quicktest.py`` is an example of an
initial "output command". Also, see below.

I found studying dps/dps/nodes.py (is that right?) fairly instructive -
that's the datastructure we're working with.

> Is there anything else I need?

Well, it's worth keeping up to date with dps and reST sources. If you've
got CVS, no problem, but I don't and thus I just grab the snapshots,
well, daily.

It may or may not be relevant, but I'm working on Windows/NT at work,
using Python 2.1, and on Debian at home, using Python 2.2a1 (soon to be
a3). I know David is working on a Mac, and I know it's been agreed not
to support Python 1.5.2 (so += and list comprehensions are OK, but
better not to use generators yet!).

As I said before, I'd recommend working on DPS documents until the
Python parsing is done - that's useful enough!

> I just picked up Tibs' pydps stuff as well.
> I can't work out (yet) if that's also relevant.

OK. Mostly not. pydps.py is a fairly grotty user interface (!) - but the
"text" argument leads to some code that (like quicktest) parses and
presents a DPS text file.

It *might* be worth looking at the *structure* (but not the code!) of
html.py - I think that the use of a baseclass that implements the
formatting of a DPS, and can be subclassed for (e.g.) Python is a Good
Thing (I don't particularly care what it's called). I like the meme of
using a callable class - there are *so many* variables that one *might*
want to change in a formatter, but normally don't, that I think using a
class / class instance to hold them is useful. It means that the actual
``__call__`` can be kept fairly simple, and it also makes it easier to
"reuse" a particular formatter instance for more than one document.
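
A minimal sketch of the callable-class idea, for illustration only. The
``Node`` and ``Formatter`` classes and the ``indent``/``show_tags``
settings below are hypothetical stand-ins, not the actual dps/nodes.py or
html.py code. The settings live on the instance, ``__call__`` stays
small, and one instance can be reused for several documents.

```python
class Node:
    """Hypothetical stand-in for a DPS node: a tag name plus children."""

    def __init__(self, tagname, *children):
        self.tagname = tagname
        self.children = list(children)  # child Nodes or plain strings


class Formatter:
    """Base formatter: options are held on the instance, not passed around."""

    def __init__(self, indent="  ", show_tags=True):
        self.indent = indent
        self.show_tags = show_tags

    def __call__(self, document):
        # The call itself only starts the walk and joins the result.
        lines = []
        self.walk(document, 0, lines)
        return "\n".join(lines)

    def walk(self, node, depth, lines):
        pad = self.indent * depth
        if isinstance(node, str):
            lines.append(pad + node)
            return
        if self.show_tags:
            lines.append(pad + "<" + node.tagname + ">")
        for child in node.children:
            self.walk(child, depth + 1, lines)


doc = Node("document", Node("paragraph", "Hello"))
other = Node("document", Node("paragraph", "World"))

fmt = Formatter(indent="    ")   # configure once...
print(fmt(doc))                  # ...then reuse for more than one document
print(fmt(other))
```

A subclass (for Python docstrings, say) would override ``walk`` or add to
it, while inheriting the settings and the simple ``__call__``.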

The reason I say that the *code* of html.py might not be relevant is
that it is *very* quick and dirty - it doesn't assume direct knowledge
of the DPS node tree (at the moment) but just, well, hopes that if
``write_html()`` is called on each node, then something good will
happen. Of course, later on some propagation of information round the
code will be needed (if only to get the page title established!), but
quick and dirty gets me something I can work with now.

> I'll do some scratching of the output itch, but my biggest itch at the
> moment is for documentation of the internal phases (parser,
> intermediate representation, output?) and the data structures used to
> communicate between them.

As I said, reading nodes.py is useful. It's sufficiently close to
XML/DOM that you can leverage off any knowledge of that to work out what
is going on. The *advantage* is that David has some more useful methods
in there that are not (necessarily) in DOM.

> As I say, I'll look at something, too. Probably (La)TeX, as I
> can get PDF from that. But I'm low on tuits, so it may take a
> while before I get anything useful...

LaTeX is good because it's another popular format, and we'd need to do
it eventually. It's the one I would have probably gone for next (well,
I prefer TeX to LaTeX for this sort of simple document, but whatever).

On the other hand, this could be a really good opportunity to learn how
to use the ReportLab (http://www.reportlab.com/) stuff as well, and
produce PDF directly. Mind you, someone will get round to that someday
(maybe even ReportLab, I suppose, if DPS takes off).

Tibs

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
"Bounce with the bunny. Strut with the duck.
 Spin with the chickens now - CLUCK CLUCK CLUCK!"
BARNYARD DANCE! by Sandra Boynton
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)


From tony@lsl.co.uk  Tue Sep 11 10:34:52 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Tue, 11 Sep 2001 10:34:52 +0100
Subject: [Doc-SIG] DPS file extensions
In-Reply-To: <B7C30C2A.175B0%goodger@users.sourceforge.net>
Message-ID: <00b201c13aa5$02161120$f05aa8c0@lslp7o.int.lsl.co.uk>

I've noticed that all of the text files in the DPS and reST
distributions (that is, in the ``spec`` directories) have extension
``.txt``, regardless of whether they are PEPs, DPS/reST text, or (if
there are any!) plain ordinary text.

Whilst this is, of course, correct in a minimalist sense, it isn't very
helpful - running current quicktest/pydps over a PEP doesn't work very
well (well, I assume not), nor is a random text file likely to be
usefully processed (hmm - luck may hold here).

Could I suggest that we coin a standard extension for DPS/reST files?
(I'm willing to cope with a file that is called ``pep-XXX.txt`` as being
recognisably a PEP!).

I *had* thought, initially, of ``.dps``, but of course we *actually*
want to specify the particular format, not the organisational scheme, so
I would thus prefer ``.rest`` (I think going for pseudo-arbitrary
capitalisation in an extension might not be a good thing!).

Thus (in ``dps/spec``, for instance), we'd have::

    dps-notes.rest
    dps.cat
    gpdi.dtd
    pep-0256.txt (and so on - and maybe one day pep-0256.rest)
    ...
    python-docstring-mode.rest

(we already know that I prefer ``.rest`` to ``.rst``...)

Tibs
--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
"Bounce with the bunny. Strut with the duck.
 Spin with the chickens now - CLUCK CLUCK CLUCK!"
BARNYARD DANCE! by Sandra Boynton
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)


From tony@lsl.co.uk  Tue Sep 11 10:59:50 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Tue, 11 Sep 2001 10:59:50 +0100
Subject: [Doc-SIG] DPS DTDs
In-Reply-To: <B7C30C2A.175B0%goodger@users.sourceforge.net>
Message-ID: <00b301c13aa8$7ef5cfc0$f05aa8c0@lslp7o.int.lsl.co.uk>

David Goodger wrote:
> Looking at your example, I think I see what you've done.
> You've combined the parse tree with the docstring tree,
> into one unified structure. Correct?

Yep. Or maybe no (I said 'yep' originally, but by the bottom of my reply
I'm not sure).

> I think this is fine, for internal use, but I don't see the
> need to add all the 'py_*' elements to the DTD.

<<Lots of deletia>>

I'll try to explain my point of view, since I'm not sure I see yours(!).

I see that we are producing a "document" (in some sense - normally
actually a *real* document, suitable for printing, or if one must,
viewing with a browser). Thus for a Python package or module, that
document should provide the information about that module. Some of the
components of that information (tree) are indeed docstrings, and some of
those docstrings will be DPS/reST structured. To me it doesn't make
sense to have an artificial boundary at any level - so a docstring that
is not parsed would be <literal> text (is that the right tag?), and one
that is parsed will be full of DPS nodes, and both will occur within a
<py_docstring>. Yes, we *could* say there is a special watermark between
the Python-esque stuff and the docstring innards, but I don't see the
point.

Now, in ``dps/spec`` you have ``ppdi.dtd``, which certainly *looks* to
me as if it is doing the same job as I am doing with my <py_xxx> tags -
that is, extending the DPS nodes tree "outwards" from the docstring into
the Python code. Of course, given my predilections, I may be
misinterpreting your original intent.

My point was simply that I am not, particularly, following that DTD -
but I obviously (trivially, since I can output XML!) am following *some*
(virtual) DTD. And it would be nice to write that down (be it as DTD or
XML Schema or whatever) at some point.

> I was envisioning a parse tree data structure carrying docstring
> 'leaves', each of which gets parsed into a DPS nodes tree

I think that's what I've been doing...

> and interpreted according to the namespace context
> of its part of the parse tree

...and that's obviously on my ToDo list! (i.e., putting in all those
nice links we want to have throughout the document.)

Note that the fact that the DPS/reST (gosh, I feel like someone
insisting on saying GNU/Linux!) will *contain* references out to the
Python is another reason I feel that this is all one continuous
datastructure - I don't want to distinguish (in my mind) between links
intra-docstring and links between the docstring elements and the Python
code.

> The parse tree deserves and requires a data structure of its own

which it has, in that produced by Jeremy's compiler tool, but in fact
it's quite advantageous to turn that into a DPS node tree fragment as
well.

(hmm - and I just realised *why* - if the two components (inner and
outer, for want of better term) are *discontinuous* in structure, then
it makes it harder to write a Formatter/Writer - it would need to know
about the Python bits and the docstring bits independently - it's *much*
easier (conceptually and implementationally (?yuck)) to have a baseclass
that understands DPS and a subclass that *adds in* how to understand the
"surrounding" Python.)
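
To make the baseclass/subclass argument concrete, here is a small sketch
under stated assumptions: ``Node``, ``DPSWriter``, ``PythonWriter`` and
the ``write_<tagname>`` dispatch are all invented for illustration, and
``py_function`` is a made-up example of a ``py_xxx`` element. Because
the tree is continuous, the subclass only has to *add* handlers for the
Python elements.

```python
class Node:
    """Hypothetical node: a tag name plus child nodes or strings."""

    def __init__(self, tagname, *children):
        self.tagname = tagname
        self.children = list(children)


class DPSWriter:
    """Understands only the DPS elements."""

    def write(self, node):
        if isinstance(node, str):
            return node
        # Dispatch on the tag name; unknown tags fall through to default.
        handler = getattr(self, "write_" + node.tagname, self.write_default)
        return handler(node)

    def write_children(self, node):
        return "".join(self.write(child) for child in node.children)

    def write_default(self, node):
        # Elements without a handler contribute only their children.
        return self.write_children(node)

    def write_paragraph(self, node):
        return "<p>" + self.write_children(node) + "</p>"


class PythonWriter(DPSWriter):
    """Adds in the "surrounding" Python elements; DPS handling is inherited."""

    def write_py_function(self, node):
        return "<div class='function'>" + self.write_children(node) + "</div>"

    def write_py_docstring(self, node):
        return "<blockquote>" + self.write_children(node) + "</blockquote>"


tree = Node("py_function",
            Node("py_docstring", Node("paragraph", "Does a thing.")))
print(PythonWriter().write(tree))
print(DPSWriter().write(tree))   # the base class still copes with the same tree
```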

> The DTD is intended as the document tree, to be built up out
> of smaller trees grafted together using the knowledge gleaned
> from the parse tree.

That's the DPS DTD. But there's no reason that the Python DTD shouldn't
reference the DPS DTD - indeed, that's how things are *meant* to work
(at least in the XML Schema world - and that's why I wittered for a bit
about namespaces earlier as well).

Remember, of course, we are *not* producing XML except when we ask the
DPS node tree to do so!

> The two types of tree represent fundamentally different
> information.

I see we disagree - it's all document (erm - serialisation of
information).

> Forcing the parse tree to share a schema with the document
> tree is hypergeneralization.

I really think we might be talking past each other, because what I'm
doing is so simple and obvious that I find it hard to call it
hypergeneralisation - I'm not losing anything, and I'm gaining quite a
lot.

I'm using the compiler parse tree to hold the parse tree, and generating
documentation (as part of a DPS node tree) from it. That's obvious to me.
Are we just confusing each other with words?

> In dps/spec/ppdi.dtd you'll see the "Additional Structural
> Elements" provide the major *documentation* structures necessary
> for describing Python packages, modules, classes, methods, functions,
> and their attributes. They're not intended to hold the parse tree.

They are clearly a start on holding the information one needs to report
on in a document. They didn't do enough for me, which is why I'm not
using them.

However, do note that, at the moment, I am performing a simple serial
"dump" of information (well, it's a tree walk, of course, but you know
what I mean) to produce the HTML (that's why I call it quick-and-dirty),
so I don't see that I'm putting any extraneous information into the DPS
nodes tree.

As to dictionaries and namespaces - that's for the future - at the
moment I'm still trying to represent the fundamentals of the Python
information that we want to output in a helpful document (such as
function signatures).

(ooh, ooh, another argument - if we get Grouch integrated at some future
stage, then it is *quite* likely that tools *will* want to use the
"innards" of a docstring and the "outers" of a function definition in a
coherent manner)

> > (I made the mistake of trying to represent
> > ``restructuredtext/states.py``, which has a Getattr node in
> > one of the argument lists)
>
> I don't follow. Can you point the way?

Oh - sorry, implementation wittering. If one has a function/method
defined as::

   def fun(a=x.y.z):
       ...

then one needs to know how to represent the right hand side of the
assignment, to be able to describe the function signature. And you had
some, erm, interesting function definitions.
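
As an illustration of the problem (not the code discussed in this
thread), here is a sketch using the ``ast`` module of current Python.
That is an assumption made for the sake of a runnable example: the work
described here used the ``compiler`` package, where the node in question
is ``Getattr``; in ``ast`` the equivalent node is ``Attribute``.
``describe_signature`` is a hypothetical helper name, and
``ast.unparse`` needs Python 3.9 or later.

```python
import ast


def describe_signature(source):
    """Return 'name(arg, arg=default, ...)' for the first def in source.

    Only plain positional arguments are handled; *args, **kwargs and
    keyword-only arguments are ignored to keep the sketch short.
    """
    func = ast.parse(source).body[0]
    names = [arg.arg for arg in func.args.args]
    defaults = func.args.defaults
    # Defaults belong to the last len(defaults) positional arguments.
    first_default = len(names) - len(defaults)
    parts = names[:first_default]
    for name, default in zip(names[first_default:], defaults):
        # ast.unparse turns the expression node back into source text,
        # so an attribute chain such as x.y.z needs no special casing.
        parts.append(name + "=" + ast.unparse(default))
    return func.name + "(" + ", ".join(parts) + ")"


print(describe_signature("def fun(a=x.y.z):\n    ...\n"))
```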

Tibs

    (who must stop hacking on this message and
     do some paid work)

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
"How fleeting are all human passions compared with the massive
continuity of ducks." - Dorothy L. Sayers, "Gaudy Night"
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)


From Paul.Moore@atosorigin.com  Tue Sep 11 11:07:55 2001
From: Paul.Moore@atosorigin.com (Moore, Paul)
Date: Tue, 11 Sep 2001 11:07:55 +0100
Subject: [Doc-SIG] Producing output
Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5AFFD@UKRUX002.rundc.uk.origin-it.com>

From: Tony J Ibbs (Tibs) [mailto:tony@lsl.co.uk]
> On the other hand, this could be a really good opportunity to 
> learn how to use the ReportLab (http://www.reportlab.com/) stuff
> as well, and produce PDF directly. Mind you, someone will get
> round to that someday (maybe even ReportLab, I suppose, if DPS
> takes off).

I thought about that, but I wasn't sure if the extra dependency was the sort
of thing which might go against it. I suppose that for an output formatter,
extra dependencies aren't so bad (if you don't have the ReportLab stuff, you
can't produce PDF directly, so use HTML, or go via TeX, or whatever...)

Paul.


From Paul.Moore@atosorigin.com  Tue Sep 11 11:24:17 2001
From: Paul.Moore@atosorigin.com (Moore, Paul)
Date: Tue, 11 Sep 2001 11:24:17 +0100
Subject: [Doc-SIG] Producing output
Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5AFFE@UKRUX002.rundc.uk.origin-it.com>

From: David Goodger [mailto:goodger@users.sourceforge.net]
> Tony J Ibbs (Tibs) wrote:
> > I'm not up-to-date enough with the TeX world to know if there are
> > XML to TeX translators. Does ReportLabs PDF engine take XML as
> > input?
> 
> Of course, such a translator would need style sheets in order to
> understand the DPS XML output. (Forgive me if this is obvious; I
> think it isn't obvious to many people). XML is not a single data
> structure, it is a syntax for describing data structures. XML is
> like the Latin alphabet; necessary to understand written English,
> French, Spanish, Python, etc., but not sufficient.

Um. It wasn't obvious to me. And it still isn't. What are we talking about
here? I know very little about XML, and I didn't see XML as of particular
relevance for writing an output formatter. I'm expecting to just be
tree-walking a data structure (which may be a DOM, but who cares?). Style
sheets, namespaces and the like don't fit into my picture at all.

Have I missed something fundamental?

(*Must* get round to reading the code!)

Paul.


From tony@lsl.co.uk  Tue Sep 11 11:47:39 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Tue, 11 Sep 2001 11:47:39 +0100
Subject: [Doc-SIG] Producing output
In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5AFFE@UKRUX002.rundc.uk.origin-it.com>
Message-ID: <00b601c13aaf$2d6e5da0$f05aa8c0@lslp7o.int.lsl.co.uk>

With respect to XML style sheets and other gumph, Paul Moore wrote:
> Um. It wasn't obvious to me. And it still isn't. What are we
> talking about here? I know very little about XML, and I didn't
> see XML as of particular relevance for writing an output formatter.

Ah, there's history in here.

At one stage, I envisaged that a DOM tree (and thus, XML) would be used
as the "glue" between different phases of things - so the parser would
emit a DOM tree, and the formatter would eat it.

Guido (and others) pointed out that DOM trees can be a bit awkward to
use, and that Python has a perfectly good object model, so why transform
the "internal" tree (for you *would* have to be mad to be constructing
the internal tree in DOM - I know, I tried it) into something else
rather than using it directly.

So that's the route being taken (whether David was ever listening to
what I was saying, and would have gone the XML/DOM route, I don't know.
He's quite sensible, you know...)

*However*, the DPS nodes tree (for such I am talking about) is clearly
heavily influenced by XML/DOM - this gives the significant advantage
that it is trivial to transform it *into* a DOM tree, and thence into
XML (which is itself a useful output format, so we're actually up one
format already).
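
A rough sketch of why that transformation is trivial, assuming a node
with a tag name, attributes and children. The ``Node`` class and the
``to_dom`` helper are hypothetical, not the real dps/nodes.py API; the
DOM side uses the standard library's ``xml.dom.minidom``.

```python
from xml.dom.minidom import getDOMImplementation


class Node:
    """Hypothetical DPS-like node: tag name, attributes, children."""

    def __init__(self, tagname, *children, **attributes):
        self.tagname = tagname
        self.children = list(children)   # child Nodes or plain strings
        self.attributes = attributes


def to_dom(root):
    """Build a DOM document that mirrors the node tree one-to-one."""
    dom = getDOMImplementation().createDocument(None, root.tagname, None)
    _fill(dom, dom.documentElement, root)
    return dom


def _fill(dom, element, node):
    # Copy attributes, then recurse over the children in order.
    for name, value in node.attributes.items():
        element.setAttribute(name, str(value))
    for child in node.children:
        if isinstance(child, str):
            element.appendChild(dom.createTextNode(child))
        else:
            sub = dom.createElement(child.tagname)
            element.appendChild(sub)
            _fill(dom, sub, child)


tree = Node("document", Node("section", Node("title", "Intro"), id="intro"))
xml_text = to_dom(tree).documentElement.toxml()
print(xml_text)
```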

And in an XML world, DTDs or other schema describing pedagogies are a
useful thing to have around - it's another angle on the documentation of
things.

(indeed, having that similarity also gives one good leverage in
*understanding* what the DPS nodes tree is about - simple things like
the use of tagnames, children, attributes, etc. - terminology, I guess -
what it *misleads* me about I haven't yet found!)

Furthermore, I still think that output of DOM/XML will be useful *some*
of the time - for instance the PyPaSax people may be able to make direct
use of at least the DPS stuff (they have their own equivalent of my
compiler->Python stuff, using the basic AST and SAX - scary stuff!).

> I'm expecting to just be tree-walking a data structure

Exactly.

> (which may be a DOM, but who cares?).

Well, if it were DOM your life would be more complex, but that's about
it.

> Style sheets, namespaces and the like don't fit into my
> picture at all.
>
> Have I missed something fundamental?

No, but I have a tendency to borrow (often inaccurately, I'm afraid)
terminology - and if I were writing an XML schema for our stuff, I would
want to use namespaces to localise stuff, 'cos that's what they're for.
But it's not something to worry about.

(personally, I *like* XML, but not entirely for its <tag>...</tag>
stuff - more for the way it and the related standards give interesting
ways of thinking about a document - both as a tree structure and as a
linear structure, both at the same time. That's why I liked SGML as
well, but I've never *used* SGML, and XML is much simpler, and more
accessible (hmm - can I really claim to have *used* XML?). After all,
XML per-se is just one serialisation of the infoset [1]_)

.. [1] yes, please picture a tongue-in-cheek there.
   Pretty please.

Tibs

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
"Bounce with the bunny. Strut with the duck.
 Spin with the chickens now - CLUCK CLUCK CLUCK!"
BARNYARD DANCE! by Sandra Boynton
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)


From fdrake@acm.org  Wed Sep 12 03:13:42 2001
From: fdrake@acm.org (Fred L. Drake)
Date: Tue, 11 Sep 2001 22:13:42 -0400 (EDT)
Subject: [Doc-SIG] [development doc updates]
Message-ID: <20010912021342.ABF4928845@beowolf.digicool.com>

The development version of the documentation has been updated:

    http://python.sourceforge.net/devel-docs/

Miscellaneous updates, plus documentation for the new "hmac" module
(located in the crypto chapter of the Library Reference).


From goodger@users.sourceforge.net  Wed Sep 12 03:59:27 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Tue, 11 Sep 2001 22:59:27 -0400
Subject: [Doc-SIG] Producing output
In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5AFF8@UKRUX002.rundc.uk.origin-it.com>
Message-ID: <B7C431C4.17662%goodger@users.sourceforge.net>

Moore, Paul wrote:
> OK, looks like I accidentally volunteered myself :-)

Glad to have you.

> I'll do some scratching of the output itch, but my biggest itch at
> the moment is for documentation of the internal phases (parser,
> intermediate representation, output?) and the data structures used
> to communicate between them.

Understood. My documentation itch is getting stronger. Are itches
contagious?

"The itches of the many out-itch the itches of the few -- or the one."

> Unfortunately, this feels like one of those itches in the small of
> your back - really annoying, but needs someone else to scratch it
> for you

Good analogy :-)

-- 
David Goodger    goodger@users.sourceforge.net    Open-source projects:
 - Python Docstring Processing System: http://docstring.sourceforge.net
 - reStructuredText: http://structuredtext.sourceforge.net
 - The Go Tools Project: http://gotools.sourceforge.net


From goodger@users.sourceforge.net  Wed Sep 12 03:59:30 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Tue, 11 Sep 2001 22:59:30 -0400
Subject: [Doc-SIG] Producing output
In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5AFF9@UKRUX002.rundc.uk.origin-it.com>
Message-ID: <B7C4320B.17663%goodger@users.sourceforge.net>

Moore, Paul wrote:
> Oh, goody - more mailing lists. Are there downloadable archives
> (mbox format, or the like)? The SourceForge Geocrawler archives are
> hopeless for browsing to get a feel for what's been going on...

I don't know of any decent archives. How about petitioning SourceForge
*and* Geocrawler to add them? I will too. I'm sure they've been asked
before, but if everybody pesters them, maybe we'll see results.

I'd be happy to keep archives on the project web sites, if anybody is
willing to create & maintain them.

-- 
David Goodger    goodger@users.sourceforge.net    Open-source projects:
 - Python Docstring Processing System: http://docstring.sourceforge.net
 - reStructuredText: http://structuredtext.sourceforge.net
 - The Go Tools Project: http://gotools.sourceforge.net


From goodger@users.sourceforge.net  Wed Sep 12 03:59:31 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Tue, 11 Sep 2001 22:59:31 -0400
Subject: [Doc-SIG] Producing output
In-Reply-To: <00af01c13aa1$313bc1b0$f05aa8c0@lslp7o.int.lsl.co.uk>
Message-ID: <B7C4325E.17664%goodger@users.sourceforge.net>

Tony J Ibbs (Tibs) wrote:
[David]
> > Your suspicions are once again well founded. We'll take what
> > you've written, and I'll work on another output format myself, and
> > we'll determine the API through what works in practice.

[Tony]
> ooh - which one? reST itself, maybe?

That's a good question. My answer would have to be a definite "I
dunno."

-- 
David Goodger    goodger@users.sourceforge.net    Open-source projects:
 - Python Docstring Processing System: http://docstring.sourceforge.net
 - reStructuredText: http://structuredtext.sourceforge.net
 - The Go Tools Project: http://gotools.sourceforge.net


From goodger@users.sourceforge.net  Wed Sep 12 03:59:31 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Tue, 11 Sep 2001 22:59:31 -0400
Subject: [Doc-SIG] Re: DPS file extensions
In-Reply-To: <00b201c13aa5$02161120$f05aa8c0@lslp7o.int.lsl.co.uk>
Message-ID: <B7C432DB.17665%goodger@users.sourceforge.net>

Tony J Ibbs (Tibs) wrote:
> I've noticed that all of the text files in the DPS and reST
> distributions (that is, in the ``spec`` directories) have extension
> ``.txt``, regardless of whether they are PEPs, DPS/reST text, or (if
> there are any!) plain ordinary text.
...
> Could I suggest that we coin a standard extension for DPS/reST files?
...
> (we already know that I prefer ``.rest`` to ``.rst``...)

I suppose that's possible, but I'd rather not. The reason I gave the
files .txt extensions in the first place was for cross-platform
compatibility. I don't need extensions on my home machine at all. The
Mac treats metadata sensibly; metadata embedded in the filename is a
crude hack that MacOS avoids (or avoided; reportedly, MacOS X seems to
have caved in to peer pressure). But I work on different platforms and
I'd much rather I (and you!) get a nice text file icon than a blank
one.

We'd all have to teach our systems that '.rest' means '.txt'. Most
people won't, and I don't want to add Windows registry fiddling to the
distutils scripts.

reStructuredText is a *form* of plaintext, so the .txt extension is
appropriate.

-- 
David Goodger    goodger@users.sourceforge.net    Open-source projects:
 - Python Docstring Processing System: http://docstring.sourceforge.net
 - reStructuredText: http://structuredtext.sourceforge.net
 - The Go Tools Project: http://gotools.sourceforge.net


From goodger@users.sourceforge.net  Wed Sep 12 03:59:31 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Tue, 11 Sep 2001 22:59:31 -0400
Subject: [Doc-SIG] Producing output
In-Reply-To: <00b601c13aaf$2d6e5da0$f05aa8c0@lslp7o.int.lsl.co.uk>
Message-ID: <B7C435CC.17667%goodger@users.sourceforge.net>

Tony J Ibbs (Tibs) wrote:
> Guido (and others) pointed out that DOM trees can be a bit awkward
> to use, and that Python has a perfectly good object model, so why
> transform the "internal" tree (for you *would* have to be mad to be
> constructing the internal tree in DOM - I know, I tried it) into
> something else rather than using it directly.
> 
> So that's the route being taken (whether David was ever listening to
> what I was saying, and would have gone the XML/DOM route, I don't
> know.

Yes, I was listening to both sides of the discussion. Not having used
DOM before, but seeing that it seemed to fit the bill, I started off
using it. Gave up after about 5 minutes though. It's fine for data
coming in from outside, but not for building document trees
programmatically.

> He's quite sensible, you know...)

Why, thank you!

> (indeed, having that similarity also gives one good leverage in
> *understanding* what the DPS nodes tree is about - simple things
> like the use of tagnames, children, attributes, etc. - terminology,
> I guess - what it *misleads* me about I haven't yet found!)

My background includes 2.5 years of intense SGML work, analyzing
documents, writing DTDs, implementing processing systems, etc. So the
terminology comes naturally to me too.

-- 
David Goodger    goodger@users.sourceforge.net    Open-source projects:
 - Python Docstring Processing System: http://docstring.sourceforge.net
 - reStructuredText: http://structuredtext.sourceforge.net
 - The Go Tools Project: http://gotools.sourceforge.net


From goodger@users.sourceforge.net  Wed Sep 12 03:59:31 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Tue, 11 Sep 2001 22:59:31 -0400
Subject: [Doc-SIG] Producing output
In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5AFFE@UKRUX002.rundc.uk.origin-it.com>
Message-ID: <B7C43551.17666%goodger@users.sourceforge.net>

[Paul, referring to my explanation that XML isn't just one format]
> Um. It wasn't obvious to me. And it still isn't. What are we talking
> about here?

Tony was talking about XML to TeX translators, and ReportLabs PDF
engine taking XML input. I thought people might get the impression
that there is such a thing as a simple "XML to TeX translator", when
there isn't (not without specific support files, anyway). XML itself
provides only the basic building blocks; each application defines its
own set of tags, their grammar and their meaning.

When somebody says "XML" it's not sufficient. They have to specify
their schema or tagset and its grammar, and the semantics too. The
grammar part is done via a DTD (Document Type Definition), which
defines the tags being used and how they relate. The DPS DTD is
provided in the spec directory, in gpdi.dtd, ppdi.dtd, and
soextblx.dtd. We might have a "DPS XML to TeX translator" someday, or
it may be a direct TeX output formatter component for the DPS.

> I know very little about XML, and I didn't see XML as of particular
> relevance for writing an output formatter.

It's not necessarily relevant, you're correct.

> I'm expecting to just be tree-walking a data structure (which may be
> a DOM, but who cares?).

Correct. And when required (or patches submitted), the dps/nodes.py
classes will grow appropriate methods to make this easier.

> Style sheets, namespaces and the like don't fit into my picture at
> all.

Style sheets (XSLT files) are one means of doing translations from XML.
I'm not familiar with XSLT, but the Python standard library has no
XSLT engine yet, so they're not really an option (yet).

> (*Must* get round to reading the code!)

(*Must* get round to writing more docs!)

-- 
David Goodger    goodger@users.sourceforge.net    Open-source projects:
 - Python Docstring Processing System: http://docstring.sourceforge.net
 - reStructuredText: http://structuredtext.sourceforge.net
 - The Go Tools Project: http://gotools.sourceforge.net


From goodger@users.sourceforge.net  Wed Sep 12 04:31:07 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Tue, 11 Sep 2001 23:31:07 -0400
Subject: [Doc-SIG] Re: "docutils"
In-Reply-To: <00b401c13aaa$08e7b9e0$f05aa8c0@lslp7o.int.lsl.co.uk>
Message-ID: <B7C44F3A.17671%goodger@users.sourceforge.net>

[David, in private correspondence]
>> By the way, would you mind if the DPS (or some superset
>> thereof) were to use the name "docutils"?

[Tony, replying]
> By all means.

Great.

> I don't actually think I came up with the name, anyway

Checking the complete Doc-SIG archive... The earliest reference to
"docutils" was by Fred Drake on 2 Dec 1999. The next was indeed the one you
referenced, 27 Nov 2000, from Fred in reply to your "What do we want to
*call* this thing?". It was just after I posted the first draft of
reStructuredText. The earliest reference to "docutil" is in a filename
from the gendoc package, on 23 Jan 1997.

>> I think it's a much more memorable name than "DPS", a mere
>> acronym, and it matches "distutils" nicely. Perhaps
>> "docutils" would be an umbrella package, subsuming the DPS
>> as a backend engine, and exposing a user-friendly collection
>> of tools.
> 
> I think that's a good idea.

Perhaps it's time for a new SourceForge project?

(Only half-joking here.)

> But it does mean I now don't know what to call "pydps" (since "pydoc"
> is already taken).

How about "dps.modes.pythondocstring" or just "dps.modes.docstring" (do you
think anyone will ever implement an Emacs-lisp docstring mode? :-). I think
dps.modes is where much of it will go. Parts may go into a "styles"
subpackage (the one that determines how the raw input gets transformed
stylistically).

-- 
David Goodger    goodger@users.sourceforge.net    Open-source projects:
 - Python Docstring Processing System: http://docstring.sourceforge.net
 - reStructuredText: http://structuredtext.sourceforge.net
 - The Go Tools Project: http://gotools.sourceforge.net


From Paul.Moore@atosorigin.com  Wed Sep 12 08:58:50 2001
From: Paul.Moore@atosorigin.com (Moore, Paul)
Date: Wed, 12 Sep 2001 08:58:50 +0100
Subject: [Doc-SIG] Re: "docutils"
Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5B001@UKRUX002.rundc.uk.origin-it.com>

From: David Goodger [mailto:goodger@users.sourceforge.net]
> How about "dps.modes.pythondocstring" or just 
> "dps.modes.docstring"

One point - while I'm in favour of descriptive names, I'd avoid making them
too long. dps.modes.pythondocstring is a bit too long for my taste. For that
matter, I find dps.parsers.restructuredtext too long, as well (if you use
from...import it's OK, but if you prefer fully qualified names it's a pain).
I'd prefer dps.parsers.rest (or rst, I'll keep out of that fight...) But
it's possibly too late for that.

Paul.


From tony@lsl.co.uk  Wed Sep 12 10:22:45 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Wed, 12 Sep 2001 10:22:45 +0100
Subject: [Doc-SIG] Producing output
In-Reply-To: <B7C435CC.17667%goodger@users.sourceforge.net>
Message-ID: <00c201c13b6c$7b816970$f05aa8c0@lslp7o.int.lsl.co.uk>

David Goodger wrote:
> > He's quite sensible, you know...)
>
> Why, thank you!

It's nothing. It was even meant as a compliment (we need some sense
around here sometimes!)

> My background includes 2.5 years of intense SGML work, analyzing
> documents, writing DTDs, implementing processing systems, etc. So the
> terminology comes naturally to me too.

Whilst understanding that that may have been hard graft, and not
necessarily pleasurable at times, *envy* - I've been "watching" SGML and
related technologies for many years (I remember being told about GML at
university) and never had a chance to play with any of it properly. XML
is now becoming important in our field, but it's still not *document*
processing (except in a very meta sense).

Mind you, now we *are* working with XML, I haven't had any time to play
with it in Python (or much at all programmatically). Humph. I just spend
time referring other people to potentially useful things, and trying to
help create/criticise schemas.

Tibs

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
"How fleeting are all human passions compared with the massive
continuity of ducks." - Dorothy L. Sayers, "Gaudy Night"
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)


From tony@lsl.co.uk  Wed Sep 12 10:22:53 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Wed, 12 Sep 2001 10:22:53 +0100
Subject: [Doc-SIG] Re: "docutils"
In-Reply-To: <B7C44F3A.17671%goodger@users.sourceforge.net>
Message-ID: <00c301c13b6c$800dcf60$f05aa8c0@lslp7o.int.lsl.co.uk>

Me:
> > But it does mean I now don't know what to call "pydps"
> > (since "pydoc" is already taken).

David:
> How about "dps.modes.pythondocstring" or just
> "dps.modes.docstring"

But that doesn't address what to call the command line tool, and it also
doesn't take account of the fact that it's handling much more than the
docstrings...

> (do you think anyone will ever implement an Emacs-lisp
> docstring  mode? :-).

I sincerely hope so. (no smiley at all!)

> I think dps.modes is where much of it will go.

I'll probably (!) defer to yourself on packaging, later on - although it
*might* be that the "suck in a Python package/module and report on it"
stuff should actually be a different module than docutils (was DPS)
itself.

Tibs
--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
"Bounce with the bunny. Strut with the duck.
 Spin with the chickens now - CLUCK CLUCK CLUCK!"
BARNYARD DANCE! by Sandra Boynton
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)


From tony@lsl.co.uk  Wed Sep 12 10:26:36 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Wed, 12 Sep 2001 10:26:36 +0100
Subject: [Doc-SIG] Re: DPS file extensions
In-Reply-To: <B7C432DB.17665%goodger@users.sourceforge.net>
Message-ID: <00c401c13b6d$05160dd0$f05aa8c0@lslp7o.int.lsl.co.uk>

I said:
> > Could I suggest that we coin a standard extension for
> > DPS/reST files?

David replied:
> I suppose that's possible, but I'd rather not. The reason I gave the
> files .txt extensions in the first place was for cross-platform
> compatibility. I don't need extensions on my home machine at all.

I assume on a Mac one doesn't use extensions, and adds them when
transferring to other systems (or interfacing with them).

But I wasn't addressing files on Macs - I was thinking about the systems
*I* use, since that's what I'm familiar with.

> The Mac treats metadata sensibly;

I've never used Mac (well, not to count), but have used RiscOS, VMS and
NeXTStep, as well as Unixes and (early on) a mainframe operating system
at Cambridge which didn't *really* have a directory structure as such -
files and libraries. So I have to be agnostic on how one should treat
metadata (there are obvious problems in having two "files" for every
file, but then VMS had problems with having many files not actually
start at the start of file, so to speak - but then, I *liked* the VMS
approach (about many things, in fact)).

On the other hand, I (me, speaking for myself) would indeed want a
different icon for reST text files on a Mac - and I *do* want such on my
other systems, which means that a different extension is a useful tool.

>  metadata embedded in the filename is a crude hack that MacOS
> avoids (or avoided; reportedly, MacOS X seems to have caved in
> to peer pressure).

Hmm - as I understand it it's following on from NeXT in many ways (which
means that applications are rather neat - a whole directory to stuff
things in!), which means that it's (in some sense) a "Unix". So it
presumably either has to use extensions, or "guess" the file type by
looking inside it.

Having owned a NeXTStation (the Bang and Olufsen approach to computers -
seriously neat - it was also *very* impressive how it assembled out of
the box and, well, just worked when switched on!) I must admit to
hankering after MacOS X a bit.

> But I work on different platforms and I'd much rather I
> (and you!) get a nice text file icon than a blank one.

I thought that nowadays this was seriously not a problem. On Debian the
installer can introduce something, but Unixes are dodgy on this sort of
thing (it's very dependent on how one is *looking* at the desktop and
icons - with KDE, fvwm2, TkDesk, whatever). On Windows it's a simple
(well, ish) matter of telling the system to use an icon and an
application - so yes, that may involve the registry, but so does
everything else!

> We'd all have to teach our systems that '.rest' means '.txt'. Most
> people won't, and I don't want to add Windows registry fiddling to the
> distutils scripts.

Hmm - I bet it happens for some purpose at some time. Anyway, I would
actually hope that docutils will become part of the standard library,
and then the Python installer can do it...

> reStructuredText is a *form* of plaintext, so the .txt extension is
> appropriate.

XML is a form of plaintext, config files are a form of plaintext, *lots*
of things are forms of plaintext. OK, I know I'm being awkward there,
but I seriously *do* want a way of telling, from outside the text file,
that the author intended that it be processable with DPS. And the
standard (as in "historical, on many platforms, over much time" - yes, I
know Macs and RiscOS are different) way of doing that is with an
extension.

Of course, I'm also used to working with various different transfer
formats, many of which are "sort of" text files - so I don't
particularly expect that I will always have icons or registered
applications for .sif, .iff, .ntf, .citf, and so on and so on - but the
use of an extension to indicate what the file *is* is still of absolute
priority. On extension-using filesystems like VMS and Unix (and
relatives) the extensions are for the *user's* convenience in
discriminating amongst files. Who cares if the system knows what they
mean - it's another bit of convenience for the user.

I think that's the gist of what I'm trying to convey - on a Unix or NT
system, the extension conveys truly useful information, whether the
*system* recognises it or not. Discriminations like .latex, .tex, .log,
.toc, .config, .notes, .xml, .xsd, .dat, .err, .faq (some of those may
be familiar!). It's quite clear that *some* of those may be just text
files with a slightly special content. Some of them are also variations
on each other. But the extension lets you *know* that. Otherwise one is
reduced to putting that information in the filename itself, and that's
naff and also fails to establish a convention (it has to be something as
strictly regimented as a PEP for it to work).

I also think that if (although I doubt it) we ever have alternatives to
reST, it would be very awkward if you couldn't tell which file was done
with which system from outside (imagine if someone is maintaining STNG,
STClassic and reST documents! They're close enough at a quick glance at
the text to cause confusion.)

Sorry - this has gone on long enough. And I've probably overstated my
case.

Tibs

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
Give a pedant an inch and they'll take 25.4mm
(once they've established you're talking a post-1959 inch, of course)
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)


From tony@lsl.co.uk  Wed Sep 12 10:38:56 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Wed, 12 Sep 2001 10:38:56 +0100
Subject: [Doc-SIG] Re: "docutils"
In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5B001@UKRUX002.rundc.uk.origin-it.com>
Message-ID: <00c501c13b6e$be038e70$f05aa8c0@lslp7o.int.lsl.co.uk>

Moore, Paul wrote:
> One point - while I'm in favour of descriptive names, I'd
> avoid making them too long.
...
> For that matter, I find dps.parsers.restructuredtext too
> long, as well (if you use from...import it's OK, but if
> you prefer fully qualified names it's a pain).

I agree with Paul that "restructuredtext" is too long - I *do* tend to
like fully qualified names. And it's a pain at the NT command line (less
of a pain on *most* of the Unix systems I use, with file name
completion).

> I'd prefer dps.parsers.rest (or rst, I'll keep out of that
> fight...)

Erm, amiable amusement rather than fight?

.. _fig:
Actually, in the context of a module (particularly if DPS and reST
become subpackages of a docutils package) I couldn't give a fig whether
it is "rest" or "rst" - I like to describe the format *itself* as
"reST", but that's a different issue (I can cope with two different
abbreviations!). (well, tell a lie, I'd *prefer* "rest", but I'd not
fight for it.)

Could I respectfully suggest a radical restructuring, sooner rather than
later (since at the moment we have the number of users under close
control!):

* overall package "docutils"
* subpackage "docutils.dps"
* subpackage "docutils.rest" or "docutils.rst" (see fig_ above)

This *may* answer the question of where to put the "Python source code"
parsing stuff later on - it would naturally go into (for instance)
"docutils.py<some-mnemonic>" (pyreport? pyinfo? pyman? pysillywalk?)

Tibs

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
.. "equal" really means "in some sense the same, but maybe not
.. the sense you were hoping for", or, more succinctly, "is
.. confused with". (Gordon McMillan, Python list, Apr 1998)
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)


From goodger@users.sourceforge.net  Thu Sep 13 03:00:36 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Wed, 12 Sep 2001 22:00:36 -0400
Subject: [Doc-SIG] Re: "docutils"
In-Reply-To: <00c501c13b6e$be038e70$f05aa8c0@lslp7o.int.lsl.co.uk>
Message-ID: <B7C58B83.17767%goodger@users.sourceforge.net>

[David]
> How about "dps.modes.pythondocstring" or just "dps.modes.docstring"

[Paul]
> One point - while I'm in favour of descriptive names, I'd avoid
> making them too long.

The end-user will never see these names, of course.

> I'd prefer dps.parsers.rest (or rst, I'll keep out of that fight...)

No fight, just a discussion with a certain amount of hyperbole and
emphasis thrown in for good measure. We're all quite civil here!
I usually read a message, mull it over for a while, and reread it
before drafting my reply. That tends to stop emotions from clouding
the issues.

> But it's possibly too late for that.

Not too late. Better now than once it's in the standard library.
("In my mind, I use the power of positive visualization...")

[Tony]
> But that doesn't address what to call the command line tool

Who knows... If we're successful, we might even subsume pydoc! At
least, pydoc could be made to utilize the DPS to process its
docstrings.

> and it also doesn't take account of the fact that it's handling much
> more than the docstrings...

How about 'pysource' for the 'extracting from Python source code and
manipulating into useful documentation' mode?

> although it *might* be that the 'suck in a Python package/module and
> report on it' stuff should actually be a different module than
> docutils (was DPS) itself.

A middleman module between compiler.py and DPS would be just fine by
me. But there's still the DPS end of the 'Python source code' mode to
be considered.

[Tony]
> Could I respectfully suggest a radical restructuring, sooner rather
> than later (since at the moment we have the number of users under
> close control!):
> 
> * overall package "docutils"
> * subpackage "docutils.dps"
> * subpackage "docutils.rest" or "docutils.rst"

You're after a flatter package structure. That loses some context
information though; 'dps.parsers.restructuredtext',
'dps.modes.pysource', and 'dps.formatters.html' are obviously
different things. We don't really need the namespace separation provided
by nested packages; it would be easy enough to avoid duplicate names.

The advantage of nested packages is that they might obviate the need
for registering modules/subpackages. The 'languages' subpackage of
restructuredtext allows the import of a language module using a string
argument, like 'en'. There is no lookup table; just drop the code in
place and go. But if we flatten out the structure and ask for a 'WXYZ'
formatter, either it has to be registered or we run the risk of
importing the wrong type of code.
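
To make the "no lookup table" point concrete, here is a minimal sketch
of importing a component by string name. It uses today's ``importlib``
rather than anything in the DPS code, and ``load_component`` is a
hypothetical helper; the standard-library ``encodings`` package merely
stands in for something like the 'languages' subpackage:

```python
import importlib


def load_component(package, name):
    """Import the submodule `name` of `package`, both given as strings.

    The package path supplies the context: asking the 'encodings'
    package for 'utf_8' can only ever return an encodings module.
    """
    return importlib.import_module(package + "." + name)


# 'encodings' is only a stand-in for a hypothetical nested package
# such as 'dps.parsers.restructuredtext.languages'.
codec_module = load_component("encodings", "utf_8")
print(codec_module.__name__)
```

Dropping a new module into the package directory is all the
"registration" this needs; an unknown name simply raises ImportError.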

Definitely time for a new SourceForge project! At least we can reserve
the name and have it point to DPS & reStructuredText. Eventually,
reStructuredText will be subsumed into DPS (or docutils) anyway. Once
we have more of the pieces in place it will be easier to decide how to
arrange them.

> This *may* answer the question of where to put the "Python source
> code" parsing stuff later on - it would naturally go into (for
> instance) "docutils.py<some-mnemonic>" (pyreport? pyinfo? pyman?
> pysillywalk?)

I like 'pysource'.

-- 
David Goodger    goodger@users.sourceforge.net    Open-source projects:
 - Python Docstring Processing System: http://docstring.sourceforge.net
 - reStructuredText: http://structuredtext.sourceforge.net
 - The Go Tools Project: http://gotools.sourceforge.net


From goodger@users.sourceforge.net  Thu Sep 13 03:03:13 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Wed, 12 Sep 2001 22:03:13 -0400
Subject: [Doc-SIG] Re: DPS file extensions
In-Reply-To: <00c401c13b6d$05160dd0$f05aa8c0@lslp7o.int.lsl.co.uk>
Message-ID: <B7C58C20.17768%goodger@users.sourceforge.net>

After reconsidering Tony's original question and the ensuing
discussion:

> Could I suggest that we coin a standard extension for DPS/reST files?

I withdraw my objections to using a filename extension for
reStructuredText files (be it .rst, .rest, .restxt, .rstxt, .rtxt,
.rst.txt, .rest.txt, whatever), but I am loath to *require* such an
extension. As an individual preference, sure, go ahead, but I think
I'll adopt wait-and-see. I'd rather not change the extensions of the
files in */spec/, because people won't know what to make of them,
whereas with .txt it's obvious on first sight.

In Python-source mode (see? more than just docstrings ;-),
__docformat__ contains the name of the markup language. Filename
extensions don't apply here.
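
As a rough illustration, a tool could read that name without importing
the module at all. This is only a sketch using the present-day ``ast``
module; ``get_docformat`` is a hypothetical helper, and treating the
first word of the string as the markup name is my assumption:

```python
import ast


def get_docformat(source, default=None):
    """Return the markup name from a module-level __docformat__, if any."""
    for node in ast.parse(source).body:
        if not isinstance(node, ast.Assign):
            continue
        for target in node.targets:
            if isinstance(target, ast.Name) and target.id == "__docformat__":
                value = node.value
                if isinstance(value, ast.Constant) and isinstance(value.value, str):
                    words = value.value.split()
                    # Assumption: the first word names the markup language.
                    return words[0].lower() if words else default
    return default


print(get_docformat('__docformat__ = "restructuredtext"\n'))
```

Modules that never set ``__docformat__`` fall through to ``default``,
so the caller decides what an unmarked module means.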

When processing standalone plaintext files, the user ought to know
what s/he's got. A specific filename extension would be useful here,
but a pain to the casual Windows and Mac user (no double-clicking
to edit).

-- 
David Goodger    goodger@users.sourceforge.net    Open-source projects:
 - Python Docstring Processing System: http://docstring.sourceforge.net
 - reStructuredText: http://structuredtext.sourceforge.net
 - The Go Tools Project: http://gotools.sourceforge.net


From goodger@users.sourceforge.net  Thu Sep 13 06:00:52 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Thu, 13 Sep 2001 01:00:52 -0400
Subject: [Doc-SIG] DPS DTDs
In-Reply-To: <00b301c13aa8$7ef5cfc0$f05aa8c0@lslp7o.int.lsl.co.uk>
Message-ID: <B7C5B5C3.1778C%goodger@users.sourceforge.net>

Tony J Ibbs (Tibs) wrote:
> I'll try to explain my point of view, since I'm not sure I
> see yours(!).

Thanks, and 'back atcha'. Writing this intro after writing the bulk
below, I think we may simply be looking at this stuff from different
angles, seeing different silhouettes of the same thing.

Distilled, what I'm saying is:

    I see a fundamental difference between an object representing 'a
    module' and an object representing 'a module's documentation'. The
    trees of the different types of objects may resemble each other in
    shape at first, but the nature of the nodes is very different.
    
    The tree resulting from the analysis of Python source (the 'parse
    tree') is specific to the 'Python source' input mode of the DPS,
    and will not be seen outside of this context. Therefore there's no
    reason to codify the schema in the generic DTD; outside of the
    PySource input mode, it isn't useful. This parse tree's schema
    certainly should be documented, possibly as a DTD, but separately
    from the document tree DTD.

Please relax and enjoy this message, safe in the knowledge that it's
just idle discussion. Please don't let me stop you from doing your
thing in your own way. I'm sure it will be useful no matter how things
end up.

> To me it doesn't make sense to have an artificial boundary at any
> level - so a docstring that is not parsed would be <literal> text
> (is that the right tag?),

'literal_block' actually. 'literal' is the inline element.

If you represent docstrings this way, how will you distinguish real
literal_blocks from unparsed raw docstrings?

> Now, in ``dps/specs`` you have ``ppdi.dtd``, which certainly
> *looks* to me as if it is doing the same job as I am doing with my
> <py_xxx> tags - that is, extending the DPS nodes tree "outwards"
> from the docstring into the Python code.

ppdi.dtd is not meant to extend the DPS nodes tree outwards into the
Python code, but to provide specialized elements useful for
*documenting* Python code. It's a subtle distinction but important
IMO. Let's take a simple example::

    # module 'example.py'
    a = 1
    """alphas"""
    b = 2
    """betas"""
    def f(n):
        """Return the f of `n`."""
        return some_expression_involving(n)
    class A:
        """A classy class."""
        def __init__(self):
            """Set up instance attributes."""
            self.count = 0
            """Keep track of things."""

The parse tree might end up looking like this (using indentation to
show structure)::

    [module]
        [name 'example']
        [attribute]
            [name 'a']
            [value 1]
            [docstring """alphas"""]
        [attribute]
            [name 'b']
            [value 2]
            [docstring """betas"""]
        [function]
            [name 'f']
            [parameters]
                [parameter]
                    [name 'n']
            [docstring """Return the f of `n`."""]
        [class]
            [name 'A']
            [docstring """A classy class."""]
            [method]
                [name '__init__']
                [parameters]
                    [self-parameter]
                        [name 'self']
                [docstring """Set up instance attributes."""]
                [attribute]
                    [name 'count']
                    [value 0]
                    [docstring """Keep track of things."""]

This parse tree gets transformed into the following document tree
(again using indentation, so we can omit many end-tags)::

    <document>
        <title>Module <module>example.py</>
        <section>
            <title>Module Attributes
            <module_attribute_section>
                <module_attribute>a
                <initial_value>1
                <paragraph>alphas
            <module_attribute_section>
                <module_attribute>b
                <initial_value>2
                <paragraph>betas
        <section>
            <title>Functions
            <function_section>
                <function>f
                <parameter_list>
                    <parameter_item>
                        <parameter>n
                <paragraph>Return the f of <parameter>n</>.
    etc.

None of the parse tree objects survive intact to the document tree.
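
For the first half of that (source in, parse-tree-like listing out),
here is a minimal sketch. It uses the modern ``ast`` module rather
than compiler.py, the helper names are hypothetical, and it treats a
bare string literal directly after an assignment as that attribute's
docstring, which is the convention the example above assumes:

```python
import ast

SOURCE = '''\
a = 1
"""alphas"""
def f(n):
    """Return the f of `n`."""
    return n
class A:
    """A classy class."""
    def __init__(self):
        """Set up instance attributes."""
        self.count = 0
        """Keep track of things."""
'''


def target_name(target):
    """Render a simple assignment target such as `a` or `self.count`."""
    if isinstance(target, ast.Name):
        return target.id
    if isinstance(target, ast.Attribute) and isinstance(target.value, ast.Name):
        return target.value.id + "." + target.attr
    return "?"


def string_after(body, index):
    """Return the bare string literal following body[index], or None."""
    if index + 1 < len(body):
        nxt = body[index + 1]
        if (isinstance(nxt, ast.Expr) and isinstance(nxt.value, ast.Constant)
                and isinstance(nxt.value.value, str)):
            return nxt.value.value
    return None


def collect(body, depth=0, parent=None, out=None):
    """Flatten `body` into (depth, kind, name, docstring) tuples."""
    if out is None:
        out = []
    for i, node in enumerate(body):
        if isinstance(node, ast.Assign):
            out.append((depth, "attribute", target_name(node.targets[0]),
                        string_after(body, i)))
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            if isinstance(node, ast.ClassDef):
                kind = "class"
            else:
                kind = "method" if parent == "class" else "function"
            out.append((depth, kind, node.name, ast.get_docstring(node)))
            collect(node.body, depth + 1, kind, out)
    return out


for depth, kind, name, doc in collect(ast.parse(SOURCE).body):
    print("    " * depth + "[%s %r] %r" % (kind, name, doc))
```

The tuples carry only what the documentation side needs (kind, name,
docstring); nothing here could rebuild the source, which is rather the
point of keeping the two trees apart.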

The parse tree objects allow us to group together the appropriate
docstrings, and give us further Python-specific information. That
information is then transformed into a DPS nodes tree. If you think
of the original docstrings on the parse tree as 'fruit', then the
collation process is like the fruit growing into trees of their own,
getting nutrients (stuff like attribute names and default values)
from the 'roots'. Think of the roots as the parse tree upside-down.
The trunk of the doc tree meets the top of the parse tree; the
parse tree nourishes and generates the doc tree.

Kinda cool analogy!

(The tree above is just my preliminary idea of what the final DPS
tree should look like for a Python module. For instance, the
'<section><title>Module <module>xxx' could easily become
'<module_section><module>xxx'. In the end, these specialized elements
may disappear, leaving generic sections and titles in their wake.)

(Hmm. Since the .pformat() of DPS trees uses indentation also, we
could omit the end-tags. Would shorten the test data considerably, and
reduce confusion with XML, which is good. I like this. Implementing
it... now.)
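
Roughly what I have in mind, as a sketch only: the ``Node`` class
below is a hypothetical stand-in, not the dps.nodes API, and the exact
layout of the real output may well differ:

```python
class Node:
    """A minimal tree node: a tag name, optional text, and children."""

    def __init__(self, tagname, text="", children=()):
        self.tagname = tagname
        self.text = text
        self.children = list(children)

    def pformat(self, indent="    ", level=0):
        """Return an indented rendering with start-tags only."""
        lines = [indent * level + "<%s>" % self.tagname]
        if self.text:
            lines.append(indent * (level + 1) + self.text)
        for child in self.children:
            # Each child renders itself one level deeper.
            lines.append(child.pformat(indent, level + 1).rstrip("\n"))
        return "\n".join(lines) + "\n"


tree = Node("document", children=[
    Node("section", children=[
        Node("title", "Functions"),
        Node("paragraph", "Return the f of n."),
    ]),
])
print(tree.pformat(), end="")
```

Since nesting is carried entirely by indentation, an end-tag would add
nothing a reader of the test data couldn't already see.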

> My point was simply that I am not, particularly, following that DTD -
> but I obviously (trivially, since I can output XML!) am following
> *some* (virtual) DTD. And it would be nice to write that down (be it
> as DTD or XML Schema or whatever) at some point.

Any tree-shaped data structure (among others) can be represented in
XML and therefore be described by a DTD.

Sure, write it down, even as a DTD if you like, but I don't see it
going into the existing DTD in dps/spec/, because it's not general
enough. It's internal documentation for the pysource mode. *That's*
the point I was trying to make that started this discussion.

Perhaps it's just a question of degree. I'm seeing the tree closer to
the final generic document representation, you're seeing it closer to
the original parse tree. Sound about right?

> (hmm - and I just realised *why* - if the two components (inner and
> outer, for want of better term) are *discontinuous* in structure,
> then it makes it harder to write a Formatter/Writer - it would need
> to know about the Python bits and the docstring bits independently

I don't think the output formatter should ever see any evidence of the
parse tree. (I must explain that I'm seriously considering a fourth
component, the 'style' for lack of a better term, that takes the
output of the input mode and parser and transforms it into the final
doc tree. The input mode and output style may require more than what
dps.nodes provides. The output styles for an input mode may be so
tightly coupled as to be specific to that input mode.) By the time
the doc tree gets to the formatter, it's a simple 'take this
dps.nodes doc tree structure and change it to your native format'. No
serious transformations involved.

Not having even *begun* to implement any of this, I don't know if
this idea is reasonable or feasible.

> > The two types of tree represent fundamentally different
> > information.
> 
> I see we disagree - it's all document (erm - serialisation of
> information).

Yeah, but serialisation of Python code vs. serialisation of
*documentation* of Python code.

> I really think we might be talking past each other,

Probably :-)

> because what I'm doing is so simple and obvious that I find it
> hard to call it hypergeneralisation - I'm not losing anything, and
> I'm gaining quite a lot.
>
> I'm using the compiler parse tree to hold the parse tree, and
> generating documentation (as part of a DPS node tree) from it. That's
> obvious to me. Are we just confusing each other with words?

Could be!

> > In dps/spec/ppdi.dtd you'll see the "Additional Structural
> > Elements"...
> 
> They are clearly a start on holding the information one needs
> to report on in a document. They didn't do enough for me, which is
> why I'm not using them.

Fair enough. It seems to me you're representing an intermediate between the
parse tree and the final document tree. I just don't see the need.

> And you had some, erm, interesting function definitions.

Oh, I see what you mean, ones like this? ::

    def standalone_uri(self, text, lineno, pattern=inline.patterns.uri,
                       whole=inline.groups.uri.whole,
                       email=inline.groups.uri.email):

I was using the 'Stuff' class to hierarchically group related
constants without polluting the namespace. This 'Stuff'
dotted-attribute collection idiom is useful and, I think, successful.
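
For anyone who hasn't met the idiom, a minimal sketch follows. The
class body is how I'd write it from memory, and the URI pattern and
group names are illustrative only, not the parser's real expressions:

```python
import re


class Stuff:
    """Bundle keyword arguments as attributes for dotted access."""

    def __init__(self, **keywordargs):
        self.__dict__.update(keywordargs)


# Illustrative pattern only; not the parser's real URI expression.
inline = Stuff(
    patterns=Stuff(uri=re.compile(r"(?P<whole>(?P<email>\S+@\S+)|\w+://\S+)")),
    groups=Stuff(uri=Stuff(whole="whole", email="email")),
)

match = inline.patterns.uri.search("mail goodger@users.sourceforge.net today")
print(match.group(inline.groups.uri.email))
```

Binding the constants as default arguments, as in the signature above,
just saves the repeated dotted lookups inside the method body. The
standard library's ``types.SimpleNamespace`` now offers the same
attribute-bag behaviour.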

-- 
David Goodger    goodger@users.sourceforge.net    Open-source projects:
 - Python Docstring Processing System: http://docstring.sourceforge.net
 - reStructuredText: http://structuredtext.sourceforge.net
 - The Go Tools Project: http://gotools.sourceforge.net


From tony@lsl.co.uk  Thu Sep 13 11:01:44 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Thu, 13 Sep 2001 11:01:44 +0100
Subject: [Doc-SIG] Re: DPS file extensions
In-Reply-To: <B7C58C20.17768%goodger@users.sourceforge.net>
Message-ID: <00d601c13c3b$1828ab20$f05aa8c0@lslp7o.int.lsl.co.uk>

David Goodger wrote:
> After reconsidering Tony's original question and the ensuing
> discussion:
>
> > Could I suggest that we coin a standard extension for
> > DPS/reST files?
>
> I withdraw my objections to using a filename extension for
> reStructuredText files (be it .rst, .rest, .restxt, .rstxt,
> .rtxt, .rst.txt, .rest.txt, whatever), but I am loath to
> *require* such an extension.

I agree that we shouldn't *require* an extension - sorry if it came over
as if I was being *that* awkward. I would rather like to have an
"extension you use by convention if you're going to use it" in the
documentation - but your list has several interesting possibilities
(although not the ones with two dots, please).

Of those given, I think the ones I lean towards most would be:

.rtxt    -- short, looks like .txt a bit
.restxt  -- a little long, but explicit
.rest    -- probably too unobvious

Hmm - so maybe the first with a side-option of the second...

> I'd rather not change the extensions of the files in */spec/,
> because people won't know what to make of them, whereas with
> .txt it's obvious on first sight.

It's your directory, so (by the decision above) your decision. I
(myself, me) would still err on the side of introducing an extension by
example (since the directory is called "docs", etc., etc.), but this is
the sort of discussion to have over a drink (tea, coffee, cider,
spatlese, whatever) and we're on different sides of an ocean, so it's
probably better left for now!

> In Python-source mode (see? more than just docstrings ;-),
> __docformat__ contains the name of the markup language. Filename
> extensions don't apply here.

Sorry, me leaving out "obvious" stuff again - one of my "supporting
issues" was that in Python source code we already have established a way
of doing this job - but I didn't mention it since I'd already written
enough stuff, and I wasn't sure I could make it make sense.

> When processing standalone plaintext files, the user ought to know
> what s/he's got. A specific filename extension would be useful here,
> but a pain to the casual Windows and Mac user (no double-clicking
> to edit).

Extensions, as we've said, don't apply on Macs, so don't use them (but
isn't it a norm to register new filetypes on Macs so the correct icon
comes up?)

Windows actually makes it very easy to associate a file extension and an
icon - it can be done via the file Explorer Options/File Types
interface. And choosing how to *edit* an unknown extension is a matter
of telling the system once, and asking it to remember it, *if* it hasn't
been set up by the installer (difficult, as what one person uses to edit
text files may not be another person's preference).

And the same concerns *do* apply to a "modern" Unix-oid system, as well,
in these days of Gnome, KDE and other such things.

Anyway, too much text written by me on too little issue...

Tibs

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
Give a pedant an inch and they'll take 25.4mm
(once they've established you're talking a post-1959 inch, of course)
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)


From tony@lsl.co.uk  Thu Sep 13 11:01:40 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Thu, 13 Sep 2001 11:01:40 +0100
Subject: [Doc-SIG] DPS DTDs
In-Reply-To: <B7C5B5C3.1778C%goodger@users.sourceforge.net>
Message-ID: <00d501c13c3b$15f4db30$f05aa8c0@lslp7o.int.lsl.co.uk>

David Goodger wrote:
> Writing this intro after writing the bulk below, I think we may
> simply be looking at this stuff from different angles, seeing
> different silhouettes of the same thing.

I thought so before, and I'm fairly sure so now!


>  I see a fundamental difference between an object representing 'a
>  module' and an object representing 'a module's documentation'. The
>  trees of the different types of objects may resemble each other in
>  shape at first, but the nature of the nodes is very different.

Yes, definitely - that's a good paragraph.

>  The tree resulting from the analysis of Python source (the 'parse
>  tree') is specific to the 'Python source' input mode of the DPS,
>  and will not be seen outside of this context.

Definitely.

At which point, and having read the rest of the email, I *think* I see
what's going on...

Since I'm developing in small chunks of time, and also because I'm
designing as I go along (easy in Python), the tree structure that I
create using DPS nodes is being evolved over time. Because I'm not
always very elegant at naming, all of the elements I'm introducing are
being called "py_xxx" (that is, those are their tagnames - not the same
as their class names).

I haven't considered whether *some* of the "py_xxx" elements are
actually identical (or sufficiently close with modification on one side
or the other) to elements that the DPS nodes module already defines. To
an extent, I don't care - that (for me) would be hypergeneralisation *at
this stage*, particularly since I'm still prepared to radically change
the "outer" tree structure if necessary.

To an extent, I'd address the difference between parse tree and document
tree as that the parse tree allows one to reconstruct the Python code
(more or less!), and this is *not* a need of the documentation tree -
indeed, one may alter the order or structure of the latter to facilitate
documentation purposes and obscure things that the parse tree would
consider important.

Because the requirement is that a DPS node tree be emitted so that
different Formatters (Writers) can use it, one *does* want a DPS node
tree (trivial point, but worth saying).

Because one is dealing with documenting Python, there is *likely* to be
Python-specific stuff in the "outer" parts of the tree (i.e., those bits
that are not inside docstrings).

The alternative would, of course, be to produce a DPS/reST document
*describing* the Python from scratch, and that's an alternative I hadn't
(directly) thought of - for instance::

    <section>
      <title>Python module Fred
      <section>
         <title>Globals
         <ordered_list>
            <item>fred

instead of::

    <py_module name="Fred">
       <py_globals>
          <py_global name="fred">

(the tagnames for both are wrong, but you get the point), but on the
whole I prefer the latter *at this stage* (it's easier to postprocess as
well, if one wants to (for instance) remove methods that start with an
underscore, and want to postpone that decision as late as possible - it
makes sense to me that this might be the sort of thing one wants to
customise in the document tree).
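
A sketch of that sort of late postprocessing, assuming the tree has
been serialised as XML and using ``xml.etree.ElementTree``. The
``py_method`` tagname is as hypothetical as the others, and
``drop_private`` is an illustrative helper, not part of my code:

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    '<py_module name="Fred">'
    '<py_class name="A">'
    '<py_method name="__init__"/>'
    '<py_method name="_helper"/>'
    '<py_method name="run"/>'
    '</py_class>'
    '</py_module>'
)


def drop_private(root, tag="py_method"):
    """Remove `tag` elements whose name starts with an underscore."""
    for parent in list(root.iter()):
        # Copy the child list so removal is safe while looping.
        for child in list(parent):
            if child.tag == tag and child.get("name", "").startswith("_"):
                parent.remove(child)
    return root


drop_private(doc)
print([m.get("name") for m in doc.iter("py_method")])
```

With names held as attributes the filter is a one-line test; in the
generic <section>/<title> form one would have to dig the name out of
the title text first.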

The structure above is what I'm talking about when I talk about
extending DTDs - it *isn't* the parse tree, although I guess it's close
in some ways.

> Please relax and enjoy this message, safe in the knowledge that it's
> just idle discussion. Please don't let me stop you from doing your
> thing in your own way. I'm sure it will be useful no matter how things
> end up.

!!!

I see what I'm doing as prototyping. It would be nice (very nice) if
elements of it (even large chunks of it!) end up in the final product,
but that's not the main point - the main point is to demonstrate that
one *can* do things (always more satisfactory than handwaving), and to
have a reference point to push against (for instance, "ugh, that's
horrible, I can improve on that" - a valuable response).

> If you represent docstrings this way, how will you distinguish real
> literal_blocks from unparsed raw docstrings?

Damn - I hadn't thought of that.

Actually, a simple answer would be::

    <py_docstring parsed="1">
       <literal_block>

but it's undoubtedly better to do::

    <py_docstring format="reST">
       <literal_block>

since we *have* the docstring format "name" around (even if implicitly)
in the Python code. Actually, that last is probably an essential thing
to do.
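
As a rough sketch of how a consumer could then tell the two cases
apart, assuming (this is a guess, not something decided above) that an
unparsed docstring simply carries no ``format`` attribute, and using
plain dicts as hypothetical stand-ins for the real nodes:

```python
def make_docstring_node(text, docformat=None):
    """Build a dict standing in for a <py_docstring> element.

    `docformat` is the name of the format the text was parsed as
    (e.g. "reST"); None means the text was not parsed at all.
    """
    attributes = {}
    if docformat is not None:
        attributes["format"] = docformat
    return {
        "tagname": "py_docstring",
        "attributes": attributes,
        "children": [{"tagname": "literal_block", "text": text}],
    }


def is_unparsed(node):
    """True if the literal_block child is just raw docstring text."""
    return "format" not in node["attributes"]


raw = make_docstring_node("Some docstring text.")
parsed = make_docstring_node("A real literal block.", "reST")
print(is_unparsed(raw), is_unparsed(parsed))
```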

> ppdi.dtd is not meant to extend the DPS nodes tree outwards into the
> Python code, but to provide specialized elements useful for
> *documenting* Python code. It's a subtle distinction but important
> IMO.

Actually, I think it shows exactly the point I've been misexplaining -
what I *want* to say is what you're suggesting I should, I think.

> Let's take a simple example::
...OK...
> The parse tree might end up looking like this (using indentation to
> show structure)::
...OK...
> This parse tree gets transformed into the following document tree
> (again using indentation, so we can omit many end-tags)::
>
>     <document>
>         <title>Module <module>example.py</>
>         <section>
>             <title>Module Attributes
>             <module_attribute_section>
>                 <module_attribute>a
..etc..

Not entirely dissimilar to what I'm actually doing, although details
differ quite a lot.

> None of the parse tree objects survive intact to the document tree.

No, I never wanted to suggest that. Much of the *information* does,
though!

..analogy snipped..

> (The tree above is just my preliminary idea of what the final DPS
> tree should look like for a Python module. For instance, the
> '<section><title>Module <module>xxx' could easily become
> '<module_section><module>xxx'. In the end, these specialized elements
> may disappear, leaving generic sections and titles in their wake.)

Which is sort of what I realised earlier in this reply.

> (Hmm. Since the .pformat() of DPS trees uses indentation also, we
> could omit the end-tags. Would shorten the test data considerably, and
> reduce confusion with XML, which is good. I like this. Implementing
> it... now.)

Indentation is good, end tags are verbose - I agree!

> Perhaps it's just a question of degree. I'm seeing the tree closer to
> the final generic document representation, you're seeing it closer to
> the original parse tree. Sound about right?

Sort of - I think I would say that, at the moment, I'm seeing value in
document elements that represent Python elements more directly (the same
sort of value as having a <booktitle> term (e.g., <booktitle>Jim</>) in
a document about libraries, instead of turning it into the
"standardised" representation 'JIM') - it's *useful* to be able to talk
about a Python module or class *in the document space*.

(ah - that's the insight/comparison I've been striving for - in the same
way that in TeX I prefer to define \book{title} rather than use (e.g.)
{\sc title}, even though in the final output they may *look* the same)

> I must explain that I'm seriously considering a fourth
> component, the 'style' for lack of a better term, that takes the
> output of the input mode and parser and transforms it into the final
> doc tree. The input mode and output style may require more than what
> dps.nodes provides. The output styles for an input mode may be so
> tightly coupled as to be specific to that input mode.

Hmm - so that sounds like the interface that changes my <py_module>
based tree into a "standardised" <section><title> tree - is that right?
(xslt for DPS nodes!)

> > And you had some, erm, interesting function definitions.
>
> Oh, I see what you mean, ones like this? ::
>
>     def standalone_uri(self, text, lineno,
>                        pattern=inline.patterns.uri,
>                        whole=inline.groups.uri.whole,
>                        email=inline.groups.uri.email):

Yep. Perfectly good Python code (if a bit confusing on first sight!),
but it showed me some representation I wasn't handling.

Tibs (trying to agree furiously)

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
"How fleeting are all human passions compared with the massive
continuity of ducks." - Dorothy L. Sayers, "Gaudy Night"
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)


From tony@lsl.co.uk  Thu Sep 13 11:01:58 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Thu, 13 Sep 2001 11:01:58 +0100
Subject: [Doc-SIG] Re: "docutils"
In-Reply-To: <B7C58B83.17767%goodger@users.sourceforge.net>
Message-ID: <00d701c13c3b$206d1000$f05aa8c0@lslp7o.int.lsl.co.uk>

David Goodger wrote:
> Who knows... If we're successful, we might even subsume pydoc! At
> least, pydoc could be made to utilize the DPS to process its
> docstrings.

I was carefully not thinking/saying that! Also, pydoc is a much "lighter
weight" solution (at least as it stood last time I looked), and thus is
valuable for that alone (by "lighter weight" I simply mean that it
doesn't have as many dependencies, it's much smaller/more
self-contained).

> > * overall package "docutils"
> > * subpackage "docutils.dps"
> > * subpackage "docutils.rest" or "docutils.rst"
>
> You're after a flatter package structure. That loses some context
> information though; 'dps.parsers.restructuredtext',
> 'dps.modes.pysource', and 'dps.formatters.html' are obviously
> different things. We don't really need the namespace space provided
> by nested packages; it would be easy enough to avoid duplicate names.

Hmm - I *meant* to be advocating simply taking what is now "dps" and
renaming it to "docutils.dps", and similarly for "restructuredtext" -
which to my mind gives us a *deeper* package structure (albeit by one
level!).

So one would refer to "docutils.dps.parsers.rest" (for instance) if one
is only doing minimal change (well, the external user might not - but
that's why we have __init__.py files).

(I'd assumed a Grand Plan for the current package structures, so didn't
want to contemplate changing them!)

Extra parsers (and I agree it's good to keep that door open, even if we
never use it) could then either slot into docutils directly - so we
might (!) have "docutils.pod" - or would be entirely separate and
require "conscious" registration.

> Definitely time for a new SourceForge project!

Oh, I can tell you're enjoying this!

> At least we can reserve the name and have it point to DPS
> & reStructuredText.

Yes, I think that's sensible.

> Once we have more of the pieces in place it will be easier to
> decide how to arrange them.

Yes - I wasn't expecting instant answers to the suggestion...

> > (pyreport? pyinfo? pyman? pysillywalk?)
>
> I like 'pysource'.

OK - let's tentatively go for that. I *may* hold off on renaming what
I'm working on for the moment, though, until it's a bit mature (not that
I've worked on it this week, ho hum).

Tibs

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
Well we're safe now....thank God we're in a bowling alley.
- Big Bob (J.T. Walsh) in "Pleasantville"
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)


From goodger@users.sourceforge.net  Thu Sep 13 22:58:31 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Thu, 13 Sep 2001 17:58:31 -0400
Subject: [Doc-SIG] Re: DPS file extensions
In-Reply-To: <00d601c13c3b$1828ab20$f05aa8c0@lslp7o.int.lsl.co.uk>
Message-ID: <B7C6A445.17837%goodger@users.sourceforge.net>

Tony J Ibbs (Tibs) wrote:
> Extensions, as we've said, don't apply on Macs, so don't use them
> (but isn't it a norm to register new filetypes on Macs so the
> correct icon comes up?)

There actually is a mechanism on MacOS for cross-platform
compatibility. You associate a filename extension with an existing
application & filetype. It's similar to what you do on Windows. The
default set includes .txt, .html, and a few dozen others.

The point is, though, that dealing with this automatically is a royal
pain, and asking people to do it manually won't fly (*I* wouldn't do
it if asked!). So I'd rather not. Maybe later, but not now.

If somebody were to cook up bulletproof multi-platform (means
Win/Mac/*n*x, not just Win/*n*x) code for this, I would be happy to
add it to the project(s). Maybe there's already support for it in
distutils? But the code would have to be *really* bulletproof.

-- 
David Goodger    goodger@users.sourceforge.net    Open-source projects:
 - Python Docstring Processing System: http://docstring.sourceforge.net
 - reStructuredText: http://structuredtext.sourceforge.net
 - The Go Tools Project: http://gotools.sourceforge.net


From tony@lsl.co.uk  Fri Sep 14 10:01:29 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Fri, 14 Sep 2001 10:01:29 +0100
Subject: [Doc-SIG] Re: DPS file extensions
In-Reply-To: <B7C6A445.17837%goodger@users.sourceforge.net>
Message-ID: <001c01c13cfb$d7ae8080$f05aa8c0@lslp7o.int.lsl.co.uk>

David Goodger wrote:
> There actually is a mechanism on MacOS for cross-platform
> compatibility. You associate a filename extension with an existing
> application & filetype. It's similar to what you do on Windows.

Well, on Windows it's much more optional, since there *are* file types,
and they do still matter (since that's what the registration, if any,
keys off).

> The point is, though, that dealing with this automatically is a royal
> pain, and asking people to do it manually won't fly (*I* wouldn't do
> it if asked!). So I'd rather not. Maybe later, but not now.

Sounds like it should indeed be a definite "leave until later".

> If somebody were to cook up bulletproof multi-platform (means
> Win/Mac/*n*x, not just Win/*n*x) code for this, I would be happy to
> add it to the project(s). Maybe there's already support for it in
> distutils? But the code would have to be *really* bulletproof.

[and that will mean Mac "classic" OS and the new Mac OS as well - I bet
they'll do things differently - and I'd want to hold some support out to
Linux window managers as well - so it *is* a biggish task.]

I half-watch the distutils list, and I don't remember anything there
about registering file types at all. However, it *does* sound like the
sort of thing that distutils *should* be able to do (at least to me). Of
course, in the classic way, volunteers need to be around to do it, and
this actually sounds like it may be something to address after docutils
is finished (!) if no one else has done it. I agree it is *not* urgent!

[in fact, so not urgent that I haven't even bothered to mention it on
the distutils SIG, mainly since we're not volunteering said effort...]

Interestingly, today's message on the distutils SIG is about compiling
on Macs...

Tibs

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
Give a pedant an inch and they'll take 25.4mm
(once they've established you're talking a post-1959 inch, of course)
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)


From usc@ieee.org  Mon Sep 17 23:28:21 2001
From: usc@ieee.org (Ueli Schläpfer)
Date: 18 Sep 2001 00:28:21 +0200
Subject: [Doc-SIG] Producing output
In-Reply-To: "Tony J Ibbs's message of "Mon, 10 Sep 2001 14:37:25 +0100"
References: <00aa01c139fd$b9fb0730$f05aa8c0@lslp7o.int.lsl.co.uk>
Message-ID: <m2n13t4flm.fsf@hobbes.dyn.dhs.org>

Tibs, Paul,

Recently, I started writing short notes in the reST format because I
wanted to get a feeling for how well it works and what it's good for.
Excellent job -- thanks (this obviously includes David Goodger!!)

Up to now, all that was missing was the processing.  Now I've got some
time to spare and started playing around with the DPS.  I found that I
had to make the following modifications to the latest DPS snapshot
from CVS in order to get html output from plain text files:

- `dps2html` wants an `Errorist` class, which I found nowhere and
  assumed that David's `Reporter` would do fine.
- David has used `bibliographic_labels` in his language files,
  whereas, in restructuredtext/restructuredtext/states.py, it's
  `bibliographic_fields`.

In pydps/html.py, there are several references to `element.tagName`
which should probably read `element.tagname`.

The diffs_ are included below.

Ueli


.. _diffs:

---8<------------------------------------------------------------
Index: dps/dps/utils.py
===================================================================
RCS file: /cvsroot/docstring/dps/dps/utils.py,v
retrieving revision 1.6
diff -u -r1.6 utils.py
--- dps/dps/utils.py	2001/09/17 03:54:29	1.6
+++ dps/dps/utils.py	2001/09/17 21:57:10
@@ -1,3 +1,4 @@
+
 #! /usr/bin/env python
 
 """
@@ -66,3 +67,5 @@
         if sourcetext:
             children.append(nodes.literal_block('', sourcetext))
         return self.system_warning(3, children=children)
+
+class Errorist(Reporter): pass
---8<------------------------------------------------------------

---8<------------------------------------------------------------
Index: restructuredtext/restructuredtext/states.py
===================================================================
RCS file: /cvsroot/structuredtext/restructuredtext/restructuredtext/states.py,v
retrieving revision 1.20
diff -u -r1.20 states.py
--- restructuredtext/restructuredtext/states.py	2001/09/17 04:24:20	1.20
+++ restructuredtext/restructuredtext/states.py	2001/09/17 21:57:19
@@ -210,7 +210,7 @@
     def extractbibliographic(self, field_list, title):
         nodelist = []
         remainder = []
-        bibliofields = self.language.bibliographic_fields
+        bibliofields = self.language.bibliographic_labels
         abstract = None
         for field in field_list:
             try:
---8<------------------------------------------------------------

---8<------------------------------------------------------------
--- pydps/html.py	Mon Sep 17 22:57:43 2001
+++ pydps.orig/html.py	Sun Sep  9 20:32:50 2001
@@ -132,7 +132,7 @@
 
         # Hmm - have we been handed a "document" rooted tree,
         # or a DOM-like tree that has "document" as its single child?
-        if document.tagname == "document":
+        if document.tagName == "document":
             self.write_html(document,stream)
         else:
             for element in document:
@@ -187,10 +187,10 @@
         """Write out the HTML representation of `element` on `stream`.
         """
 
-        if element.tagname == "#text":
+        if element.tagName == "#text":
             stream.write(self.escape(element.astext()))
-        elif self.indirect.has_key(element.tagname):
-            value = self.indirect[element.tagname]
+        elif self.indirect.has_key(element.tagName):
+            value = self.indirect[element.tagName]
             if value is None:
                 # Nothing to do with this element - but check its children
                 for node in element:
@@ -215,7 +215,7 @@
         """Write out an element which we don't recognise.
         """
 
-        stream.write("\n<p><font color='red'>&lt;%s"%element.tagname)
+        stream.write("\n<p><font color='red'>&lt;%s"%element.tagName)
         for name,value in element.attlist():
             stream.write(" %s='%s'"%(name,self.escape(value)))
         stream.write("&gt;</font>\n")
@@ -224,7 +224,7 @@
             self.write_html(node,stream)
 
         stream.write("\n<font color='red'>"
-                     "&lt;/%s&gt;</font>\n"%element.tagname)
+                     "&lt;/%s&gt;</font>\n"%element.tagName)
 
     def write_section(self,element,stream):
         """Write a section - i.e., something with a title
@@ -404,7 +404,7 @@
                 "system_warning"  : write_warning,
                 }
     """Entries in this dictionary are all keyed by a DPS element's
-    tagname. The values are either:
+    tagName. The values are either:
 
     * a simple string, representing the HTML tag to use for this
       element
@@ -556,7 +556,7 @@
         stream.write("<li><samp>%s</samp>\n"%element["name"])
         if len(element) > 0:
             for node in element:
-                if node.tagname == "py_docstring":
+                if node.tagName == "py_docstring":
                     self.write_docstring(node,stream)
                 else:
                     self.write_html(node,stream)
---8<------------------------------------------------------------


From goodger@users.sourceforge.net  Tue Sep 18 01:31:06 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Mon, 17 Sep 2001 20:31:06 -0400
Subject: [Doc-SIG] Producing output
In-Reply-To: <m2n13t4flm.fsf@hobbes.dyn.dhs.org>
Message-ID: <B7CC0E09.17A4F%goodger@users.sourceforge.net>

Hi Ueli,

> Excellent job -- thanks (this obviously includes David Goodger!!)

You're welcome!

> - `dps2html` wants an `Errorist` class, which I found nowhere and
>   assumed that David's `Reporter` would do fine.

You're the victim of rapid change. I just checked this change in yesterday;
`Reporter` is the new name. Tony hasn't had a chance to update his code yet.

What is `dps2html`? I haven't seen that one. Perhaps an old copy?

> - David has used `bibliographic_labels` in his language files,
>   whereas, in restructuredtext/restructuredtext/states.py, it's
>   `bibliographic_fields`.

`bibliographic_fields` in dps/parsers/restructuredtext/languages/en.py is
for parsing bibliographic field lists. `bibliographic_labels` in
dps/languages/en.py is not used yet, and may move; it was intended for
output writers to generate labels appropriate for tags. Basically the two
have opposite meanings and uses. `bibliographic_fields` is for converting
``:Author: Kilgore Trout`` to ``<author>Kilgore Trout</author>``.
`bibliographic_labels` is for converting ``<author>Kilgore Trout</author>``
back to ``Author: Kilgore Trout`` or ``Kilgore Trout, Author`` or something
like that.

Inadequately documented, yes.
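
A minimal sketch of the two directions, with illustrative table
contents and helper names (these are assumptions for the example, not
the actual contents of either ``en.py``):

```python
# Parser side: field name as written in the source -> element tag name.
bibliographic_fields = {"author": "author", "date": "date"}

# Writer side: element tag name -> label to show in the output.
bibliographic_labels = {"author": "Author", "date": "Date"}


def field_to_element(line):
    """Convert ':Author: Kilgore Trout' to '<author>Kilgore Trout</author>'."""
    name, _colon, body = line.lstrip(":").partition(":")
    tag = bibliographic_fields[name.strip().lower()]
    return "<%s>%s</%s>" % (tag, body.strip(), tag)


def element_to_label(tag, text):
    """Convert an element back to a labelled line such as 'Author: ...'."""
    return "%s: %s" % (bibliographic_labels[tag], text)


print(field_to_element(":Author: Kilgore Trout"))
print(element_to_label("author", "Kilgore Trout"))
```

Keeping the two tables separate means a writer could use labels in one
language while the parser accepts field names in another.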

> In pydps/html.py, there are several references to `element.tagName`
> which should probably read `element.tagname`.

You must be looking at an old version. Tony's latest has been updated to
this particular change.

You must realize that these projects are being updated almost daily. Many
aspects of the APIs have not settled yet and are subject to change. If
you're interested (and I hope you are!), please consider subscribing to the
specific mailing lists:

- reStructuredText development:
  http://lists.sourceforge.net/lists/listinfo/structuredtext-develop
- reStructuredText CVS checkins:
  http://lists.sourceforge.net/lists/listinfo/structuredtext-checkins
- DPS development:
  http://lists.sourceforge.net/lists/listinfo/docstring-develop
- DPS CVS checkins:
  http://lists.sourceforge.net/lists/listinfo/docstring-checkins

-- 
David Goodger    goodger@users.sourceforge.net    Open-source projects:
 - Python Docstring Processing System: http://docstring.sourceforge.net
 - reStructuredText: http://structuredtext.sourceforge.net
 - The Go Tools Project: http://gotools.sourceforge.net


From tony@lsl.co.uk  Tue Sep 18 10:15:04 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Tue, 18 Sep 2001 10:15:04 +0100
Subject: [Doc-SIG] Producing output
In-Reply-To: <B7CC0E09.17A4F%goodger@users.sourceforge.net>
Message-ID: <002e01c14022$675c1c00$f05aa8c0@lslp7o.int.lsl.co.uk>

David Goodger replied to Ueli Schläpfer:
> > - `dps2html` wants an `Errorist` class, which I found nowhere and
> >   assumed that David's `Reporter` would do fine.
>
> You're the victim of rapid change. I just checked this change
> in yesterday; `Reporter` is the new name. Tony hasn't had a
> chance to update his code yet.

Well, actually I changed it, but I hadn't announced it.

> What is `dps2html`? I haven't seen that one. Perhaps an old copy?

I think he's just using it as a generic term. Certainly the diffs are
with respect to an old version of pydps/html.py.

Anyway, a status report on this for Doc-SIG might not be a bad thing...

I've been working on a package to read Python modules/packages and
produce HTML therefrom. It uses Tools/compiler and dps/reST to do the
hard work. It is in a *very* alpha stage (it doesn't render lots of
things as HTML, just outputs tags in red for them, for instance, and it
definitely doesn't know everything it should about the Python code yet -
especially, it doesn't check the presence or value of ``__docformat__``,
which is naughty of it! Oh, and it has a very garish colour scheme for
the HTML(!)).

Copies are periodically uploaded as:

	http://www.tibsnjoan.co.uk/reST/pydps.tgz

but see the DPS/reST "table of links" on

	http://www.tibsnjoan.co.uk/

for a reliable pointer, as a name change is possible in the reasonably
near future (it may be becoming "pysource").

The latest version (yesterday's) does indeed now know about Reporter
rather than Errorist. Despite being aimed at Python code, it can process
a text DPS file - for instance::

    python pydps/pydps.py --text textfile.rst textfile.html

(for those following these things, significant changes are that the
command line interface now uses getopt (yuck), and I think I've now got
it representing all expression nodes, so it should be able to cope with
any function arguments that get thrown at it.)
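
For readers who have not met getopt, here is a small sketch of parsing
long options of this shape. Only the option names ``--text``,
``--pretty`` and ``--help`` come from this message; the function name
and everything else is an assumption, not the actual pydps code:

```python
import getopt


def parse_args(argv):
    """Split `argv` into a set of long-option names and the file arguments."""
    # No short options (""), three long options that take no value.
    opts, args = getopt.getopt(argv, "", ["text", "pretty", "help"])
    flags = {name.lstrip("-") for name, _value in opts}
    return flags, args


flags, files = parse_args(["--text", "textfile.rst", "textfile.html"])
print(sorted(flags), files)
```

``getopt.getopt`` stops at the first non-option argument, so the file
names come back untouched in ``files``; an unknown option raises
``getopt.GetoptError``.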

If you're trying to learn about what DPS/reST *does*, then the command::

    python pydps/pydps.py --text textfile.rtxt

produces rather nice output (although "quicktest.py" can produce the
same), and if you want to add in the structure of the (current) Python
information (subject to vast change at zero notice, though - I'm afraid
on a project like this, the ability to refactor (gosh, I used to just
call that "change"!) Python code quickly and safely is very useful),
try::

    python pydps/pydps.py --pretty pythonfile.py

There's zero documentation (well, there's a bit if you run it on
itself!) other than the ``--help`` command. It requires one to have
installed the latest versions of dps and restructuredtext, and the
Tools/compiler for your Python (trying to use the wrong one may lead to
odd effects) - all neat things to have around anyway (!). It is not
expected to work on Python before 2.0.

> > In pydps/html.py, there are several references to `element.tagName`
> > which should probably read `element.tagname`.
>
> You must be looking at an old version. Tony's latest has been
> updated to this particular change.

Yes, days ago (maybe even a week or more).

> If you're interested (and I hope you are!),

I'll second that - the more people playing (and *using* counts as
playing for this game!) the better.

Note that I've been announcing significant changes in pydps over on the
docstring-develop list:

	http://lists.sourceforge.net/lists/listinfo/docstring-develop

rather than on the main Doc-SIG (when I remember), partly because the
changes are too frequent (and unfortunately I'm not integrated with the
CVS tree).

Tibs

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
"How fleeting are all human passions compared with the massive
continuity of ducks." - Dorothy L. Sayers, "Gaudy Night"
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)


From goodger@users.sourceforge.net  Tue Sep 18 22:44:19 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Tue, 18 Sep 2001 17:44:19 -0400
Subject: [Doc-SIG] DPS components
Message-ID: <B7CD3872.17A60%goodger@users.sourceforge.net>

I like Tony's name for the outputter: "writer". Generalizing, it fits
my idea of separation of components (see "Modes and Styles" in
http://docstring.sourceforge.net/spec/dps-notes.txt).

We can have "readers" (what I'd called "input modes") that understand
the data source (e.g. "pysource" for Python source files, "pep" for
PEPs, one or more for standalone .rtxt files, "email" is possible,
etc.). Readers understand where the data is coming from, send discrete
"chunks" to the parser, and provide the context to bind the chunks
together into a cohesive whole. Readers also resolve all hyperlinks,
footnote numbers and links, interpreted text links, and anything else
that requires context-sensitive computation.

We have "parsers" for the syntax itself (just "reStructuredText" a.k.a.
"rst" a.k.a. "reST" a.k.a. "rtxt" for now). Parsers don't know or care
anything about the source or destination of the data; they just
analyze their input and produce conformant output.

I'm calling the next set of components "designers" (previously called
"styles"; "designers" is OK until a better term is found; "stylists"?
[#]_). Designers take the output from a reader and transform it.
Content transformations (moving stuff around, grouping, separating) as
well as cosmetic transformations (?) happen here. The output from a
designer is the input of the writer.

"Writers" (formerly "formatters") produce the final output (HTML, XML,
TeX, etc.). Writers merely do translations from one data format to
another; they don't do any content transformations. Their input may be
an augmented form of the current schema, with color and layout
information added.

It appears to me that there will be strong links between readers and
designers, whereas parsers and writers are more independent and
interchangeable.
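
To make the division of labour concrete, here is a toy sketch of the
four stages chained together. Every function and its behaviour is a
hypothetical illustration (paragraphs only, tuples instead of real DPS
nodes), not the actual DPS API:

```python
from html import escape


def reader(source):
    """Know the data source: split it into chunks for the parser."""
    return [chunk.strip() for chunk in source.split("\n\n") if chunk.strip()]


def parser(chunk):
    """Know only the syntax: turn one chunk into a (tagname, text) node."""
    return ("paragraph", " ".join(chunk.split()))


def designer(nodes):
    """Content transformations go here; this one only wraps the nodes."""
    return ("document", list(nodes))


def writer(tree):
    """Translate the tree into an output format, changing no content."""
    _tagname, children = tree
    body = "".join("<p>%s</p>\n" % escape(text) for _tag, text in children)
    return "<html><body>\n%s</body></html>\n" % body


def process(source):
    return writer(designer(parser(chunk) for chunk in reader(source)))


print(process("One\nline.\n\nTwo."))
```

Swapping ``writer`` for, say, a TeX writer would leave the other three
stages untouched, which is the point of keeping writers free of content
transformations.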

Opinions? Comments?

.. [#] You may be able to tell that names for things are very
   important to me when designing a system. When all the pieces of the
   system have the right names, it feels right, and everything falls
   into place. I spend a lot of time searching for the right names. My
   thesaurus and dictionary are prominent in my technical library.

-- 
David Goodger    goodger@users.sourceforge.net    Open-source projects:
 - Python Docstring Processing System: http://docstring.sourceforge.net
 - reStructuredText: http://structuredtext.sourceforge.net
 - The Go Tools Project: http://gotools.sourceforge.net


From tony@lsl.co.uk  Wed Sep 19 10:32:06 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Wed, 19 Sep 2001 10:32:06 +0100
Subject: [Doc-SIG] DPS components
In-Reply-To: <B7CD3872.17A60%goodger@users.sourceforge.net>
Message-ID: <004b01c140ed$f2805cd0$f05aa8c0@lslp7o.int.lsl.co.uk>

David Goodger coined the terms:
> "readers" (what I'd called "input modes") that understand
> the data source

Of course, the "reader" for plain .rst files is *ever so* simple!

By the way, the duality "input mode" -> "reader" is still a natural one,
I think, since I think the user would think about setting an input mode
more than choosing a reader, in many cases.

> "parsers" understand the syntax itself (just "reStructuredText"
> for now).

Hmm - *maybe* we need to be careful to say "reST parser" here, since
there may be other parsers hanging around as part of the "reader".

> "designers" take the output from a reader and transform it.
> Content transformations (moving stuff around, grouping,
> separating) as well as cosmetic transformations (?) happen
> here.

Actually, I think this is the interesting stage to have identified. I
agree it's difficult to think of a name for it.

> "Writers" (formerly "formatters") produce the final output (HTML, XML,
> TeX, etc.). Writers merely do translations from one data format to
> another; they don't do any content transformations. Their input may be
> an augmented form of the current schema, with color and layout
> information added.

Yes, I like these terms (well, the *particular* names aren't as
important to me, but the existence of the terms/concepts is a good
pedagogy, I think).

So I have a *reader* that uses Tools/compiler, I use the *parser* on the
docstrings I find, my *designer* phase takes all of that and produces
something I can pass to the HTML *writer*.

I assume that you are thus hinting that the output of the designer
should be "pure" DPS nodes - that is, using only DPS tree nodes that are
defined in dps/nodes.py, so that *any* DPS writer can be slotted in.

At the moment, my code doesn't work like that - the designer is
producing "extended" nodes, and the writer understands what to do with
them. I *think* that the "pure" approach is probably better, but am a
*little* concerned about how I would convey some of the distinctions
that DPS doesn't discern (and no, I don't have concrete examples yet),
but that are useful in a writer (e.g., I use lots of colours in my
current HTML output, but if I were printing it I would need to move back
to monochrome (it's not fair to assume access to colour printers!)).

> It appears to me that there will be strong links between readers and
> designers, whereas parsers and writers are more independent and
> interchangeable.

Yes, that sounds like an advantage.

The example I would use to think about the flow through the system would
be that of a simple table in the quick reference, which could use a
directive::

    .. quickreftable:: Directives
       :link: http://link-to-text
       ::

         For instance:

           .. graphic:: images/ball1.gif

We would need a plugin for the DPS parser (to understand the content).
The "example" literal text would need to be fed back to the parser
(presumably by the designer phase - neat) to generate the right hand
column of the table. Now, given I want the table header to be in pale
blue, with the word "Directives" in strong italics, and I want the table
body to be split 50/50 between the two columns, with a pale yellow
background, *if* I'm outputting to HTML - how do I do that? Bearing in
mind that if I'm outputting to PDF, I want an entirely different set of
details.

My *suspicion* is that we have three sets of plugins for a directive
(and maybe for other things, but directives are nice and obvious):

1. a plugin for the parser - this just enables it to be read
   "properly", and would be optional for simple directives
   (indeed, I think it's not needed for the above example).
2. a plugin for the designer - this, in my example, does the
   reparsing of the example, and again would be optional.
3. a plugin for the writer - again, only needed in *some*
   cases, but this would allow the user to set a "style"
   for the table or whatever.

Now, in the HTML world, case 3 might merely mean that we set the "name"
for a table, and have appropriate CSS defined to say how to display it.
In TeX we would assume that the name indicated a particular macro. It
*may* be that such a "style name" is all that we *do* need, and that the
writer just needs to be given a lookup table of style names versus
styles. Hmm...
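
A sketch of what such a lookup might look like, one table per writer.
The style name, CSS class and TeX macro below are all made up for
illustration:

```python
# Hypothetical per-writer lookup tables: style name -> writer-specific style.
HTML_STYLES = {"quickreftable": "quickref"}        # CSS class name
TEX_STYLES = {"quickreftable": "\\quickreftable"}  # TeX macro name


def html_open_table(style_name):
    """HTML writer: emit a class and let the stylesheet choose the colours."""
    css_class = HTML_STYLES.get(style_name)
    if css_class is None:
        return "<table>"
    return '<table class="%s">' % css_class


def tex_table_macro(style_name):
    """TeX writer: map the same style name to a macro, or None if unknown."""
    return TEX_STYLES.get(style_name)


print(html_open_table("quickreftable"))
print(tex_table_macro("quickreftable"))
```

The document tree would then only need to carry the style name; each
writer decides for itself what, if anything, that name means.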

Despite the witterings above, I'm not too concerned about this as yet -
I have the feeling that it will all come clear in the attempt to
implement a "clean" system [1]_.

Tibs

.. [1] Having done a little bit of Tai Chi, including some
   "pushing hands", I tend to think of trying to solve this
   sort of problem as akin to finding one's partner's balance
   point, the point where one pushes gently and they go flying
   off into the distance. And in pushing hands, that often
   involved "feeling" around their centre until one found it
   (or they found yours!). So if one keeps being flexible and
   feeling around for the centre of the problem, one will
   eventually cause it to "fall over" in the natural and, in
   hindsight, obvious manner.

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
.. "equal" really means "in some sense the same, but maybe not
.. the sense you were hoping for", or, more succinctly, "is
.. confused with". (Gordon McMillan, Python list, Apr 1998)
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)


From garth@deadlybloodyserious.com  Wed Sep 19 11:56:06 2001
From: garth@deadlybloodyserious.com (Garth Kidd)
Date: Wed, 19 Sep 2001 20:56:06 +1000
Subject: [Doc-SIG] DPS components
References: <004b01c140ed$f2805cd0$f05aa8c0@lslp7o.int.lsl.co.uk>
Message-ID: <00ca01c140f9$b0289ad0$8301010a@beast2k>

Is there any problem I've missed that prevents us from spitting out plain
XML and using transforms to convert it to XHTML? :)  ::

    Reader-parser-[optional transformer]-writer

Transformers take a DPS tree and spit out another DPS tree, right?

Is the intent something along the lines of the following? ::

    Writer.write(Transformer.transform(Parser.parse(Reader())))


From fdrake@acm.org  Wed Sep 19 13:50:10 2001
From: fdrake@acm.org (Fred L. Drake)
Date: Wed, 19 Sep 2001 08:50:10 -0400 (EDT)
Subject: [Doc-SIG] [development doc updates]
Message-ID: <20010919125010.D91A028845@beowolf.digicool.com>

The development version of the documentation has been updated:

    http://python.sourceforge.net/devel-docs/

Miscellaneous minor updates, including docs for the new codec interfaces.


From tony@lsl.co.uk  Wed Sep 19 13:55:55 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Wed, 19 Sep 2001 13:55:55 +0100
Subject: [Doc-SIG] DPS components
In-Reply-To: <00ca01c140f9$b0289ad0$8301010a@beast2k>
Message-ID: <005001c1410a$6b83e950$f05aa8c0@lslp7o.int.lsl.co.uk>

Garth Kidd wrote:
> Is there any problem I've missed that prevents us from
> spitting out plain XML and using transforms to convert
> it to XHTML? :)  ::

I would certainly imagine that would be possible, assuming (as I do)
that all of the DPS node tree information gets dumped as XML
elements/attributes. Of course, for most people it might not be a very
user-friendly mechanism...

>     Reader-parser-[optional transformer]-writer
>
> Transformers take a DPS tree and spit out another DPS
> tree, right?

Hmm - essentially (although I think we need to discriminate between
"pure" DPS tree (or "simple"?) which just uses the DPS defined nodes,
and "extended" DPS tree, which also uses application-specific nodes).

I *think* that the parser may be emitting an "extended" tree, but that
David's intent is that the input to the *writer* should be a "standard"
or "pure" tree. So one could define a transformer as the entity that
renders an extended DPS tree into a standard DPS tree - this makes it
clear that it is *very* optional if one *has* a standard tree already.
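
To make the "extended tree in, standard tree out" definition concrete, here is a rough sketch. ``Node``, ``STANDARD`` and ``REWRITES`` are hypothetical stand-ins, not the real dps.nodes classes, and a real transformer would do more than rename nodes:

```python
class Node:
    """Stand-in for a DPS tree node (hypothetical, not dps.nodes)."""

    def __init__(self, name, children=(), text=""):
        self.name = name
        self.children = list(children)
        self.text = text


# Assumed set of "pure" node names that any writer understands.
STANDARD = {"document", "section", "paragraph", "literal_block", "table"}

# Hypothetical rewrite rules: extended node name -> standard node name.
REWRITES = {"quickreftable": "table"}


def to_standard(node):
    """Return a copy of the tree that uses only standard node names."""
    name = REWRITES.get(node.name, node.name)
    if name not in STANDARD:
        raise ValueError("no rule for extended node %r" % node.name)
    return Node(name, [to_standard(child) for child in node.children],
                node.text)
```

If the tree is already standard, ``to_standard`` returns an equivalent copy, which is the sense in which the step is optional.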

Of course, the other reason one might want a transformer is to amend the
tree in some manner - for instance, it seems to me that the transformer
is what would sort out intra-document references...

> Is the intent something along the lines of the following? ::
>
>   Writer.write(Transformer.transform(Parser.parse(Reader())))

Yuck! But yes, I would imagine that if one is willing to accept all the
defaults, one might want to do that.

Tibs



From goodger@users.sourceforge.net  Thu Sep 20 05:12:53 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Thu, 20 Sep 2001 00:12:53 -0400
Subject: [Doc-SIG] DPS components
In-Reply-To: <005001c1410a$6b83e950$f05aa8c0@lslp7o.int.lsl.co.uk>
Message-ID: <B7CEE504.17B78%goodger@users.sourceforge.net>

I forgot all about the fifth component: "filers" (formerly "output
management"). Filers exist for each method of storing the results of
processing:

- In a single file on disk.
- In a tree of directories and files on disk.
- In a single tree-shaped data structure in memory.
- In a tree of data structures in memory. (Maybe.)

As opposed to readers, parsers, designers, and writers, I see only a
small number of filers; namely, those listed above.

[Tony]
> Of course, the "reader" for plain .rst files is *ever so* simple!

It is the simplest reader, but it still has to do some work, like
hyperlink resolution & footnote numbering. On my to-do-soon list.

[Tony re "designer"]
> Actually, I think this is the interesting stage to have identified.
> I agree it's difficult to think of a name for it.

Rummaging through my dusty brain, I've come up with several
alternatives to "designer" and "transformer": collator, integrator,
interpreter, synthesist. I think "synthesist" is most apt. However,
the point may be moot; because...

Having just scribbled a diagram (ascii art below), I think the
"synthesist" (transformer/designer) is so tightly coupled to the
reader that it becomes an internal implementation detail::

           +--------+                +-------+
           | reader | -------------> | filer |
           +--------+                +-------+
             /   \\                      |
            /     \\                     |
           /       \\                    |
    +--------+    +------------+     +--------+
    | parser |    | synthesist |     | writer |
    +--------+    +------------+     +--------+

(Double lines denote tight coupling, single are loose.)

I can see a variety of synthesists for Python source readers, but
other input types (.rtxt file, PEP, email, etc.) won't need them.

[Tony]
> I assume that you are thus hinting that the output of the designer
> should be "pure" DPS nodes - that is, using only DPS tree nodes that
> are defined in dps/nodes.py, so that *any* DPS writer can be slotted
> in.

Yes, exactly. A generic document is produced and handed over to the
filer & writer.

> At the moment, my code doesn't work like that - the designer is
> producing "extended" nodes, and the writer understands what to do
> with them.

If we follow the diagram above, this goes away. The "pure" document
tree is used between reader, parser, filer, and writer; but the
synthesist is local to the reader and they can share any private
structure they like. Of course, it would be useful for that
structure to be comprehensive and well-documented.

> The example I would use to think about the flow through the system
> would be that of a simple table in the quick reference, which could
> use a directive::
>
>     .. quickreftable:: Directives
>        :link: http://link-to-text
>        ::
>
>          For instance:
>
>            .. graphic:: images/ball1.gif

I would rewrite that as::

    .. quickreftable:: Directives (http://link-to-text)

       For instance:

       .. image:: images/ball1.gif

Decide on a structure for the ``quickreftable`` directive. It can do
with its contents what it likes, including duplicating, parsing,
whatever. Be creative! Check out the directives I've built for
admonitions and images (image [not graphic] and figure).

> Now, given I want the table header to be in pale blue, with the word
> "Directives" in strong italics, and I want the table body to be
> split 50/50 between the two columns, with a pale yellow background,
> *if* I'm outputting to HTML - how do I do that? Bearing in mind that
> if I'm outputting to PDF, I want an entirely different set of
> details.

Style sheets would be useful for that. HTML has them, and a PDF
generator might too. Or writers might have their own collections of
style modules.

> My *suspicion* is that we have three sets of plugins for a directive

Whoa -- too complex. Remember, directives are a parser construct.
They're used to get around the limited syntax. But what comes out of
the parser should have proper structure, not just 'directives'. The
duplication and re-parsing should be done by the directive code being
run by the original parser. If anything else needs to be done, it
should be triggered by the specific element(s) produced by the parser.

> Despite the witterings above, I'm not too concerned about this as
> yet - I have the feeling that it will all come clear in the attempt
> to implement a "clean" system[1]_.

Agreed. But it does help to bash ideas around.

[Garth]
> Is there any problem I've missed that prevents us from spitting out
> plain XML and using transforms to convert it to XHTML? :)

[Tony]
> I would certainly imagine that would be possible, assuming (as I do)
> that all of the DPS node tree information gets dumped as XML
> elements/attributes.

Correct assumption.

Nothing prevents XML->XHTML as Garth describes. Simply use the XML
writer and XSLT style sheets for the transformations. (Remi Bertholet
sent me .xsl and .css files; I'll make them available soon). The
problem with that approach is that you need software that understands
the style sheets. Certain versions of certain browsers do, but that's
not good enough for the general case. If there was an XSLT module in
the standard library, we could use it. Until then, we have to be able
to produce real HTML.

[Garth]
> Transformers take a DPS tree and spit out another DPS tree, right?

Correct, modulo the discussion above.

[Garth]
> Is the intent something along the lines of the following? ::
>
>     Writer.write(Transformer.transform(Parser.parse(Reader())))

Perhaps more like::

    Filer.file(Reader.read(inputref, Parser, Synthesist), Writer)

IOW we pass a parser class (or instance) in to the reader because the
parser might be called repeatedly for each doclet (actually, the
reader might auto-detect the markup format & load the parser itself).
The presence of a Synthesist class/instance would depend on the
reader. Same for filers: we pass the writer class/instance in since
it may be used for multiple document fragments.
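
A toy sketch of that call shape, under heavy assumptions: every class below is a hypothetical stand-in, the "input reference" is simply a list of doclet strings, and the filer is the in-memory kind that just returns the writer's output:

```python
class Parser:
    def parse(self, text):
        # Stand-in parser: one "paragraph" per non-blank line.
        return [line.strip() for line in text.splitlines() if line.strip()]


class Reader:
    def read(self, doclets, parser, synthesist=None):
        # The parser is passed in because it is called once per doclet.
        trees = [parser.parse(doclet) for doclet in doclets]
        if synthesist is not None:
            return synthesist(trees)
        # No synthesist: just concatenate the parsed doclets in order.
        return [para for tree in trees for para in tree]


class Writer:
    def write(self, document):
        return "\n".join("<p>%s</p>" % para for para in document)


class Filer:
    def file(self, document, writer):
        # In-memory flavour: hand the document to the writer, return a string.
        return writer.write(document)


html = Filer().file(Reader().read(["one\n\ntwo", "three"], Parser()), Writer())
```

Passing instances in, rather than nesting calls, is what lets the reader invoke the parser as many times as it needs.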

[Tony]
> Of course, the other reason one might want a transformer is to amend
> the tree in some manner - for instance, it seems to me that the
> transformer is what would sort out intra-document references...

Good point. Something to consider when we actually tackle such beasts.

-- 
David Goodger    goodger@users.sourceforge.net    Open-source projects:
 - Python Docstring Processing System: http://docstring.sourceforge.net
 - reStructuredText: http://structuredtext.sourceforge.net
 - The Go Tools Project: http://gotools.sourceforge.net


From garth@deadlybloodyserious.com  Thu Sep 20 05:22:32 2001
From: garth@deadlybloodyserious.com (Garth T Kidd)
Date: Thu, 20 Sep 2001 14:22:32 +1000
Subject: [Doc-SIG] DPS components
In-Reply-To: <B7CEE504.17B78%goodger@users.sourceforge.net>
Message-ID: <NBBBIJGOIKKLHHFHILDNOEOPKCAA.garth@deadlybloodyserious.com>

> I forgot all about the fifth component: "filers" (formerly "output
> management"). Filers exist for each method of storing the results of
> processing:
>
> - In a single file on disk.
> - In a tree of directories and files on disk.
> - In a single tree-shaped data structure in memory.
> - In a tree of data structures in memory. (Maybe.)
>
> As opposed to readers, parsers, designers, and writers, I see only a
> small number of filers; namely, those listed above.

Could another word for 'designer' be 'filter'?

> [Tony]
> > Of course, the "reader" for plain .rst files is *ever so* simple!
>
> It is the simplest reader, but it still has to do some work, like
> hyperlink resolution & footnote numbering. On my to-do-soon list.

Oh, hang on. I thought the reader just passed the document to the parser
and the designer did the hyperlink resolution and footnote numbering?

> Nothing prevents XML->XHTML as Garth describes. Simply use the XML
> writer and XSLT style sheets for the transformations. (Remi Bertholet
> sent me .xsl and .css files; I'll make them available soon).

Ooh! Just what I've been waiting for! <jumps around eagerly>

> Perhaps more like::
>
>     Filer.file(Reader.read(inputref, Parser, Synthesist), Writer)

I'm getting completely lost there, I have to admit. I'm sure you have
excellent reasons for making it that complex, but you should be prepared
to write an IBG Dummy's Guide to Using the RST Parser in Your Simple
Python Applications to explain it to the rest of us. :)

> IOW we pass a parser class (or instance) in to the reader because the
> parser might be called repeatedly for each doclet

... such as when the contents of a directive are re-parsed so that we
have a directive containing marked up ReST?

Regards,
Garth.


From tony@lsl.co.uk  Thu Sep 20 10:28:01 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Thu, 20 Sep 2001 10:28:01 +0100
Subject: [Doc-SIG] DPS components
In-Reply-To: <B7CEE504.17B78%goodger@users.sourceforge.net>
Message-ID: <005a01c141b6$8af0ba50$f05aa8c0@lslp7o.int.lsl.co.uk>

David Goodger wrote:
> I forgot all about the fifth component: "filers" (formerly "output
> management").

Hmm - I don't think I'd bother to separate them out at all, personally.

> [Tony]
> > Of course, the "reader" for plain .rst files is *ever so* simple!
>
> It is the simplest reader, but it still has to do some work, like
> hyperlink resolution & footnote numbering. On my to-do-soon list.

As obviously Garth also did, I was assuming that the "reader" simply
read in the data and then passed (the DPS bits of it) on to the parser
to make sense of - i.e.::

   +--------+         +------------+      +--------+
   | reader | ------> | synthesist | ---> | writer |
   +--------+ \     > +------------+      +--------+
               \   /
                v /
            +--------+
            | parser |
            +--------+

That's why I said the reader was so simple for plain DPS texts - it has
nothing to do but actually read in the data and pass it to the parser -
nothing goes along that top route in that model.

Your diagram::

>
>            +--------+                +-------+
>            | reader | -------------> | filer |
>            +--------+                +-------+
>              /   \\                      |
>             /     \\                     |
>            /       \\                    |
>     +--------+    +------------+     +--------+
>     | parser |    | synthesist |     | writer |
>     +--------+    +------------+     +--------+

is a bit different in form, of course - the "reader" is more powerful.
On the whole, apart from the fact I'd elide the "filer", I think your
diagram is better.

> Having just scribbled a diagram [now above], I think the
> "synthesist" (transformer/designer) is so tightly coupled to the
> reader that it becomes an internal implementation detail::

Hmm. Maybe. But (unless your parser is going to do it all for me) don't
forget that there are still transforms to be done on the DPS node tree
for "pure" reST texts - handling references, for instance. So either we
need to be able to pass down multiple synthesisers, or (more likely) one
synthesiser needs to be able to invoke others.
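
One way to picture "one synthesiser invoking others" is a chain object that is itself a synthesiser. This is only a sketch; ``Chain`` and the example steps are invented for illustration:

```python
class Chain:
    """A synthesiser that runs other synthesisers in order (hypothetical)."""

    def __init__(self, *steps):
        self.steps = steps

    def __call__(self, tree):
        # Each step takes a tree and returns a (possibly new) tree.
        for step in self.steps:
            tree = step(tree)
        return tree


def resolve_references(tree):
    # Stand-in step: mark the tree as having had its references handled.
    return tree + ["references resolved"]


def number_footnotes(tree):
    # Stand-in step: mark the tree as having had its footnotes numbered.
    return tree + ["footnotes numbered"]


synthesiser = Chain(resolve_references, number_footnotes)
```

Because a ``Chain`` has the same call signature as a single step, the reader only ever needs to be handed one synthesiser.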

> I can see a variety of synthesists for Python source readers, but
> other input types (.rtxt file, PEP, email, etc.) won't need them.

OK - maybe you are doing all the work.

*But* what about a synthesist component to (for instance) generate a
contents list? In some cases that would be done via a directive (i.e.,
the author decides to "provide" it), but in other cases the person
transforming the document may decide that they want a contents
inserting. That sounds like a synthesis job to me, not a writer job...

> [Tony]
> > I assume that you are thus hinting that the output of the designer
> > should be "pure" DPS nodes - that is, using only DPS tree nodes that
> > are defined in dps/nodes.py, so that *any* DPS writer can be slotted
> > in.
>
> Yes, exactly. A generic document is produced and handed over to the
> filer & writer.

I've got the refactoring bug now, I'm afraid, 'cos that idea is becoming
increasingly attractive. I may give in and make the next thing I work on
a refactoring of pydps so that it *does* generate a "pure" DPS node tree
as the input to the writer. It would be a useful clarification of
concept, I think. And interesting to see how easy it is for me to still
generate my, erm, colourful HTML.

> > The example I would use to think about the flow through the system
> > would be that of a simple table in the quick reference, which could
> > use a directive::
> >
> >     .. quickreftable:: Directives
> >        :link: http://link-to-text
> >        ::
> >
> >          For instance:
> >
> >            .. graphic:: images/ball1.gif
>
> I would rewrite that as::
>
>     .. quickreftable:: Directives (http://link-to-text)
>
>        For instance:
>
>        .. image:: images/ball1.gif

Ah - but the above doesn't fail as gracefully (imagine if I want to be
able to have invalid constructs in my example, and the poor person
trying to format my text doesn't have the right plugin). Also, whilst I
thought about folding the link in as you've done, I didn't, to allow for
a link in parentheses to appear in the title, if I so wished (yep, being
awkward again). (In fact, my *first* writing of the directive was
essentially identical to yours.)

> > Now, given I want the table header to be in pale blue, with the word
> > "Directives" in strong italics, and I want the table body to be
> > split 50/50 between the two columns, with a pale yellow background,
> > *if* I'm outputting to HTML - how do I do that? Bearing in mind that
> > if I'm outputting to PDF, I want an entirely different set of
> > details.
>
> Style sheets would be useful for that. HTML has them, and a PDF
> generator might too. Or writers might have their own collections of
> style modules.

I thought I mentioned CSS. Although there is still a serious requirement
to be able to cope with older browsers that don't support such. I had
imagined that one might have writers having a variety of options on how
to treat styles - some producing CSS directives, others embedded HTML,
others being very simple for use by the visually impaired, etc.

> [Garth]
> > Is the intent something along the lines of the following? ::
> >
> >     Writer.write(Transformer.transform(Parser.parse(Reader())))
>
> Perhaps more like::
>
>     Filer.file(Reader.read(inputref, Parser, Synthesist), Writer)

You're both *way* too concise! What's wrong with some well-named
intermediate variables, and some comments?

Given I still don't see the need for a separate filer (so I'll ignore it
for now), I would see that as being shown to the masses as more like::

    # Reader takes an input stream. An alternative
    # might be FileReader, which takes a filename...
    reader = docutils.pysource.Reader()
    parser = docutils.parser.reST()
    # As you say above, the synthesiser might be
    # "assumed" by the reader in the default case...
    synthesiser = docutils.pysource.Synthesiser()

    reader.language = "en"
    synthesiser.style = "fancy"
    synthesiser.use_tables = 0

    instream = open("c:/reST/example.rtxt")
    try:
        document = reader(instream,parser,synthesiser)
    finally:
        instream.close()

    writer = docutils.writer.HTML()
    # Output as a single file.
    writer(document,file="c:/HTML/example.html")

    # Output as a directory structure.
    # Split out pages at header level 2.
    # (a similar facility in a TeX writer would
    # allow us to do slides...)
    writer(document,directory="c:/HTML/example/",
           index="index.htm",splitlevel=2)

Ah - I've worked out why I don't see the need for Filer now.

Looking at your four examples, the first two (single file or multi-file)
are likely candidates for HTML output (for instance) - but if so, the
HTML writer needs to know what it is doing, it can't be left up to an
external "body" to do it (since the HTML writer needs to insert
appropriate links, generate an index file, etc., in the multiple file
case).

Similarly, it seems to me that the production of tree structures in
memory would be the result of a specific writer (for instance, a DOM
writer - hmm, odd concept). In fact, I'm not even sure I'd call that a
"Writer" - I think I'd think of it as an output synthesiser(!).

Anyway, regardless of details, I still think we're making Big Steps in
understanding the problem (and you might convince me about Filers yet, I
suppose!).

Tibs

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
Give a pedant an inch and they'll take 25.4mm
(once they've established you're talking a post-1959 inch, of course)
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)


From goodger@users.sourceforge.net  Fri Sep 21 04:29:59 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Thu, 20 Sep 2001 23:29:59 -0400
Subject: [Doc-SIG] DPS components
In-Reply-To: <NBBBIJGOIKKLHHFHILDNOEOPKCAA.garth@deadlybloodyserious.com>
Message-ID: <B7D02C76.17BFC%goodger@users.sourceforge.net>

Garth T Kidd wrote:
> Could another word for 'designer' be 'filter'?

It does transformations and augmentations to the data, not just
filtering as in reduction. What I originally meant by the
'designer' was literally something that took the parsed document
trees and did something almost artistic.

> Oh, hang on. I thought the reader just passed the document to the
> parser and the designer did the hyperlink resolution and footnote
> numbering?

Trying to rationalize away the need for a designer/synthesist in
contexts other than Python source, I lumped those functions into the
reader. Maybe there should be a universal 'linker' between the input
side (reader & parser) and the output side (filer & writer)::

    +--------+      +--------+      +-------+
    | reader | ---> | linker | ---> | filer |
    +--------+      +--------+      +-------+
        |                               |
        |                               |
        |                               |
    +--------+                      +--------+
    | parser |                      | writer |
    +--------+                      +--------+

Or just a bunch of functions available for the reader to use if it
so desires.

> I'm getting completely lost there, I have to admit. I'm sure you
> have excellent reasons for making it that complex, but you should be
> prepared to write an IBG Dummy's Guide to Using the RST Parser in
> Your Simple Python Applications to explain it to the rest of us. :)

Fear not, this is just brainstorming, not final decisions. My own
brain is too puny to deal with complex issues for long so the end
product cannot be that mind-warping. I can deal with at most one
complex issue at a time, which is why I'll continue to do a lot of
refactoring of the reStructuredText parser: I'd like to be able to
understand what it's doing once its complexities have fled these
cramped confines.

I'm trying to form a good model of how this thing will work in a
generic and flexible way. Following the XP way, though, I won't be
coding for the general case before it's necessary.

> > IOW we pass a parser class (or instance) in to the reader because
> > the parser might be called repeatedly for each doclet
>
> ... such as when the contents of a directive are re-parsed so that
> we have a directive containing marked up ReST?

No, what I meant was that the parser will be used multiple times,
called once for each docstring in a module or package, instead of just
once overall (which is what ``Parser.parse(Reader())`` implies). The
integration of documents happens after they're all parsed
individually.
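
A small sketch of "one parser call per docstring", using the standard ``ast`` module to find the docstrings. The ``parse`` function is a placeholder for the real parser, and ``read_module`` is a made-up name:

```python
import ast


def docstrings(source):
    """Yield (name, docstring) for a module and its functions and classes."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.Module, ast.FunctionDef,
                             ast.AsyncFunctionDef, ast.ClassDef)):
            doc = ast.get_docstring(node)
            if doc is not None:
                yield getattr(node, "name", "<module>"), doc


def parse(text):
    # Placeholder for the reStructuredText parser: split into words.
    return text.split()


def read_module(source):
    # parse() runs once per docstring; integration would happen afterwards.
    return {name: parse(doc) for name, doc in docstrings(source)}
```

The dictionary of per-docstring results is what a later integration step would stitch into a single document.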



From goodger@users.sourceforge.net  Fri Sep 21 04:31:17 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Thu, 20 Sep 2001 23:31:17 -0400
Subject: [Doc-SIG] DPS components
In-Reply-To: <005a01c141b6$8af0ba50$f05aa8c0@lslp7o.int.lsl.co.uk>
Message-ID: <B7D02CC4.17BFC%goodger@users.sourceforge.net>

Tony J Ibbs (Tibs) wrote:
> *But* what about a synthesist component to (for instance) generate
> a contents list? In some cases that would be done via a directive
> (i.e., the author decides to "provide" it), but in other cases the
> person transforming the document may decide that they want a
> contents inserting. That sounds like a synthesis job to me, not a
> writer job...

Good point. TOC generation itself is an example of what I'd call a
filter. Index generation too. I don't know where to put those types
of document augmentation steps (putting the TOC or index back into
the document).
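
For what it's worth, here is one possible shape for a TOC step that puts its result back into the document. The dict-based tree and the ``contents`` node type are assumptions made for this sketch, not DPS structures:

```python
def build_toc(document):
    """Return (level, title) pairs for every section, in document order."""
    toc = []

    def visit(node, level):
        for child in node.get("children", []):
            if child["type"] == "section":
                toc.append((level, child["title"]))
                visit(child, level + 1)

    visit(document, 1)
    return toc


def insert_toc(document):
    """Return a new document with a 'contents' node placed first."""
    contents = {"type": "contents", "entries": build_toc(document)}
    return dict(document, children=[contents] + document.get("children", []))
```

``insert_toc`` builds a new top-level node instead of mutating its input, so the step can be skipped or repeated without side effects.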

> I've got the refactoring bug now, I'm afraid

Glad to hear it.

> I may give in and make the next thing I work on a refactoring of
> pydps so that it *does* generate a "pure" DPS node tree as the input
> to the writer.

Please note that the tree described by dps.nodes & gpdi.dtd is
probably not sufficient for the writer to do a proper job. Extensions
to the elements (layout and formatting attributes) may be needed.

(Maybe we need 'stylists' after all! ;-)

> > I would rewrite that as::
> >
> >     .. quickreftable:: Directives (http://link-to-text)
> >
> >        For instance:
> >
> >        .. image:: images/ball1.gif
>
> Ah - but the above doesn't fail as gracefully (imagine if I want to
> be able to have invalid constructs in my example, and the poor
> person trying to format my text doesn't have the right plugin).

It's up to the directive to parse its contents. No parsing is done
automatically. If the 'quickreftable' directive isn't there, a warning
will be generated with the directive source as a literal block. If the
directive parses part of its contents normally, and *that* contains an
unknown directive, then the parsed result will contain a system
warning. I don't see the problem.
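
Roughly, the fallback behaviour could look like the sketch below. The registry, the tuple-based "nodes" and the warning text are all invented for illustration; the real parser's data structures differ:

```python
DIRECTIVES = {}  # hypothetical registry: name -> function(argument, block)


def run_directive(name, argument, block):
    """Return a list of (node_type, text) tuples for one directive."""
    handler = DIRECTIVES.get(name)
    if handler is None:
        # Unknown directive: warn, and keep the source as a literal block.
        source = ".. %s:: %s\n%s" % (name, argument, block)
        return [("system_warning", 'Unknown directive type "%s".' % name),
                ("literal_block", source)]
    # Known directive: it alone decides how to parse its contents.
    return handler(argument, block)


def image(argument, block):
    return [("image", argument)]


DIRECTIVES["image"] = image
```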

> Also, whilst I thought about folding the link in as you've done, I
> didn't, to allow for a link in parentheses to appear in the title,
> if I so wished (yep, being awkward again).

The directive can be made to extract only the *last* parenthesized
link as the "details" link.
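
Something like this sketch would do it. The function name is made up, and it assumes a link never contains whitespace or parentheses:

```python
import re

# A trailing "(...)" group at the very end of the directive argument.
# Assumption: the link itself contains no whitespace and no parentheses.
_TRAILING_LINK = re.compile(
    r"^(?P<title>.*\S)\s*\((?P<link>[^()\s]+)\)\s*$")


def split_title_and_link(argument):
    """Split 'Title (link)' into (title, link); link is None if absent."""
    match = _TRAILING_LINK.match(argument)
    if match is None:
        return argument.strip(), None
    return match.group("title"), match.group("link")
```

Only the final group is taken, so earlier parenthesized text stays in the title.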

> Ah - I've worked out why I don't see the need for Filer now.
>
> Looking at your four examples, the first two (single file or
> multi-file) are likely candidates for HTML output (for instance) -
> but if so, the HTML writer needs to know what it is doing, it can't
> be left up to an external "body" to do it (since the HTML writer
> needs to insert appropriate links, generate an index file, etc., in
> the multiple file case).

OK, so the 'filer' is just an option to the writer. That's cool too.
Implementation details. ;-)

So now the diagram looks something like this::

    +--------+      +--------+      +------------+      +--------+
    | READER | ---> | linker | ---> | transforms | ---> | WRITER |
    +--------+      +--------+      +------------+      +--------+
        |                             TOC, index,           |
        |                             etc.                  |
        |                             (optional)            |
    +--------+                                          +-------+
    | PARSER |                                          | filer |
    +--------+                                          +-------+

UPPERCASE names are major DPS components, and lowercase names are
groups of common services used as required. If synthesists and/or
stylists are also needed, I'll have to rotate the diagram.



From goodger@users.sourceforge.net  Fri Sep 21 05:27:59 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Fri, 21 Sep 2001 00:27:59 -0400
Subject: [Doc-SIG] rtxt2html style sheets in the sandbox
Message-ID: <B7D03A0E.17C07%goodger@users.sourceforge.net>

I've put Remi Bertholet's style sheets in reStructuredText's new sandbox:

    http://structuredtext.sourceforge.net/sandbox/rtxt2html/

Although I do not endorse all aspects of the layout & formatting choices
Remi made, I do think this is very, *very* cool. Thanks Remi!

An example reStructuredText file (by Remi, slightly edited by me), processed
to XML and containing a link to the style sheets (using the new
``--styledxml`` option to tools/quicktest.py) is at:

    http://structuredtext.sourceforge.net/sandbox/rtxt2html/example1.xml

You'll need a browser compatible with .xsl and .css; I know M$ IE5 works.

The style sheets are not complete. Several constructs were not ready when
Remi sent me the files (tables, image directive, document with a title,
auto-numbered footnotes). Remi, it would be great if you could add these to
the style sheets (however you'd like them to look!).

A snapshot of the entire sandbox is available at:

    http://structuredtext.sourceforge.net/rst-sandbox-snapshot.tgz

Also available from CVS of course.

If you use ``quicktest.py --styledxml`` on other files, the style sheets
have to be in the same directory as the output for it to work.

Enjoy!



From tony@lsl.co.uk  Fri Sep 21 10:02:20 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Fri, 21 Sep 2001 10:02:20 +0100
Subject: [Doc-SIG] DPS components
In-Reply-To: <B7D02CC4.17BFC%goodger@users.sourceforge.net>
Message-ID: <006401c1427c$1ebf3570$f05aa8c0@lslp7o.int.lsl.co.uk>

David Goodger wrote:
> Good point. TOC generation itself is an example of what I'd call a
> filter. Index generation too. I don't know where to put those types
> of document augmentation steps (putting the TOC or index back into
> the document).

It may be as simple as allowing a later stage to "call" an earlier stage
to "refine" an aspect of the document. But it's the sort of thing that
normally becomes clear when you write it once, then look at it for a
while and go "no, that's wrong, the *obvious* way is to do it like
this...".

> > I've got the refactoring bug now, I'm afraid
>
> Glad to hear it.

Oh, wanting to rewrite code to make it better isn't new - I've liked
working that way since I started! What's new is that there is now some
respectability to the idea - it isn't (necessarily) just seen as
programmers wasting time by either wanting to refine something that's
"good enough" (by someone else's criteria, which don't take into account
things like maintenance), or even (!) by "not writing it right in the
first place".

In this instance, though, your ideas have tipped me over from being a
little bit unsure that the tack I had taken for output was right into
thinking that it is definitely inelegant, and I can see a more elegant
route (with other benefits as well - but then, that's one of the
definitions of "elegance" in programming).

> Please note that the tree described by dps.nodes & gpdi.dtd is
> probably not sufficient for the writer to do a proper job. Extensions
> to the elements (layout and formatting attributes) may be needed.

Oh, for sure - but the easiest way to find out is to try!

I know for a start that I will likely want to be able to add a "style"
attribute to most things (optionally, of course), and also that we will
need a ``Span`` Element node - cf. the <span> tag in HTML, etc. - this
allows one to identify a segment of the tree (I'll settle for a
subtree!) as being linked for stylistic purposes.
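
As a sketch of what I mean, with the caveat that ``Element`` here is a throwaway class and not the dps.nodes one, and that ``Span`` is only a proposal:

```python
from xml.sax.saxutils import escape, quoteattr


class Element:
    """Throwaway element node with optional attributes such as 'style'."""

    def __init__(self, tag, *children, **attributes):
        self.tag = tag
        self.children = list(children)   # Element instances or plain strings
        self.attributes = attributes

    def to_xml(self):
        attrs = "".join(" %s=%s" % (key, quoteattr(value))
                        for key, value in sorted(self.attributes.items()))
        body = "".join(child.to_xml() if isinstance(child, Element)
                       else escape(child)
                       for child in self.children)
        return "<%s%s>%s</%s>" % (self.tag, attrs, body, self.tag)


def Span(*children, **attributes):
    """Group a subtree purely so that a style can be attached to it."""
    return Element("span", *children, **attributes)
```

A writer that knows nothing about styles can treat a span as transparent and simply render its children.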

> (Maybe we need 'stylists' after all! ;-)

The Hairdresser class may still be in front of us.

More seriously, I'm still unsure of *how* one chooses a particular style
and implements its details - but again, this sort of thing comes clear
in the wash...

> > >     .. quickreftable:: Directives (http://link-to-text)
> > >
> > >        For instance:
> > >
> > >        .. image:: images/ball1.gif
>
> It's up to the directive to parse its contents. No parsing is done
> automatically. If the 'quickreftable' directive isn't there, a warning
> will be generated with the directive source as a literal block. If the
> directive parses part of its contents normally, and *that* contains an
> unknown directive, then the parsed result will contain a system
> warning. I don't see the problem.

Ah - all is clear. You're right - no problem.

> OK, so the 'filer' is just an option to the writer. That's cool too.
> Implementation details. ;-)
>
> So now the diagram looks something like this::
>
>     +--------+      +--------+      +------------+      +--------+
>     | READER | ---> | linker | ---> | transforms | ---> | WRITER |
>     +--------+      +--------+      +------------+      +--------+
>         |                             TOC, index,           |
>         |                             etc.                  |
>         |                             (optional)            |
>     +--------+                                          +-------+
>     | PARSER |                                          | filer |
>     +--------+                                          +-------+
>
> UPPERCASE names are major DPS components, and lowercase names are
> groups of common services used as required. If synthesists and/or
> stylists are also needed, I'll have to rotate the diagram.

"linker" stitches together references and such?

"transforms" should, of course, be "transformers", to be the same part
of speech.

I'm still unhappy with the *name* "filer", especially since your game
plan for it involves generating memory structures "half" the time -
"output manager" is slightly more accurate, if too verbose. But it
doesn't matter enough to worry about, I think.

Tibs

(today's "drat" moment - work has IE4 and Netscape4.07 (don't ask), so
there's no easy way I can look at the transformed XML output. Humph.
I'll have to wait until tonight at home...)
--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
.. Haskell is the most Pythonic of all the languages that are entirely
.. unlike Python <0.9 wink> (Tim Peters)
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)


From fdrake@acm.org  Fri Sep 21 22:19:58 2001
From: fdrake@acm.org (Fred L. Drake)
Date: Fri, 21 Sep 2001 17:19:58 -0400 (EDT)
Subject: [Doc-SIG] [development doc updates]
Message-ID: <20010921211958.C4F3924231@grendel.zope.com>

The development version of the documentation has been updated:

    http://python.sourceforge.net/devel-docs/

Added more discussion of user-defined exceptions, more descriptions for
the xml.parsers.expat module.


From tony@lsl.co.uk  Mon Sep 24 15:04:46 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Mon, 24 Sep 2001 15:04:46 +0100
Subject: [Doc-SIG] Grump about field lists
In-Reply-To: <006401c1427c$1ebf3570$f05aa8c0@lslp7o.int.lsl.co.uk>
Message-ID: <007a01c14501$de437050$f05aa8c0@lslp7o.int.lsl.co.uk>

Warning - I may be disagreeing with opinions of my own that I've pushed
in the past. But then, you should all be used to that by now...

A "normal" field list::

    :first: something
    :second: something else
    :third: and again

produces a (sub)tree structure something like the following::

    <field_list>
        <field>
            <field_name> first
            <field_body> something
        <field>
            <field_name> second

and so on. However, some field names are treated specially::

    :author: Me
    :Date: 29th February 200

gives::

    <author> David Goodger
    <date> 29th February 200

I believe that this special treatment is (a) because some fields are
felt to be "special" and should thus be easy to extract from the tree,
and (b) because when what evolved into field lists was being proposed on
Doc-SIG, this was the sort of thing that was expected to happen for
*all* field names (I gloss over history greatly, of course).

Note that I realise that this special casing is only done in the block
at the *start* of the document (foreshadow_)...

However, I now find that I don't like this division of fields into
"normal" and "special", for two broad reasons.

Broad reason 1: implementation
------------------------------
When trying to output HTML from a DPS tree, it is useful if I can
produce a decent result without any pre-processing of the tree. The only
place that I can't currently see this as being possible (subject to a
*little* bit of special casing when it comes to folding in indirect
hyperlinks) is for the "special" fields above.

For "normal" field lists, I have a *list* - that is, a grouping of
adjacent field list items - which facilitates treating them *as* a
group. For the "special" fields, there is no such linkage.

This could, of course, be solved to *some* extent by insisting that these
field list items *also* become children of a <field_list> node -
obviously one would then allow mixing, so one could do::

    :Author: Me
    :Cat: in a hat

to obtain::

    <field_list>
        <author> Me
        <field>
            <field_name> Cat
            <field_body> in a hat

although that seems messy in a different way...

(Hmm - and surely there *is* a Good Thing about being able to point to
the Bibliographic Data *as* a subtree of the document. I think that's
actually a very important point. So, whatever else is done, can these
things please be shifted into a <field_list> subtree.)

Broad reason 2: theory
----------------------
Having had my mind drawn to this, I think I have some general objections
to the idea, anyway.

The first is simply that I don't think it is necessary. I think that it
should be possible to handle any action that can be done with an
``<author>`` tag as easily with a ``<field>`` that has the correct
subtree. And if it isn't, then transfer the field name to be an
attribute, so that::

    <field>
        <field_name> Fred
        ...

becomes::

    <field name="Fred">
        ...
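
A minimal sketch of that transfer, in Python. It assumes the tree is
held as ``xml.etree.ElementTree`` elements, which is purely for
illustration (the DPS has its own node classes), and
``name_to_attribute`` is a made-up helper name:

```python
import xml.etree.ElementTree as ET


def name_to_attribute(field):
    """Move the text of a <field_name> child into a name="..." attribute.

    Hypothetical helper: the element names come from the tree shown
    above, the function itself is only an illustration.
    """
    name = field.find("field_name")
    if name is not None:
        field.set("name", (name.text or "").strip())
        field.remove(name)
    return field


field = ET.fromstring(
    "<field><field_name>Fred</field_name>"
    "<field_body>something</field_body></field>"
)
print(ET.tostring(name_to_attribute(field), encoding="unicode"))
```

A Writer could then dispatch on ``field.get("name")`` without walking
into the subtree first.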

The second is that I'm a bit unhappy with the ad-hoc nature of adding
names - let us say that I have a document with a field name "History"
(not unreasonable). What happens if reST introduces this as a standard
tag at some time - suddenly, my customised Writer code will get *very*
confused at the new parsing.

Speaking of which, I especially don't like the out-reference to RCS/CVS
keywords. The `RCS keyword recognition` section says that any RCS
keyword shall be a bibliographic keyword. This is unfriendly to the poor
user, because it *requires* them to study a different document before
they can figure out unused keywords (even if they're using DPS/reST to
write a discourse on cats, or something inherently non-programming), and
also because it nails us to a moving target over which we have no
control (we can't stop CVS adding a new keyword to its list, unlikely as
we may hope that to be).

Also, in that section, "status" and "date" are named, but these are
already mentioned in the previous section (and do you *really* think you
can automatically convert all dates into ISO 8601 format? what about
10/12/01? (which is the month? and the day? and what century? (not that
I'd guarantee that that *is* the year, if I were you!)))

If we're going to support RCS/CVS keywords directly, can we please
prefix them with RCS (or CVS) - e.g., "CVSDate" or "cvsdate" - to make
it clear what we mean?

.. _foreshadow:: Hmm, that's another problem with the "special
   treatment" - it only happens in one place. So I probably have
   to be able to cope with the same "quantity" represented by
   either of two means, anyway, if I want to go data-mining.

Tibs

(Hmm - having said I thought that :title: was unlikely to happen, there
it is for all to see. Oh well, I never was much good at channeling. Or
remembering.)

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
Give a pedant an inch and they'll take 25.4mm
(once they've established you're talking a post-1959 inch, of course)
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)


From fdrake@acm.org  Mon Sep 24 22:16:35 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 24 Sep 2001 17:16:35 -0400
Subject: [Doc-SIG] request for assistance with library reference
Message-ID: <15279.41651.179222.3987@grendel.zope.com>

  The schedule for Python 2.2 has it being released mid-December, and
the documentation tasks are piling up.  If anyone would like to help,
I'd certainly appreciate it!
  The things that would be easiest for someone to pick up would be
reference documentation for modules; the docs for a module are highly
self-contained and very formulaic in presentation.  The Python bug
tracker specifically tells me that people are interested in seeing the
pydoc and httplib modules documented.  pydoc has never been
documented, and the httplib documentation is for an old version of the
interface.
  I'm sure there are other modules that need to be documented as
well.
  Any takers?  ;-)


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From fdrake@acm.org  Mon Sep 24 22:22:50 2001
From: fdrake@acm.org (Fred L. Drake)
Date: Mon, 24 Sep 2001 17:22:50 -0400 (EDT)
Subject: [Doc-SIG] [development doc updates]
Message-ID: <20010924212250.BC6E628845@cj42289-a.reston1.va.home.com>

The development version of the documentation has been updated:

    http://python.sourceforge.net/devel-docs/

Added documentation for several of the new C functions added for
Python 2.2 in the Python/C API reference manual.


From goodger@users.sourceforge.net  Tue Sep 25 03:37:28 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Mon, 24 Sep 2001 22:37:28 -0400
Subject: [Doc-SIG] Re: Document titles (was RE: [Docstring-develop]  DPS - possible
 bugs/features)
In-Reply-To: <007701c144db$7b029d20$f05aa8c0@lslp7o.int.lsl.co.uk>
Message-ID: <B7D56626.17E07%goodger@users.sourceforge.net>

[As it may be of more widespread interest, I'm moving this discussion over
to Doc-SIG.]

Tony J Ibbs (Tibs) wrote:
> Ueli Schläpfer gave a good explanation of why David Goodger had chosen
> to do something I didn't understand - namely make a document with a
> title at the start have no section within it (except it was subtler than
> that).

For those new to this discussion, please see Tony's original post at
http://www.geocrawler.com/lists/3/SourceForge/12823/0/6666616/ (Ueli's
followup is http://www.geocrawler.com/lists/3/SourceForge/12823/0/6667890/).

To begin, let me reiterate what I wrote on Friday (21 Sept):

    ... Perhaps this parser-specific transformation should be made
    optional [#]_. Then you can treat everything as generic sections
    until integration is complete.

    .. [#] Along with bibliographic field list interpretation, RCS
       keyword filtering, and whatever other conveniences we dream up.

I will make these transformations optional, or remove them altogether
and relocate them to a separate module. The only problem with this
separation is that these transformations are markup-specific; they may
not be necessary with another markup syntax.

I don't know the best way to handle this situation yet; suggestions
are welcome.

[Tony]
> By the way, I did check the documentation, and it seemed to me that
> the current documentation indicated that a title would cause a
> section to be started - so David, if you want to perform this
> promotion, then it needs to be documented (unless I've missed it
> *again*).

No, you haven't missed anything. It wasn't documented; I guess it was
just so obvious to me that I never thought to write it down.

I'll work on an explanation for the spec (taking into account this
discussion, of course).

> On the other hand, I don't actually see why the DPS system should
> have to do the promotion for the user, when it's not clear that it
> is always wanted (in other words, why is it up to the DPS system to
> decide that the case of a single section is special, and then have a
> sudden disjunction in its behaviour for two sections - that's my
> inner pedant objecting!).

True, there's some ambiguity here. But the user needs some way to
indicate the document title.

> Regardless, an immediate solution to resolve the docstring case (and
> possibly a useful thing to do anyway) would be to have an argument
> to the Parser that states upfront that we are working on a document
> *fragment*

That's one possibility, yes.

> Broadly, HTML common practise treats a document as having a single
> title at the top, which is used for both <title> and <h1>, and the
> "section hierarchy" (if any) starts with an <h2>.

That's a transformation the HTML writer will want to do to the
document: convert the document title to both <HEAD><TITLE> and <H1>.

> Maybe (horrors) we should reserve one specific markup form to mean
> "overall title"::

That won't fly.

> Or perhaps we'll have to resort to::
>
>     :Title: Document title

PEPs use this, but I wouldn't want to use it in my documents.

> Somehow, I don't see David liking either of those...

You know me so well!

> My problem is that I'm trying to write formatters for *any* document
> that might come in (yes, I know I'm writing pydps/pysource, but I
> want the Writer to work for any document)

The "Writer" being an "HTML Writer"?

> so we have to be able to cope with:
>
> 1. Document with no titles at all
> 2. Document with one title (OK - David does that)
> 3. Document with more than one title (at the same level)
>    - which in essence *really* resolves back to case 1.

1 & 3 merge to become "Document without exactly one title". I'd say
that warrants an error from the HTML writer since a title is a
prerequisite. Or the HTML writer can call that document "Untitled".
Or both. The TeX writer need not complain.
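
A rough sketch of the promotion and of the "Untitled" fallback, under
stated assumptions: nodes are plain tuples (``("section", title,
children)`` or ``("paragraph", text)``), the promotion rule used here is
"the document body is exactly one top-level section and nothing else",
and both function names are hypothetical. It is not the DPS code:

```python
def promote_single_section(children):
    """Return (title, body) for a document's top-level children.

    If the document consists of exactly one top-level section, that
    section's title becomes the document title and its children become
    the document body. Otherwise the title is None and the children are
    returned unchanged.
    """
    sections = [c for c in children if c[0] == "section"]
    if len(children) == 1 and len(sections) == 1:
        _, title, body = children[0]
        return title, body
    return None, children


def html_title(title):
    """What an HTML writer might put in <TITLE> and <H1> (assumed policy)."""
    return title if title is not None else "Untitled"


doc = [("section", "Sand-Nymphs of Mars", [("paragraph", "Some text.")])]
title, body = promote_single_section(doc)
print(html_title(title), body)
```

A writer that does not need a title would simply ignore the ``None``.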

> I'm afraid that the only "perfect" solution I can see for that (in
> the sense of *predictable*) is to require the user to indicate that
> they *do* have a document title, and that it is *this* thing, here.
> That then makes them aware of the problem, also, which I think is a
> necessary thing (otherwise, surprise will eventuate).

I think, documented, the promotion of a single section to document
status is natural and predictable. However, it's not the parser's job.

Is there an alternative?

> David Goodger wrote:
> > It is actually intended that by the time the document tree gets to
> > the writer, it must have a title. The parser can't always
> > determine the title by itself, such as in PySource mode. The
> > PySource reader is expected to supply all the titles as
> > appropriate.
>
> Hmm. In PySource mode, the parser should not be trying to introduce
> titles - it is, after all, handling arbitrary document fragments,
> and can't know anything about their global scope (unless it is
> told!).

We'll turn off the transformations for PySource mode. The parser will
no longer introduce titles; that's the reader's job.

> *If* the final tree shall always have a title, where does it come
> from if the document author didn't provide one? Surely in that case
> it is not up to the *parser* to decide on what a title should be -
> that is up to the application. So one has three options:
>
> 1. The parser makes one up (yuck)
> 2. The application makes one up (yuck)
> 3. An error is generated (yuck)
>
> I'd vote for 3 ...

Probably me too, in those cases where it's necessary. If the writer
doesn't need a document title, it need not complain.

[Ueli]
> The source filename isn't known to the writer, is it?

[Tony]
> I think that the sourcefile as an optional attribute on the document
> is probably a useful thing, as well.

I've added an optional attribute 'source' (not just files!) to
'basic.atts', so all elements will get it. That way, each fragment of
a document can record its origin.

-- 
David Goodger    goodger@users.sourceforge.net    Open-source projects:
 - Python Docstring Processing System: http://docstring.sourceforge.net
 - reStructuredText: http://structuredtext.sourceforge.net
 - The Go Tools Project: http://gotools.sourceforge.net


From goodger@users.sourceforge.net  Tue Sep 25 03:43:38 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Mon, 24 Sep 2001 22:43:38 -0400
Subject: [Doc-SIG] URI schemes (was Re: [Docstring-develop] DPS - possible
 bugs/features)
In-Reply-To: <007801c144dc$5ca6e560$f05aa8c0@lslp7o.int.lsl.co.uk>
Message-ID: <B7D56799.17E08%goodger@users.sourceforge.net>

[Again, of general interest. Especially: Does anyone know of a URI scheme
registry or official list? (URI schemes are "http", "ftp", "mailto", etc.;
the part of a URI before the ":".)]

[Tony]
> Hmm. Will `a:b` be treated as a URI? (I haven't tested it).

Yes, it will, and in fact you *have* tested it! ``[a:b]`` turned into
``[<link refuri="a:b">a:b</link>]`` in the example from your original
message. (The square brackets are not significant.)

> Is ``a:b`` *really* likely to be a sensible URI, given that ``a`` is
> entirely "local"?

What do you mean by "local"?

> Should we be treating with the whole possible gamut of URIs, or
> restricting ourselves to those most likely?

There are two approaches:

1. Recognize all possible URI schemes, based on the grammar from
   RFC2396. This has the unwanted side effect that ``a:b`` is
   accidentally recognized as a URI. The workaround is to use inline
   literals (not always correct: "the signal:noise ratio") or escape
   the colon (ugly).

2. Recognize only "registered" URI schemes. Accidents like ``a:b``
   won't happen. The disadvantage is that new URI schemes need to be
   added to the parser. I have yet to find a definitive registry of
   URI schemes (anybody know of one?), and I don't want to spend the
   rest of my life adding new schemes as they pop up.

Currently the reStructuredText parser takes approach #1. I wouldn't
want to attempt #2 without an official & complete URI scheme reference.

-- 
David Goodger    goodger@users.sourceforge.net    Open-source projects:
 - Python Docstring Processing System: http://docstring.sourceforge.net
 - reStructuredText: http://structuredtext.sourceforge.net
 - The Go Tools Project: http://gotools.sourceforge.net


From goodger@users.sourceforge.net  Tue Sep 25 04:01:33 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Mon, 24 Sep 2001 23:01:33 -0400
Subject: [Doc-SIG] Re: Grump about field lists
In-Reply-To: <007a01c14501$de437050$f05aa8c0@lslp7o.int.lsl.co.uk>
Message-ID: <B7D56BCC.17E09%goodger@users.sourceforge.net>

Tony J Ibbs (Tibs) wrote:
> However, some field names are treated specially::
> 
>     :author: Me
>     :Date: 29th February 200
> 
> gives::
> 
>     <author> David Goodger
>     <date> 29th February 200

Please note, there is *no* automatic conversion of "Me" into "David
Goodger"! !!! (I'm sure it was a typo. But could you imagine the
arrogance of a programmer adding such a "feature" to his code?)

> Note that I realise that this special casing is only done in the
> block at the *start* of the document (foreshadow_)...

That's important to the discussion.

> I believe that this special treatment is (a) because some fields are
> felt to be "special" and should thus be easy to extract from the
> tree, and (b) because when what evolved into field lists was being
> proposed on Doc-SIG, this was the sort of thing that was expected to
> happen for *all* field names (I gloss over history greatly, of
> course).

It's not that some fields are "special". "Bibliographic Field List
Context" is just a way of getting some extra functionality out of the
limited syntax.

I felt that the bibliographic elements (author, date, version, etc.)
were useful to a generic document, so added them to the DTD.
Naturally, I wanted to provide some mechanism to the reStructuredText
authors to include those elements in their documents. The past Doc-SIG
discussions and PEP syntax suggested that field lists were the way to
go. It's a bit of syntax overloading, but practical.

> However, I now find that I don't like this division of fields into
> "normal" and "special", for two broad reasons.
> 
> Broad reason 1: implementation
> ------------------------------
> When trying to output HTML from a DPS tree, it is useful if I can
> produce a decent result without any pre-processing of the tree.

So you don't want to deal with the bibliographic elements at all?
In order to deal with a richly structured document, you're going to
have to do some pre-processing, tree transformations, and addition
of boilerplate text here and there.

> For "normal" field lists, I have a *list* - that is, a grouping of
> adjacent field list items - which facilitates treating them *as* a
> group. For the "special" fields, there is no such linkage.

The bibliographic elements do not constitute a list. They do
constitute a group of elements; we could group them together in a
'docinfo' (or 'bibliographic' or 'metadata') element if that would
help. (But then we face the question: what goes inside 'docinfo', and
what doesn't? I think most people would agree 'title' is too generic
to go inside 'docinfo'. What about 'subtitle'? See the Docbook DTD for
a cautionary example: its 'bookinfo' etc. elements contain 'title', as
does the 'book' itself.)

I envisage these bibliographic elements being laid out in various
ways: like a title page of a book, like the first page of the Python
Library Reference, etc. I do *not* see them being laid out as a field
list, at least not exclusively, and not for anything but experimental
output (e.g., verbatim output of the raw input for verification
purposes).

For a typical bibliographic field list::

    Sand-Nymphs of Mars
    ===================

    :Author: Kilgore Trout
    :Contact: trout@fictional.org
    :Date: 24 September, 2001

I see this being laid out something like this::

                         SAND-NYMPHS OF MARS


                            Kilgore Trout
                        (trout@fictional.org)
                          24 September, 2001

> This could, of course, be solved to *some* extent by insisting that
> these field list items *also* become children of a <field_list> node

I don't think this is a good idea. 'field_list' is just used as a
syntax vehicle; it's not the end result.

> (Hmm - and surely there *is* a Good Thing about being able to point to
> the Bibliographic Data *as* a subtree of the document.

Why?

If you just need an element around them for ease of processing, a
'docinfo' would be easy to add.
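
For illustration only, such a grouping step might look like the sketch
below. It assumes nodes are ``(tag, content)`` tuples and that the
bibliographic elements sit directly after an optional title; the set of
tag names and the ``group_docinfo`` function are hypothetical:

```python
# Illustrative subset of bibliographic element names (an assumption).
BIBLIOGRAPHIC = {"author", "contact", "date", "version", "status"}


def group_docinfo(children):
    """Wrap the leading run of bibliographic elements in one 'docinfo' node.

    The run starts at the first child, or at the second if the first
    is the document title. Other children are left where they are.
    """
    out = list(children)
    start = 1 if out and out[0][0] == "title" else 0
    end = start
    while end < len(out) and out[end][0] in BIBLIOGRAPHIC:
        end += 1
    if end > start:
        out[start:end] = [("docinfo", out[start:end])]
    return out


doc = [
    ("title", "Sand-Nymphs of Mars"),
    ("author", "Kilgore Trout"),
    ("contact", "trout@fictional.org"),
    ("date", "24 September, 2001"),
    ("paragraph", "..."),
]
for node in group_docinfo(doc):
    print(node)
```

The question of what belongs inside 'docinfo' is then just the question
of what goes in the ``BIBLIOGRAPHIC`` set.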

> Broad reason 2: theory
> ----------------------
> Having had my mind drawn to this, I think I have some general
> objections to the idea, anyway.
> 
> The first is simply that I don't think it is necessary. I think that
> it should be possible to handle any action that can be done with an
> ``<author>`` tag as easily with a ``<field>`` that has the correct
> subtree.

Yes, by identifying the field name and doing a tree transformation.
That's what's being done.

> And if it isn't, then transfer the field name to be an attribute

Same thing.

> The second is that I'm a bit unhappy with the ad-hoc nature of adding
> names - let us say that I have a document with a field name "History"
> (not unreasonable). What happens if reST introduces this as a standard
> tag at some time - suddenly, my customised Writer code will get *very*
> confused at the new parsing.

That's a backwards incompatibility issue, same as what happens in
Python when a new keyword is added (like 'yield').

Say we do add a new bibliographic field at some point in the future.
Some code would break. Since we're following good XP practise, we'd
write a unit test and see the problem right away. At this point the
TibsWriter software is part of the DPS/docutils package in Python's
standard library. If we don't fix TibsWriter before checking the
changes in to the Python codebase, the Python regression tests fail
and we get hell from Guido. We fix the problem, either by backing
out the changes or adapting TibsWriter to the changes.

For any 3rd party 'writers' out there not part of the core, they face
the same issues as a Python syntax change. Hopefully there's
sufficient warning. If not, there's a small amount of maintenance to
be done. Hopefully we learn from any backlash that adequate warning
is not optional.

> Speaking of which, I especially don't like the out-reference to
> RCS/CVS keywords. The `RCS keyword recognition` section says that
> any RCS keyword shall be a bibliographic keyword.

No, it doesn't say that. It says "In the context of bibliographic
field lists". Perhaps that could use some explanation (below).

> This is unfriendly to the poor user, because it *requires* them to
> study a different document before they can figure out unused
> keywords

At first I was exasperated, thinking you were being exceedingly
pedantic ;-), but I'll assume that it's just the terseness of that
part of the spec that's to blame. Here's an expanded explanation
(and this is how the parser actually works):

    The RCS keyword processing only kicks in when all of these
    conditions hold:
    
    1. The field list is in bibliographic context (first non-comment
       construct in the document, after a document title if there is
       one).
    
    2. The field name is a registered bibliographic field name.
    
    3. The sole contents of the field is an expanded RCS keyword, of
       the form '$Keyword: data $'.

The only people who are going to be putting RCS keywords in their
documents already know about RCS keywords, so there's (almost) no
danger of an accident. I can't see someone putting dollar-signs
around a word by accident. And for an RCS keyword to be expanded,
the file has to be *stored* under RCS or CVS, so if a keyword
accident does happen, the user has a greater problem than we need
address.
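
The three conditions listed above can be sketched as below. This is an
illustration under assumptions, not the parser's code: the caller is
assumed to know whether the field list is in bibliographic context, the
set of registered names is an illustrative subset, and ``clean_field``
is a hypothetical name:

```python
import re

# An expanded RCS keyword: '$Keyword: data $'
RCS_KEYWORD = re.compile(r"\$(\w+): (.+) \$")

# Illustrative subset of registered bibliographic field names.
BIBLIOGRAPHIC_FIELDS = {"author", "contact", "date", "status",
                        "version", "revision"}


def clean_field(name, body, in_bibliographic_context):
    """Strip RCS keyword cruft from a field body when all three hold."""
    if not in_bibliographic_context:                 # condition 1
        return body
    if name.lower() not in BIBLIOGRAPHIC_FIELDS:     # condition 2
        return body
    match = RCS_KEYWORD.fullmatch(body.strip())      # condition 3
    return match.group(2) if match else body


print(clean_field("Date", "$Date: 2001/09/24 22:37:28 $", True))
print(clean_field("Cat", "$Date: 2001/09/24 22:37:28 $", True))
```

An unexpanded keyword such as ``$Date$`` does not match the pattern and
is left untouched.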

> (even if they're using DPS/reST to write a discourse on cats, or
> something inherently non-programming)

Then the .rtxt file wouldn't be stored under RCS/CVS.

> also because it nails us to a moving target over which we have no
> control (we can't stop CVS adding a new keyword to its list,
> unlikely as we may hope that to be).

An acceptable risk, I think.

> Also, in that section, "status" and "date" are named, but these are
> already mentioned in the previous section

Maybe the spec is misleading. '$Date: ... $' is not only processed in
the 'date' field. Any RCS keyword can be processed in any
bibliographic field. I was just using 'status' and 'date' as examples.

If this is too loose, we *could* define that '$Date$' is only
recognized in 'date' fields, '$Revision$' only in 'version' or
'revision' fields, etc. But then we limit the possibilities; I'm sure
there would be a complaint eventually.

> (and do you *really* think you can automatically convert all dates
> into ISO 8601 format?

Yes, I do. The RCS date field is defined as expanding to 'YYYY/MM/DD
hh:mm:ss' (in UTC).

(I just looked up the RCS manpage. It doesn't actually specify the
date expansion format, but from experience I know the above format to
be correct. If it is ever different, the date-specific pattern in the
parser won't match, and only the '$'s and 'Date:' would be removed and
the raw data left behind; no harm done.)
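
A small sketch of that date-specific step. It assumes the data has
already had the '$'s and 'Date:' removed, and that only the date part is
kept (whether the time is kept is a choice made here for illustration);
``rcs_date_to_iso`` is a hypothetical name:

```python
import re

# 'YYYY/MM/DD hh:mm:ss', the expansion format described above.
RCS_DATE = re.compile(r"(\d{4})/(\d{2})/(\d{2}) \d{2}:\d{2}:\d{2}")


def rcs_date_to_iso(data):
    """Return 'YYYY-MM-DD' for RCS-style date data, else the data as-is."""
    match = RCS_DATE.fullmatch(data.strip())
    if match is None:
        return data  # pattern did not match: leave the raw data behind
    return "-".join(match.groups())


print(rcs_date_to_iso("2001/09/24 22:37:28"))
print(rcs_date_to_iso("10/12/01"))
```

An ambiguous hand-typed date like "10/12/01" never matches, so it is
passed through rather than guessed at.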

> If we're going to support RCS/CVS keywords directly, can we please
> prefix them with RCS (or CVS) - e.g., "CVSDate" or "cvsdate" - to
> make it clear what we mean?

That would defeat the purpose. We have bibliographic elements, like
'date'. People will use RCS keywords in their source files, because
they're automatically updated. The RCS keyword processing is merely a
cosmetic convenience, tossing the cruft.

> .. _foreshadow:: Hmm, that's another problem with the "special
>    treatment" - it only happens in one place. So I probably have
>    to be able to cope with the same "quantity" represented by
>    either of two means, anyway, if I want to go data-mining.

I don't follow; please explain.

> (Hmm - having said I thought that :title: was unlikely to happen,
> there it is for all to see.

You're referring to your docstring-develop message?

The reason for 'title' being recognized as a bibliographic field name
was for generality, and specifically to support the PEP header syntax
as an alternate reader/"input mode". For the PEP Reader, the
bibliographic elements should be extended to include all of the PEP
header fields. If the redundant inclusion of 'title' in the standard
set of bibliographic fields is painful, it would be easy to remove it
and only put it back for the PEP Reader.

-- 
David Goodger    goodger@users.sourceforge.net    Open-source projects:
 - Python Docstring Processing System: http://docstring.sourceforge.net
 - reStructuredText: http://structuredtext.sourceforge.net
 - The Go Tools Project: http://gotools.sourceforge.net


From kalle@gnupung.net  Tue Sep 25 10:07:43 2001
From: kalle@gnupung.net (Kalle Svensson)
Date: Tue, 25 Sep 2001 11:07:43 +0200
Subject: [Doc-SIG] request for assistance with library reference
In-Reply-To: <15279.41651.179222.3987@grendel.zope.com>
References: <15279.41651.179222.3987@grendel.zope.com>
Message-ID: <20010925110743.A1226@stig.lysator.liu.se>

[Fred L. Drake, Jr.]
> 
>   The schedule for Python 2.2 has it being released mid-December, and
> the documentation tasks are piling up.  If anyone would like to help,
> I'd certainly appreciate it!

Well, after lurking on this list for a while, I guess I should do
something too. <wink>  I'll try to update the httplib docs.
I'm not a native English speaker, but I'll do my best.

Peace,
  Kalle
-- 
[ Thought control, brought to you by the WIPO! ]
[ http://anti-dmca.org/ http://eurorights.org/ ]


From kalle@gnupung.net  Tue Sep 25 10:54:11 2001
From: kalle@gnupung.net (Kalle Svensson)
Date: Tue, 25 Sep 2001 11:54:11 +0200
Subject: [Doc-SIG] Latest version of doc sources?
Message-ID: <20010925115411.B1226@stig.lysator.liu.se>

Hi.

The httplib documentation in CVS seems older than
http://python.sourceforge.net/devel-docs/lib/module-httplib.html.

Is the source of this newer version available somewhere?  Sorry if I've missed
something obvious.

Peace,
  Kalle
-- 
[ Thought control, brought to you by the WIPO! ]
[ http://anti-dmca.org/ http://eurorights.org/ ]


From tony@lsl.co.uk  Tue Sep 25 11:11:25 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Tue, 25 Sep 2001 11:11:25 +0100
Subject: [Doc-SIG] Re: Grump about field lists
In-Reply-To: <B7D56BCC.17E09%goodger@users.sourceforge.net>
Message-ID: <008401c145aa$6f12e270$f05aa8c0@lslp7o.int.lsl.co.uk>

David Goodger wrote:
> Tony J Ibbs (Tibs) wrote:
> > However, some field names are treated specially::
> >     :author: Me
> > gives::
> >     <author> David Goodger
>
> Please note, there is *no* automatic conversion of "Me" into "David
> Goodger"! !!!

Aaagh! Yes, it was a typo (well, a cut-and-paste-o). Sorry.

> I felt that the bibliographic elements (author, date, version, etc.)
> were useful to a generic document, so added them to the DTD.
> Naturally, I wanted to provide some mechanism to the reStructuredText
> authors to include those elements in their documents. The past Doc-SIG
> discussions and PEP syntax suggested that field lists were the way to
> go. It's a bit of syntax overloading, but practical.

Well, the good news is that your explanation (mostly omitted) made lots
of sense to me, and I'm withdrawing much of my worry (which is a relief
to *me*, at least).

I would be happiest, though, if the bibliographic items were grouped
together under a single DPS tree node - I don't care about the name, and
don't particularly worry whether the term "list" is in it or not (the
tree requires things to be ordered by being a tree, so *all* collections
are going to be instantiated as "lists" regardless, whether one
conceptualises them as list or group!). I'm happy with any of the tags
you came up with - "docinfo" is OK by me.

> (But then we face the question: what goes inside 'docinfo', and
> what doesn't? I think most people would agree 'title' is too generic
> to go inside 'docinfo'. What about 'subtitle'? See the Docbook DTD for
> a cautionary example: its 'bookinfo' etc. elements contain 'title', as
> does the 'book' itself.)

Hmm - I *like* the idea of placing them together *because* it makes them
easy to find - and if we are saying that they must "come together as the
first thing in the document or the second thing after a title" then we
are *making* them all come together. That, to me, encourages me to
represent that fact.

I don't know the Docbook DTD, and don't *quite* see why it is obviously
bad to have "title" in two places...

> I envisage these bibliographic elements being laid out in various
> ways: like a title page of a book, like the first page of the Python
> Library Reference, etc. I do *not* see them being laid out as a field
> list, at least not exclusively, and not for anything but experimental
> output (e.g., verbatim output of the raw input for verification
> purposes).

This was (probably) the crucial point I was missing. I think your
example would sit well in the document, as an explanation (of one of the
reasons) for doing this. Of course, to me, it's also another reason to
group them together under one node of the tree...

> > (Hmm - and surely there *is* a Good Thing about being able
> > to point to the Bibliographic Data *as* a subtree of the
> > document.
>
> Why?

Erm - "because"...

It's difficult for me to articulate - we've *said* they belong together.
We want to *use* them together (in your example, to produce title page
info). It makes it slightly easier to tell if we have a particular item
or not. We don't want to lose them (!) - it just feels neater to keep
them in one package so they don't "fall out". It makes the tree tidier.
Maybe I'm just compulsive...

> If you just need an element around them for ease of processing, a
> 'docinfo' would be easy to add.

> At first I was exasperated, thinking you were being exceedingly
> pedantic ;-),

No, I really hadn't understood, as your further explanation makes clear
to me. Specifically:

> 3. The sole contents of the field is an expanded RCS keyword, of
>    the form '$Keyword: data $'.

I was assuming that "RCS keyword" meant the field list name - i.e., that
when you said "RCS date" you meant something that looked like::

    :date: <some text>

I hadn't realised that you actually meant::

    :name: <RCS keyword>: <appropriate text>

Looking back at the document, it still wouldn't be obvious to me
(whereas your new explanation does make it obvious), despite your use of
the words "The 'RCSfile' keyword" in the explanation - that use of
"keyword" didn't trigger strongly enough for me, I guess, probably
because you then go on to say "the 'Date' keyword" when I was thinking
about the "Date" field name).

As you've put it now, it's plainly a good idea (which makes me sigh with
relief, as I *know* you have a good sense of design).
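
A minimal sketch of the distinction drawn above, assuming the field
body arrives as a plain string. The function name and the regular
expression are illustrative only and are not taken from the parser::

```python
import re

# An expanded RCS keyword has the form "$Keyword: data $".
RCS_KEYWORD = re.compile(r"^\$(?P<keyword>\w+): (?P<data>.+?) \$$")

def strip_rcs_keyword(field_body):
    """Return (keyword, data) when the field body consists solely of
    an expanded RCS keyword; otherwise return (None, field_body)."""
    match = RCS_KEYWORD.match(field_body.strip())
    if match:
        return match.group("keyword"), match.group("data")
    return None, field_body
```

Under this reading, a field such as ``:date: $Date: 2001/09/25 11:11:25 $``
yields the bare data, whatever the field name is, while ``:date: today``
is left untouched.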

> > .. _foreshadow:: Hmm, that's another problem with the "special
> >    treatment" - it only happens in one place. So I probably have
> >    to be able to cope with the same "quantity" represented by
> >    either of two means, anyway, if I want to go data-mining.
>
> I don't follow; please explain.

Just that if I have::

    :something: text

at the start of the document, it will produce a different representation
in the tree structure than if it occurs elsewhere. In my original
worries, that was much more significant (because (a) I hadn't gotten the
"title page" scenario, and (b) I was worried about random "RCS" field
names).

> > (Hmm - having said I thought that :title: was unlikely to happen,
> > there it is for all to see.
>
> You're referring to your docstring-develop message?
>
> The reason for 'title' being recognized as a bibliographic field name
> was for generality, and specifically to support the PEP header syntax
> as an alternate reader/"input mode". For the PEP Reader, the
> bibliographic elements should be extended to include all of the PEP
> header fields. If the redundant inclusion of 'title' in the standard
> set of bibliographic fields is painful, it would be easy to remove it
> and only put it back for the PEP Reader.

I'm actually not worried about it - but I think that some guidance could
be given to usage (when we figure out what that guidance is!). I wonder
if we're going to end up with things like the LaTeX document styles,
where one declares what "bibliographic elements" one wants according to
the function of the document.

Hmm - does that mean that we should add "Mode" or "Style" as an extra
such bibliographic field name - I suspect we should (I'm sure this was
discussed earlier).

Examples::

    :Mode: Article
    :Mode: Book
    :Mode: PEP
    :Mode: HTML

(that last one is being a bit naughty, of course - maybe it should be
"HTML page").
Or perhaps one even has::

    :Mode: HTML
    :Style: Article

where the "mode" indicates that we are enabling HTML specific things
(support for <hr>, etc.)

Tibs

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
Well we're safe now....thank God we're in a bowling alley.
- Big Bob (J.T. Walsh) in "Pleasantville"
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)


From tony@lsl.co.uk  Tue Sep 25 11:11:28 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Tue, 25 Sep 2001 11:11:28 +0100
Subject: [Doc-SIG] URI schemes (was Re: [Docstring-develop] DPS - possible
 bugs/features)
In-Reply-To: <B7D56799.17E08%goodger@users.sourceforge.net>
Message-ID: <008501c145aa$70fed3f0$f05aa8c0@lslp7o.int.lsl.co.uk>

David Goodger wrote:
> [Again, of general interest. Especially: Does anyone know
> of a URI scheme registry or official list? (URI schemes
> are "http", "ftp", "mailto", etc.; the part of a URI before
> the ":".)]

Hmm. The best references I found for URIs were (taken from comments in
the stpy version of docutils/TextRE.py):

  * "An index of WWW addressing schemes",
    http://www.w3.org/Addressing/schemes.html - note
    that this is an evolving document!

  * "Regex for URLs",
    http://www.foad.org/~abigail/Perl/url2.html,
    which shows rather well both how to do it and also
    why just about no one does

The first is probably about as official as you're going to get, and
shows why I gave up on the idea of being all inclusive!

> [Tony]
> > Is ``a:b`` *really* likely to be a sensible URI, given that ``a`` is
> > entirely "local"?
>
> What do you mean by "local"?

Sorry - "relative" rather than "absolute". And I should have said ``b``,
not ``a``. And it doesn't make sense to say that for some schemes. Oh
well.

> > Should we be treating with the whole possible gamut of URIs, or
> > restricting ourselves to those most likely?
>
> There are two approaches:
>
> 1. Recognize all possible URI schemes, based on the grammar from
>    RFC2396. This has the unwanted side effect that ``a:b`` is
>    accidentally recognized as a URI. The workaround is to use inline
>    literals (not always correct: "the signal:noise ratio") or escape
>    the colon (ugly).
>
> 2. Recognize only "registered" URI schemes. Accidents like ``a:b``
>    won't happen. The disadvantage is that new URI schemes need to be
>    added to the parser. I have yet to find a definitive registry of
>    URI schemes (anybody know of one?), and I don't want to spend the
>    rest of my life adding new schemes as they pop up.
>
> Currently the reStructuredText parser takes approach #1. I wouldn't
> want to attempt #2 without an official & complete URI scheme
> reference.

It sounds, from reading the first reference above, as if it is not
possible to have an inclusive and final list of all URIs (note the
example of registering "note:" with IE so that one can browse files
using Notepad - I could instead have called it
"supercalifradgilisticexpealidocious" (?spelling) for all anyone else
can tell). So that kills proposal 2.

The third way (not that I'm recommending *it*, either), is to identify a
"common subset" of schemes that we recognise. That appears to be what
other people normally do - of course, I'm exactly the sort of person who
then comes along and wants my uncommon scheme to be added to said
"common" subset...

A fourth way (again, not necessarily one I'm advocating - but it's only
moderately yucky) would be to say "these 'common' schemes are recognised
as-is/inline, but if you want an 'odd' scheme, you need to delimit your
uri" - in the context of reST, I guess that would mean something like::

    :uri:`strange-scheme:hum-ti-hum`

(a role seems natural here, and *looks* a bit like one of the common
ways of indicating URIs in plaintext). Then we get to play with "which
schemes are 'common'" (the "obvious" answer is
[http,ftp,file,mailto,news], but that's only for my value of obvious).

I *do* think that there might be some objection (as you say) to having
to escape colons within text in a Python context - slices are just so
important (it might not be as bad as Guido's objection to reserving "<"
and ">" as delimiters, but still pretty bad). So it *may* be that the
fourth option is our simplest bet...
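
The trade-off can be sketched roughly as follows. This is not the
parser's code: the regular expression is a simplification of the
RFC 2396 scheme syntax, and the function name is made up for the
illustration. The "common" set is the one suggested above::

```python
import re

# Approach 1: anything shaped like "scheme:rest" is taken as a URI.
# The scheme part follows RFC 2396: a letter, then letters, digits,
# "+", "." or "-". The rest is simplified to "non-whitespace".
ANY_SCHEME = re.compile(r"\b([A-Za-z][A-Za-z0-9+.-]*):(\S+)")

# The "common subset" (one person's value of obvious).
COMMON_SCHEMES = {"http", "ftp", "file", "mailto", "news"}

def find_uris(text, schemes=None):
    """Return URI-like strings found in text.

    With schemes=None every scheme is accepted (approach 1);
    otherwise only the listed schemes are recognised inline.
    """
    found = []
    for match in ANY_SCHEME.finditer(text):
        if schemes is None or match.group(1).lower() in schemes:
            found.append(match.group(0))
    return found
```

With no scheme list, "the signal:noise ratio" produces an accidental
hit; with COMMON_SCHEMES it does not, at the cost of needing some
explicit markup (such as the role above) for unusual schemes.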

Tibs

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
"Bounce with the bunny. Strut with the duck.
 Spin with the chickens now - CLUCK CLUCK CLUCK!"
BARNYARD DANCE! by Sandra Boynton
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)


From tony@lsl.co.uk  Tue Sep 25 11:11:34 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Tue, 25 Sep 2001 11:11:34 +0100
Subject: [Doc-SIG] Re: Document titles (was RE: [Docstring-develop]  DPS -
 possible bugs/features)
In-Reply-To: <B7D56626.17E07%goodger@users.sourceforge.net>
Message-ID: <008601c145aa$74af31c0$f05aa8c0@lslp7o.int.lsl.co.uk>

David Goodger wrote:
> [As it may be of more widespread interest, I'm moving this
> discussion over to Doc-SIG.]

Sensible - I'd forgotten that it wasn't already!

> To begin, let me reiterate what I wrote on Friday (21 Sept):
>
>     ... Perhaps this parser-specific transformation should be made
>     optional [#]_. Then you can treat everything as generic sections
>     until integration is complete.
>
>     .. [#] Along with bibliographic field list interpretation, RCS
>        keyword filtering, and whatever other conveniences we dream up.
>
> I will make these transformations optional, or remove them altogether
> and relocate them to a separate module. The only problem with this
> separation is that these transformations are markup-specific; they may
> not be necessary with another markup syntax.

I think that the proposals that David makes are sensible, and for what
it's worth are OK by me...

I actually think there are a series of "standard" processes that may
want to be run on the DPS tree, between its construction and its final
use. Some of them are optional, and most, if not all, have optional
components.

Examples are:

* Generating a document title from a title within the
  document

  (my HTML stuff actually looks first for a title on
  the "top" element of the tree, then for a title in
  the first child - I imagine that's a sensible
  algorithm if you *must* have a title)

* Sorting out automatic footnotes (since the footnote
  numbers are determined by the order of the footnotes,
  and references to them may come both before and after,
  this has to be done as a separate pass)

* Finding indirect hyperlinks, and either:

  + folding them into the document
  + grouping them together at the end of the
    "section" they occur in
  + grouping them together at the end of the
    document

  depending on what the user wants.

I think of these as "combing" the tree (think of combing long hair
to get the tangles out!) - one does each comb until the tree is in the
state one wants it to be in.
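
As a rough sketch of the idea, assuming the tree is held as nested
dictionaries with "tag" and "children" keys (a stand-in for the real
DPS node classes) and that indirect hyperlinks appear as top-level
"target" nodes (a hypothetical name)::

```python
def apply_transforms(tree, transforms):
    """Run each chosen pass over the tree, in order.

    Every pass takes a tree and returns a tree, so the caller decides
    which optional passes to include and in what order.
    """
    for transform in transforms:
        tree = transform(tree)
    return tree

def group_targets_at_end(tree):
    """Example pass: move top-level 'target' nodes to the end of the
    document, keeping their relative order."""
    children = tree["children"]
    others = [c for c in children if c["tag"] != "target"]
    targets = [c for c in children if c["tag"] == "target"]
    return {"tag": tree["tag"], "children": others + targets}
```

A user who wants the hyperlinks left where they are simply leaves
that pass out of the list.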

> I've added an optional attribute 'source' (not just files!) to
> 'basic.atts', so all elements will get it. That way, each fragment of
> a document can record its origin.

Hmm - I'm already thinking of how I can use it...

Tibs

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
"How fleeting are all human passions compared with the massive
continuity of ducks." - Dorothy L. Sayers, "Gaudy Night"
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)


From tony@lsl.co.uk  Tue Sep 25 11:38:31 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Tue, 25 Sep 2001 11:38:31 +0100
Subject: [Doc-SIG] URI schemes (was Re: [Docstring-develop] DPS - possible
 bugs/features)
In-Reply-To: <008501c145aa$70fed3f0$f05aa8c0@lslp7o.int.lsl.co.uk>
Message-ID: <008801c145ae$381ff8d0$f05aa8c0@lslp7o.int.lsl.co.uk>

I wrote:
>   * "An index of WWW addressing schemes",
>     http://www.w3.org/Addressing/schemes.html - note
>     that this is an evolving document!

I've just found the announcement of:

	http://www.w3.org/TR/2001/NOTE-uri-clarification-20010921/

in the latest "W3C weekly news" posting (there is also a pointer to

	http://www.w3.org/Addressing/

which is the "parent" for my link above).

Tibs

-- 
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
"I'm a little monster, short and stout
 Here's my horns and here's my snout
 When you come a calling, hear me shout
 I will ROAR and chase you out"
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)


From fdrake@acm.org  Tue Sep 25 17:09:23 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 25 Sep 2001 12:09:23 -0400
Subject: [Doc-SIG] request for assistance with library reference
In-Reply-To: <20010925110743.A1226@stig.lysator.liu.se>
References: <20010925115411.B1226@stig.lysator.liu.se>
 <15279.41651.179222.3987@grendel.zope.com>
 <20010925110743.A1226@stig.lysator.liu.se>
Message-ID: <15280.44083.502341.194334@grendel.zope.com>

Kalle Svensson writes:
 > Well, after lurking on this list for a while, I guess I should do
 > something too. <wink>  I'll try to update the httplib docs.
 > I'm not a native english speaker, but I'll do my best.

  Thanks!  Don't worry about not being a native speaker; many of our
non-native speakers do a better job than us lazy Americans!  I'll
certainly review it and edit as necessary.

Kalle Svensson writes:
 > The httplib documentation in CVS seems older than
 > http://python.sourceforge.net/devel-docs/lib/module-httplib.html.
 > 
 > Is the source of this newer version available somewhere?  Sorry if
 > I've missed something obvious.

  Oops; I have a few minor changes in my working copy.  I'll get those
checked in in a few minutes.  Thanks for checking!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From goodger@users.sourceforge.net  Wed Sep 26 00:19:37 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Tue, 25 Sep 2001 19:19:37 -0400
Subject: [Doc-SIG] Re: Grump about field lists
In-Reply-To: <008401c145aa$6f12e270$f05aa8c0@lslp7o.int.lsl.co.uk>
Message-ID: <B7D68948.17E1C%goodger@users.sourceforge.net>

Tony J Ibbs (Tibs) wrote:
> I would be happiest, though, if the bibliographic items were grouped
> together under a single DPS tree node

OK, done. The element is 'docinfo'. It satisfies the desire for an
explicit element hierarchy (encapsulation). The relevant bits of
dps/spec/gpdi.dtd are::

    <!ELEMENT document
        ((title, subtitle?)?, docinfo?, %structure.model;)>

    <!-- Container for bibliographic elements. May not be empty. -->
    <!ELEMENT docinfo
        (((%bibliographic.elements;)+, abstract?) | abstract)>
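
Read informally, the 'docinfo' content model says: one or more
bibliographic elements optionally followed by an abstract, or an
abstract alone. A small sketch of that rule, where the set of element
names is only an illustrative subset and not the actual contents of
%bibliographic.elements;::

```python
# Illustrative subset only; the real list lives in the DTD.
BIBLIOGRAPHIC = {"author", "date", "version"}

def docinfo_is_valid(children):
    """Check a list of child tag names against the content model
    ((bibliographic)+, abstract?) | abstract
    """
    if not children:
        return False  # "May not be empty."
    # An abstract is allowed only once, and only in last position.
    body = children[:-1] if children[-1] == "abstract" else children
    return all(tag in BIBLIOGRAPHIC for tag in body)
```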

> I don't know the Docbook DTD, and don't *quite* see why it is
> obviously bad to have "title" in two places...

Docbook allows duplicate titles (& subtitles)::

    <book>
        <title>
            Title text #1
        <bookinfo>
            <title>
                Title text #2

Makes one ask, "What's the difference between the titles? Which one do
you render?"

Being a multipurpose DTD, Docbook is very loose. And big. Sometimes
too loose. And too big. The users of Docbook have to limit themselves
to a strict subset otherwise it soon becomes unmanageable.

> Hmm - does that mean that we should add "Mode" or "Style" as an
> extra such bibliographic field name

No. Please, no!

"Mode" or "Style" should not be set from within the document. Rather,
they should be chosen when processing the document, either statically
(set by "pysource2html" script), as command-line options ("--mode book
--output html"), or understood from context somehow.

>     :Mode: HTML
>     :Style: Article

If something like this ever really becomes necessary, the way to do it
would be with a directive, not a field list.

-- 
David Goodger    goodger@users.sourceforge.net    Open-source projects:
 - Python Docstring Processing System: http://docstring.sourceforge.net
 - reStructuredText: http://structuredtext.sourceforge.net
 - The Go Tools Project: http://gotools.sourceforge.net


From goodger@users.sourceforge.net  Wed Sep 26 00:20:49 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Tue, 25 Sep 2001 19:20:49 -0400
Subject: [Doc-SIG] URI schemes (was Re: [Docstring-develop] DPS -
 possible bugs/features)
In-Reply-To: <008501c145aa$70fed3f0$f05aa8c0@lslp7o.int.lsl.co.uk>
Message-ID: <B7D68990.17E1D%goodger@users.sourceforge.net>

Thanks for the references.

Tony J Ibbs (Tibs) wrote:
> The third way (not that I'm recommending *it*, either), is to
> identify a "common subset" of schemes that we recognise. ...
> 
> A fourth way (again, not necessarily one I'm advocating - but it's
> only moderately yucky) would be to say "these 'common' schemes are
> recognised as-is/inline, but if you want an 'odd' scheme, you need
> to delimit your uri" - in the context of reST, I guess that would
> mean something like::
> 
>     :uri:`strange-scheme:hum-ti-hum`

These are certainly worth further consideration. Although it's easy
for me to glibly say, "If you want something like 'a:b' in your text,
use inline literals as in ``a:b``", it would be better to avoid such
accidents in the first place. The "signal:noise" example always comes
to mind. An explicit "uri" role would be useful to have as an example,
as well.

-- 
David Goodger    goodger@users.sourceforge.net    Open-source projects:
 - Python Docstring Processing System: http://docstring.sourceforge.net
 - reStructuredText: http://structuredtext.sourceforge.net
 - The Go Tools Project: http://gotools.sourceforge.net


From goodger@users.sourceforge.net  Wed Sep 26 00:22:27 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Tue, 25 Sep 2001 19:22:27 -0400
Subject: [Doc-SIG] Re: Document titles (was RE: [Docstring-develop]
 DPS - possible bugs/features)
In-Reply-To: <008601c145aa$74af31c0$f05aa8c0@lslp7o.int.lsl.co.uk>
Message-ID: <B7D689F2.17E1E%goodger@users.sourceforge.net>

Your examples of post-parse transformations are good. I've begun
writing them up in the spec.

Tony J Ibbs (Tibs) wrote:
> * Generating a document title from a title within the
>   document
> 
>   (my HTML stuff actually looks first for a title on
>   the "top" element of the tree, then for a title in
>   the first child - I imagine that's a sensible
>   algorithm if you *must* have a title)

But what if there's a second child with a title (i.e. multiple sibling
top-level sections)? The first section's title may not be a good or
appropriate choice for the document title, especially if it's
"Introduction".
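
To make the concern concrete, here is a sketch of the lookup as
described, again using nested dictionaries as a stand-in for the real
tree (the keys "tag", "text" and "children" are assumptions)::

```python
def own_title(node):
    """Return the text of a direct 'title' child, or None."""
    for child in node.get("children", []):
        if child["tag"] == "title":
            return child.get("text")
    return None

def find_title(document):
    """Look for a title on the top element, then on its first child."""
    title = own_title(document)
    if title is None and document.get("children"):
        title = own_title(document["children"][0])
    return title

# Two sibling top-level sections and no document title.
doc = {"tag": "document", "children": [
    {"tag": "section", "children": [
        {"tag": "title", "text": "Introduction"}]},
    {"tag": "section", "children": [
        {"tag": "title", "text": "Details"}]},
]}
```

For this document find_title(doc) returns "Introduction", which is
exactly the questionable case.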

-- 
David Goodger    goodger@users.sourceforge.net    Open-source projects:
 - Python Docstring Processing System: http://docstring.sourceforge.net
 - reStructuredText: http://structuredtext.sourceforge.net
 - The Go Tools Project: http://gotools.sourceforge.net


From tony@lsl.co.uk  Wed Sep 26 10:20:14 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Wed, 26 Sep 2001 10:20:14 +0100
Subject: [Doc-SIG] Re: Document titles (was RE: [Docstring-develop] DPS -
 possible bugs/features)
In-Reply-To: <B7D689F2.17E1E%goodger@users.sourceforge.net>
Message-ID: <008e01c1466c$735afe40$f05aa8c0@lslp7o.int.lsl.co.uk>

David Goodger wrote:
> Your examples of post-parse transformations are good.
> I've begun writing them up in the spec.

<fx: blush>

> Tony J Ibbs (Tibs) wrote:
> >   (my HTML stuff actually looks first for a title on
> >   the "top" element of the tree, then for a title in
> >   the first child - I imagine that's a sensible
> >   algorithm if you *must* have a title)
>
> But what if there's a second child with a title (i.e. multiple sibling
> top-level sections)? The first section's title may not be a good or
> appropriate choice for the document title, especially if it's
> "Introduction".

Oh, indeed - it's not meant to be a *perfect* scheme, just one that
works for most documents (since *most* documents should have a title,
anyway), in the absence of my knowing anything better. My thinking, *for
HTML*, was that it was better to get a title than not (and for once I'm
assuming the author might look at what they've produced and amend it if
needs be!).

Also, whilst it is *conventional* to only have one <h1> within a
document, it is not a requirement, so one *does* get documents that have
multiple <h1> elements. We have no direct way (with my code) of
specifying that, and I don't aim to worry about it...

Tibs

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
Well we're safe now....thank God we're in a bowling alley.
- Big Bob (J.T. Walsh) in "Pleasantville"
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)


From tony@lsl.co.uk  Wed Sep 26 10:20:18 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Wed, 26 Sep 2001 10:20:18 +0100
Subject: [Doc-SIG] Re: Grump about field lists
In-Reply-To: <B7D68948.17E1C%goodger@users.sourceforge.net>
Message-ID: <008f01c1466c$756721f0$f05aa8c0@lslp7o.int.lsl.co.uk>

David Goodger wrote:
> OK, done. The element is 'docinfo'. It satisfies
> the desire for an explicit element hierarchy
> (encapsulation).

Thanks.

> > Hmm - does that mean that we should add "Mode" or "Style" as an
> > extra such bibliographic field name
>
> No. Please, no!

.. _above:
That's OK - there's a reason I'm happy to not be designing this myself!

> If something like this ever really becomes necessary, the way to do it
> would be with a directive, not a field list.

A good point. See above_.

Tibs

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
Well we're safe now....thank God we're in a bowling alley.
- Big Bob (J.T. Walsh) in "Pleasantville"
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)


From Paul.Moore@atosorigin.com  Wed Sep 26 10:21:36 2001
From: Paul.Moore@atosorigin.com (Moore, Paul)
Date: Wed, 26 Sep 2001 10:21:36 +0100
Subject: [Doc-SIG] Re: Document titles (was RE: [Docstring-develop]  D
 PS - possible bugs/features)
Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5B02A@UKRUX002.rundc.uk.origin-it.com>

From: David Goodger [mailto:goodger@users.sourceforge.net]
> > 1. The parser makes one up (yuck)
> > 2. The application makes one up (yuck)
> > 3. An error is generated (yuck)
> > 
> > I'd vote for 3 ...
> 
> Probably me too, in those cases where it's necessary. If the writer
> doesn't need a document title, it need not complain.

As an alternative, let's take a higher level view. Someone, somewhere, is
going to be running this stuff via command-line wrapper applications. In
that context, the obvious approach would be to have a command-line option
which specifies the document title, with "the only" section header being the
default, and with an error generated if there are no or too many sections.
So you have

    rest2html --title "My Document" doc.rest >doc.html

That seems logical to me. How it gets propagated down to the guts is for you
guys to work out :-)
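
A sketch of that rule, assuming the wrapper has already collected the
top-level section titles into a list. The "rest2html" program and its
"--title" option are the hypothetical ones from the example above, and
the function names are invented for the illustration:

```python
import argparse

def build_parser():
    """Command line for the hypothetical rest2html wrapper."""
    parser = argparse.ArgumentParser(prog="rest2html")
    parser.add_argument("--title", help="document title (overrides the default)")
    parser.add_argument("source", help="input file")
    return parser

def resolve_title(cli_title, section_titles):
    """An explicit --title wins; otherwise "the only" section header
    is the default; anything else is an error."""
    if cli_title is not None:
        return cli_title
    if len(section_titles) == 1:
        return section_titles[0]
    raise ValueError(
        "cannot choose a document title from %d top-level sections; "
        "use --title" % len(section_titles))
```

The parser and the guts never have to make a title up: either the
writer supplied one, or there is exactly one candidate, or the wrapper
complains.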

Paul.


From goodger@users.sourceforge.net  Thu Sep 27 03:01:13 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Wed, 26 Sep 2001 22:01:13 -0400
Subject: [Doc-SIG] Re: [Docstring-develop] pydps futures (was RE: Document titles)
In-Reply-To: <009001c14671$2cd3eea0$f05aa8c0@lslp7o.int.lsl.co.uk>
Message-ID: <B7D800A8.17F51%goodger@users.sourceforge.net>

Tony J Ibbs (Tibs) wrote:
> Combs
> -----
> The metaphor of combing through hair to remove tangles is a bit
> iffy, but I like the term (I think David calls them Filters, which
> is less obvious to me)

Strictly speaking, a filter reduces the amount of information. Rather
than "combs", I'd use the common term: "transforms".

Tony's made a good start listing the transforms we'll need. Looking at
them and thinking of others, we will need to identify which transforms
are parser-specific and which are generic. The "bibliographic field
list" transform is obviously parser-specific, but must be run after
the "section promotion" transform (what Tony calls "TitleComb").

>   David - a question or two on this. Each autonumbered
>   footnote/footnote reference has the attribute 'auto'
>   set to "1". I want to *insert* actual footnote numbers
>   into the tree.

Auto-numbered footnotes have attribute ``auto=1`` and no label.
Auto-numbered footnote_references have no reference text (they're
empty elements). If you resolve the numbering, just add a label
element to the beginning of the footnote, and reference text to the
footnote_reference. Take this input::

    References to the first ([A]_), third ([#spam]_), and second
    ([#]_) footnotes.

    .. [A] This footnote is labeled with "A".
    .. [#] This footnote is auto-numbered.
    .. [#spam] This footnote has autonumber name "spam".

Parsed (watch the indentation to spot the empty footnote_reference elements)::

    <document>
        <paragraph>
            References to the first (
            <footnote_reference refname="a">
                A
            ), third (
            <footnote_reference auto="1" refname="spam">
            ), and second
            (
            <footnote_reference auto="1">
            ) footnotes.
        <footnote name="a">
            <label>
                A
            <paragraph>
                This footnote is labeled with "A".
        <footnote auto="1">
            <paragraph>
                This footnote is auto-numbered.
        <footnote auto="1" name="spam">
            <paragraph>
                This footnote has autonumber name "spam".

Only the first footnote_reference contains reference text. After
auto-numbering resolution, the tree should become::

    <document>
        <paragraph>
            References to the first (
            <footnote_reference refname="a">
                A
            ), third (
            <footnote_reference auto="1" refname="spam">
                2
            ), and second
            (
            <footnote_reference auto="1" refname="_footnote 1">
                1
            ) footnotes.
        <footnote name="a">
            <label>
                A
            <paragraph>
                This footnote is labeled with "A".
        <footnote auto="1" name="_footnote 1">
            <label>
                1
            <paragraph>
                This footnote is auto-numbered.
        <footnote auto="1" name="spam">
            <label>
                2
            <paragraph>
                This footnote has autonumber name "spam".

The labels and reference text are added to the two auto-numbered
footnotes & footnote_references. The unnamed auto-numbered footnote
& reference need name & refname attributes. Let's use "_footnote " +
footnote number for those attributes (a name-mangling unlikely to
occur in the real world; note that this hasn't been documented yet).
Of course (!), the implicitlinks and refnames instance attributes of
the dps.nodes.document instance must be updated. (It will soon be my
pleasure to document the dps/nodes.py data structure, since I'm
gradually forgetting its details.)

After adding labels and reference text, the "auto" attributes can be
ignored.
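
A minimal sketch of the resolution pass, assuming footnotes and
footnote_references have been collected into two flat lists of
dictionaries in document order (a simplification; the real data
structure is in dps/nodes.py) and that no manually numbered footnotes
compete for the same numbers::

```python
def resolve_autonumbers(footnotes, references):
    """Add labels, names and reference text for auto-numbered
    footnotes. Both lists are modified in place."""
    number = 0
    labels = {}      # footnote name -> label text
    anonymous = []   # mangled names of unnamed auto footnotes, in order
    for footnote in footnotes:
        if not footnote.get("auto"):
            continue
        number += 1
        footnote["label"] = str(number)
        if "name" not in footnote:
            # Name mangling as proposed: "_footnote " + number.
            footnote["name"] = "_footnote %d" % number
            anonymous.append(footnote["name"])
        labels[footnote["name"]] = footnote["label"]

    pending = iter(anonymous)
    for ref in references:
        if not ref.get("auto"):
            continue
        if "refname" not in ref:
            # Unnamed references pair up with unnamed footnotes in order.
            ref["refname"] = next(pending)
        ref["text"] = labels[ref["refname"]]
```

Applied to the example above, the unnamed footnote becomes
"_footnote 1" with label "1", "spam" gets label "2", and the two empty
references receive "2" and "1" respectively. The sketch does not
update the implicitlinks and refnames attributes of the document, and
it stops with StopIteration if there are more unnamed references than
unnamed footnotes.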

-- 
David Goodger    goodger@users.sourceforge.net    Open-source projects:
 - Python Docstring Processing System: http://docstring.sourceforge.net
 - reStructuredText: http://structuredtext.sourceforge.net
 - The Go Tools Project: http://gotools.sourceforge.net


From goodger@users.sourceforge.net  Thu Sep 27 03:03:02 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Wed, 26 Sep 2001 22:03:02 -0400
Subject: [Doc-SIG] Re: [Docstring-develop] pydps - new version uploaded
In-Reply-To: <009201c14679$2fef8ce0$f05aa8c0@lslp7o.int.lsl.co.uk>
Message-ID: <B7D80115.17F52%goodger@users.sourceforge.net>

Tony J Ibbs (Tibs) wrote:
> http://www.tibsnjoan.co.uk/reST/pydps.tgz
...
> The package now contains a README.txt file at David's request

Thanks!

> (yes, it's a text file written in reST, despite my grumbles at David
> the other day about file extensions).

I think ".rtxt" is the winner, by the way.

> All this means I can now turn
> restructuredtext/spec/reStructuredText.txt into HTML and produce
> something that I find easier to read. Yeh.

I ran pydps.py on reStructuredText.txt and took a look. Wow! Very cool
to see the text come to life.

I had some trouble running it over some Python modules though
(tracebacks sent separately). I didn't see any unit test code in the
distro...

-- 
David Goodger    goodger@users.sourceforge.net    Open-source projects:
 - Python Docstring Processing System: http://docstring.sourceforge.net
 - reStructuredText: http://structuredtext.sourceforge.net
 - The Go Tools Project: http://gotools.sourceforge.net


From tony@lsl.co.uk  Thu Sep 27 10:09:02 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Thu, 27 Sep 2001 10:09:02 +0100
Subject: [Doc-SIG] Re: [Docstring-develop] pydps - new version uploaded
In-Reply-To: <B7D80115.17F52%goodger@users.sourceforge.net>
Message-ID: <009b01c14734$0d290c70$f05aa8c0@lslp7o.int.lsl.co.uk>

David Goodger wrote (in response to me, earlier):
> > (yes, it's a text file written in reST, despite my grumbles at David
> > the other day about file extensions).
>
> I think ".rtxt" is the winner, by the way.

Yes, I think so - it *looks* enough like ".txt", and is short enough.

> I ran pydps.py on reStructuredText.txt and took a look. Wow! Very cool
> to see the text come to life.

I get a thrill from it too! And I know how little it's doing... (it
makes it a lot easier to *read* the documentation, as well, I find).

> I had some trouble running it over some Python modules though
> (tracebacks sent separately).

Thanks. All such are good grist for the mill.

> I didn't see any unit test code in the distro...

I know. I still need to get to grips with such matters - I *like* it
when I can test stuff before writing it (so to speak) - but
unfortunately the tests to write are rather scary, since I find it hard
to think of *small* things that need testing.

It's a bullet that shall have to be bitten, though, at some stage,
because I assume that code won't be allowed into docutils (as will be)
until it can test itself.

Tibs

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
.. "equal" really means "in some sense the same, but maybe not
.. the sense you were hoping for", or, more succinctly, "is
.. confused with". (Gordon McMillan, Python list, Apr 1998)
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)


From tony@lsl.co.uk  Thu Sep 27 10:09:04 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Thu, 27 Sep 2001 10:09:04 +0100
Subject: [Doc-SIG] Re: [Docstring-develop] pydps futures (was RE: Document
 titles)
In-Reply-To: <B7D800A8.17F51%goodger@users.sourceforge.net>
Message-ID: <009c01c14734$0e592800$f05aa8c0@lslp7o.int.lsl.co.uk>

David Goodger wrote:
> Strictly speaking, a filter reduces the amount of information. Rather
> than "combs", I'd use the common term: "transforms".

I suspected that "comb" wouldn't last (well, except maybe in code).

But then, I never managed to get the term "thing" accepted in the
geographic information standards community (which has too many meanings
for each of the terms entity, feature and object, and desperately needs
something else!).

> Tony's made a good start listing the transforms we'll need.

And, as I discovered reading the latest version of the
reStructuredText.txt document last night, David has already documented
some as well.

> Auto-numbered footnotes have attribute ``auto=1`` and no label.

Yes.

> Auto-numbered footnote_references have no reference text (they're
> empty elements). If you resolve the numbering, just add a label
> element to the beginning of the footnote, and reference text to the
> footnote_reference.

...example snipped...

Hah - yes, in the context of a tree transformation, the obvious thing to
do *is* a tree transformation - I'd hope I'd have realised that when
going from the HTML producing code to a tree manipulation algorithm.
Hah.
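
A minimal sketch of that transformation, as I read David's description.
The ``Node`` class here is a hypothetical stand-in and not the real
dps/nodes.py structure, and pairing references with footnotes purely by
document order is my assumption:

```python
# Hypothetical minimal node type; NOT the real dps/nodes.py API.
class Node:
    def __init__(self, tag, text="", children=None, **attrs):
        self.tag = tag
        self.text = text
        self.children = list(children or [])
        self.attrs = attrs

    def walk(self):
        """Yield this node and all its descendants in document order."""
        yield self
        for child in self.children:
            yield from child.walk()


def resolve_auto_footnotes(document):
    """Number auto footnotes and their references in document order."""
    footnotes = [n for n in document.walk()
                 if n.tag == "footnote" and n.attrs.get("auto")]
    references = [n for n in document.walk()
                  if n.tag == "footnote_reference" and n.attrs.get("auto")]
    for number, footnote in enumerate(footnotes, 1):
        # Add a label element to the beginning of the footnote.
        footnote.children.insert(0, Node("label", text=str(number)))
    for number, reference in enumerate(references, 1):
        # Give the (previously empty) reference its reference text.
        reference.text = str(number)
    return document
```

The "auto" attributes are left in place, so nothing is lost if we later
want to go back to reST.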

> Let's use "_footnote " + footnote number for those attributes
> (a name-mangling unlikely to occur in the real world; note that
> this hasn't been documented yet).

Hmm - I wasn't assuming that unnamed autonumbered footnotes would have
any name other than the number (since you're following the XML tradition
and storing such things as strings anyway).

> (It will soon be my pleasure to document the dps/nodes.py data
> structure, since I'm gradually forgetting its details.)

Whereas I'm gradually discovering them - good, that will save me a job
(I shall, however, read the documentation as you produce it and
comment).

> After adding labels and reference text, the "auto" attributes can be
> ignored.

Or even removed? or no, that wouldn't allow you to backtransform into
reST again...

Tibs

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
Well we're safe now....thank God we're in a bowling alley.
- Big Bob (J.T. Walsh) in "Pleasantville"
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)


From fdrake@acm.org  Thu Sep 27 22:09:33 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 27 Sep 2001 17:09:33 -0400
Subject: [Doc-SIG] Python 2.2a4 docs frozen
Message-ID: <15283.38285.231035.778383@grendel.zope.com>

  The documentation for Python 2.2a4 is now frozen; the HTML packages
have been pushed to the server and the online version is available on
python.org from the Python 2.2 pages.  SourceForge will be updated
momentarily.
  The trunk is *not* frozen; updates can continue there.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From fdrake@acm.org  Thu Sep 27 22:18:17 2001
From: fdrake@acm.org (Fred L. Drake)
Date: Thu, 27 Sep 2001 17:18:17 -0400 (EDT)
Subject: [Doc-SIG] [development doc updates]
Message-ID: <20010927211817.5C90128694@cj42289-a.reston1.va.home.com>

The development version of the documentation has been updated:

    http://python.sourceforge.net/devel-docs/

Documentation as released for Python 2.2 alpha 4.


From goodger@users.sourceforge.net  Fri Sep 28 04:45:25 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Thu, 27 Sep 2001 23:45:25 -0400
Subject: [Doc-SIG] examples in reStructuredText spec
Message-ID: <B7D96A94.17FFE%goodger@users.sourceforge.net>

Call for opinions re:

    http://structuredtext.sourceforge.net/spec/reStructuredText.txt

- Are the "syntax diagrams" useful?

- Would "before & after" examples be useful (also/instead)? For example::

      Example input::

          A Title
          =======
          Paragraph.

      Parsed::

          <document>
              <section name="a title">
                  A Title
              <paragraph>
                  Paragraph.

  Examples such as this would extend the spec's length quite a bit (they
  could be put into another file, examples.txt). However, I think they
  would help to document the DPS document tree structure. I would include
  a section explaining the indented pseudo-XML notation used to represent
  the parsed document tree.
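
  To make the notation concrete, here is a rough sketch of how such
  indented pseudo-XML could be generated. The tuple-based tree
  (``(tag, attributes, children)``, with plain strings as text) is only
  an illustrative assumption, not the DPS node classes:

```python
def pseudo_xml(node, indent=0):
    """Return the lines of an indented pseudo-XML rendering of node.

    A node is either a string (text) or a (tag, attrs, children) tuple.
    Only opening tags are shown; nesting is expressed by indentation.
    """
    pad = " " * indent
    if isinstance(node, str):
        return [pad + node]
    tag, attrs, children = node
    attr_text = "".join(' %s="%s"' % (key, value)
                        for key, value in attrs.items())
    lines = [pad + "<" + tag + attr_text + ">"]
    for child in children:
        lines.extend(pseudo_xml(child, indent + 4))
    return lines


# The tree from the example above, written out by hand.
tree = ("document", {}, [
    ("section", {"name": "a title"}, ["A Title"]),
    ("paragraph", {}, ["Paragraph."]),
])
print("\n".join(pseudo_xml(tree)))
```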

-- 
David Goodger    goodger@users.sourceforge.net    Open-source projects:
 - Python Docstring Processing System: http://docstring.sourceforge.net
 - reStructuredText: http://structuredtext.sourceforge.net
 - The Go Tools Project: http://gotools.sourceforge.net


From goodger@users.sourceforge.net  Fri Sep 28 04:55:49 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Thu, 27 Sep 2001 23:55:49 -0400
Subject: [Doc-SIG] auto-numbered footnote resolution
In-Reply-To: <009c01c14734$0e592800$f05aa8c0@lslp7o.int.lsl.co.uk>
Message-ID: <B7D96D04.17FFF%goodger@users.sourceforge.net>

[David]
> Let's use "_footnote " + footnote number for those attributes
> (a name-mangling unlikely to occur in the real world; note that
> this hasn't been documented yet).

[Tony]
> Hmm - I wasn't assuming that unnamed autonumbered footnotes would have any
> name other than the number (since you're following the XML tradition and
> storing such things as strings anyway).

I suppose we could go either way. I'm re-examining (for validity) what I
wrote in the spec:

    Automatic footnote numbering may not be mixed with manual footnote
    numbering; it would cause numbering and referencing conflicts.

Would such mixing inevitably cause conflicts? We could probably work around
potential conflicts with a decent algorithm. Should we? Requires thought.
Opinions?
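
One possible workaround, purely as a sketch and not something the spec
defines: give auto-numbered footnotes the lowest numbers not already
claimed by manually numbered ones. The function name and its arguments
are made up for illustration:

```python
def assign_auto_numbers(manual_numbers, auto_count):
    """Return numbers for auto footnotes, skipping manually used ones."""
    used = set(manual_numbers)
    assigned = []
    candidate = 1
    while len(assigned) < auto_count:
        if candidate not in used:
            assigned.append(candidate)
        candidate += 1
    return assigned


# Manual footnotes [2] and [5] exist; three auto footnotes need numbers.
print(assign_auto_numbers([2, 5], 3))
```

Whether the resulting out-of-order numbering would confuse readers is a
separate question.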

-- 
David Goodger    goodger@users.sourceforge.net    Open-source projects:
 - Python Docstring Processing System: http://docstring.sourceforge.net
 - reStructuredText: http://structuredtext.sourceforge.net
 - The Go Tools Project: http://gotools.sourceforge.net


From tony@lsl.co.uk  Fri Sep 28 10:24:27 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Fri, 28 Sep 2001 10:24:27 +0100
Subject: [Doc-SIG] auto-numbered footnote resolution
In-Reply-To: <B7D96D04.17FFF%goodger@users.sourceforge.net>
Message-ID: <00aa01c147ff$5e92b4f0$f05aa8c0@lslp7o.int.lsl.co.uk>

David Goodger wrote:
> I'm re-examining (for validity) what I wrote in the spec:
>
>  Automatic footnote numbering may not be mixed with manual footnote
>  numbering; it would cause numbering and referencing conflicts.
>
> Would such mixing inevitably cause conflicts? We could
> probably work around potential conflicts with a decent
> algorithm. Should we?

Well, I read that paragraph in the documentation, and decided that it
was in the category of "don't, in practice, care" so far as I was
concerned. This is the same category I put the forbidding of nested
inline markup - quite clearly one *can* do it, but equally clearly it's
a pain to implement, and not a terribly great gain, all things
considered.

It's a category with the subtext "examine for correctness after we've
had some experience of people *using* reST in the wild".

Thus, given there are lots of other things to do, I would tend to leave
it as-is (especially if you are able to *warn* people about it if they
do it by mistake).

To my mind, being able to do ``[#thing]_`` probably gives people enough
precision over footnotes whilst still allowing autonumbering - the *only*
potential problem is when referring to a footnote in a different
document (and that, again, is something I would leave fallow for the
moment, although we know I tend to want to use roles as annotation for
that sort of thing).

Tibs

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
2 wheels good + 2 wheels good = 4 wheels good?
3 wheels good + 2 wheels good = 5 wheels better?
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)


From tony@lsl.co.uk  Fri Sep 28 10:24:28 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Fri, 28 Sep 2001 10:24:28 +0100
Subject: [Doc-SIG] examples in reStructuredText spec
In-Reply-To: <B7D96A94.17FFE%goodger@users.sourceforge.net>
Message-ID: <00ab01c147ff$5f68a290$f05aa8c0@lslp7o.int.lsl.co.uk>

David Goodger wrote:
> Call for opinions re:
>
>     http://structuredtext.sourceforge.net/spec/reStructuredText.txt
>
> - Are the "syntax diagrams" useful?

I don't think I use them particularly (and, of course, I find the first,
big, one to be rather confusing) - but I also am probably not the best
authority on the usefulness of such things (I've been accused in the
past of a, well, odd take on how language works).

> - Would "before & after" examples be useful (also/instead)?
>   Examples such as this would extend the spec's length quite
>   a bit (they could be put into another file, examples.txt).

Ah, well, as to that.

As yet, pydps has nothing in the way of selftest. By preference, I like
doctest as a means of testing stuff. But, aha, reST documents can
*embed* doctest blocks, just like docstrings can. So I was intending to
add a "semi-literate error testing" ability to pydps.py, allowing it
to read in a .rtxt file and run doctest over it (either over the whole
thing, allowing doctest to detect the blocks of interest, or more likely
just over the doctest blocks themselves).
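
Roughly what I have in mind, as a sketch only: it takes the "whole
thing" route and lets doctest find the interactive examples itself. The
function name is invented, and nothing like this exists in pydps.py yet;
the doctest classes used are the standard library ones:

```python
import doctest


def run_doctest_blocks(text, name="document"):
    """Run every interactive example in text; return (failed, attempted)."""
    parser = doctest.DocTestParser()
    # Collect all ">>>" examples in the text into a single DocTest.
    test = parser.get_doctest(text, globs={}, name=name,
                              filename=None, lineno=0)
    runner = doctest.DocTestRunner(verbose=False)
    results = runner.run(test)
    return results.failed, results.attempted


sample = """
Some prose about addition.

>>> 1 + 1
2

More prose.

>>> "re" + "ST"
'reST'
"""
print(run_doctest_blocks(sample))
```

For a file on disk, the standard ``doctest.testfile`` function does much
the same job.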

*Now*, if we have a reStructuredTextExamples.rtxt file (well, called
something shorter!) containing examples of text and result, I don't see
why we shouldn't leverage off the same sort of thing.

This would also allow us to discuss abstruse corners of the parser,
things to avoid or know about, whilst testing that they do indeed work
as "intended".

However, if this is intended as examples for human use, it *might* be
better to have a directive defined that simply takes two parts (David,
you already know I'm bad at directive design, so please feel free to
read the *intent*, not the words!), or perhaps better two (paired)
directives - for instance::

    .. Example::
       This is some *emphasised* text.
    .. Gives::
       <paragraph>
          This is some
          <emphasis>
             emphasised
          text.

I think *this* could be a very valuable thing.

Tibs

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
Give a pedant an inch and they'll take 25.4mm
(once they've established you're talking a post-1959 inch, of course)
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)


From fdrake@acm.org  Fri Sep 28 23:03:18 2001
From: fdrake@acm.org (Fred L. Drake)
Date: Fri, 28 Sep 2001 18:03:18 -0400 (EDT)
Subject: [Doc-SIG] [development doc updates]
Message-ID: <20010928220318.14E2C28697@cj42289-a.reston1.va.home.com>

The development version of the documentation has been updated:

    http://python.sourceforge.net/devel-docs/

Various small adjustments and bug fixes.
Added preliminary docs for the SimpleXMLRPCServer module.