[Doc-SIG] DPS components

Thu, 20 Sep 2001 10:28:01 +0100

David Goodger wrote:
> I forgot all about the fifth component: "filers" (formerly "output
> management").

Hmm - I don't think I'd bother to separate them out at all, personally.

> [Tony]
> > Of course, the "reader" for plain .rst files is *ever so* simple!
>
> It is the simplest reader, but it still has to do some work, like
> hyperlink resolution & footnote numbering. On my to-do-soon list.

As Garth obviously did too, I was assuming that the "reader" simply
read in the data and then passed (the DPS bits of it) on to the parser
to make sense of - i.e.::

   +--------+         +------------+      +--------+
   | reader | ------> | synthesist | ---> | writer |
   +--------+ \     > +------------+      +--------+
               \   /
                v /
            +--------+
            | parser |
            +--------+

That's why I said the reader was so simple for plain DPS texts - it has
nothing to do but actually read in the data and pass it to the parser -
nothing goes along that top route in that model.
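
To make that model concrete, here is a minimal sketch in Python. All of
the names are made up for illustration (none of them come from the DPS
code), and the "parser" is a toy that only splits paragraphs. The point
is just the flow: the reader does nothing but read and hand over to the
parser.

```python
import io

# Hypothetical stand-ins for the four components; none of these
# names are taken from the actual DPS code.

def parse(text):
    """Toy parser: one ("paragraph", body) node per non-blank block."""
    return [("paragraph", block.strip())
            for block in text.split("\n\n") if block.strip()]

def synthesise(nodes):
    """Toy synthesist: number the paragraphs."""
    return [(kind, "%d. %s" % (i, body))
            for i, (kind, body) in enumerate(nodes, 1)]

def write(nodes):
    """Toy writer: render the node list as plain text."""
    return "\n".join(body for kind, body in nodes)

def read(stream, parser):
    """Reader for plain texts: read the data, pass it to the parser."""
    return parser(stream.read())

# reader -> parser -> synthesist -> writer
document = synthesise(read(io.StringIO("First.\n\nSecond."), parse))
output = write(document)
```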

Your diagram::

>
>            +--------+                +-------+
>            | reader | -------------> | filer |
>            +--------+                +-------+
>              /   \                       |
>             /     \                      |
>            /       \                     |
>     +--------+    +------------+     +--------+
>     | parser |    | synthesist |     | writer |
>     +--------+    +------------+     +--------+

is a bit different in form, of course - the "reader" is more powerful.
On the whole, apart from the fact I'd elide the "filer", I think your
diagram is better.

> Having just scribbled a diagram [now above], I think the
> "synthesist" (transformer/designer) is so tightly coupled to the
> reader that it becomes an internal implementation detail::

Hmm. Maybe. But (unless your parser is going to do it all for me) don't
forget that there are still transforms to be done on the DPS node tree
for "pure" reST texts - handling references, for instance. So either we
need to be able to pass down multiple synthesisers, or (more likely) one
synthesiser needs to be able to invoke others.
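
The "one synthesiser invokes others" option might look something like
the sketch below. This is only an illustration under my own
assumptions: the class names are hypothetical, and the node tree is
flattened to a list of dictionaries to keep the example short.

```python
class Synthesiser:
    """Hypothetical base class: transform a node list and return it."""

    def transform(self, nodes):
        return nodes


class ResolveReferences(Synthesiser):
    """Mark each reference as resolved if a target of that name exists."""

    def transform(self, nodes):
        targets = {n["name"] for n in nodes if n["type"] == "target"}
        for node in nodes:
            if node["type"] == "reference":
                node["resolved"] = node["name"] in targets
        return nodes


class NumberFootnotes(Synthesiser):
    """Number the footnotes in document order, starting at 1."""

    def transform(self, nodes):
        number = 0
        for node in nodes:
            if node["type"] == "footnote":
                number += 1
                node["number"] = number
        return nodes


class Chain(Synthesiser):
    """A synthesiser that simply invokes other synthesisers in order."""

    def __init__(self, *steps):
        self.steps = steps

    def transform(self, nodes):
        for step in self.steps:
            nodes = step.transform(nodes)
        return nodes
```

The reader would then only ever be handed a single synthesiser, for
instance ``Chain(ResolveReferences(), NumberFootnotes())``.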

> I can see a variety of synthesists for Python source readers, but
> other input types (.rtxt file, PEP, email, etc.) won't need them.

OK - maybe you are doing all the work.

*But* what about a synthesist component to (for instance) generate a
contents list? In some cases that would be done via a directive (i.e.,
the author decides to "provide" it), but in other cases the person
transforming the document may decide that they want a contents
inserting. That sounds like a synthesis job to me, not a writer job...
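
As a rough sketch of what I mean by a contents synthesist (the function
name, the ``max_depth`` option and the dictionary-based nodes are all
assumptions for the sake of the example, not anything in dps/nodes.py):

```python
def insert_contents(nodes, max_depth=2):
    """Return a new node list with a generated contents node in front.

    Titles deeper than max_depth are left out of the contents list.
    The input list is not modified.
    """
    entries = [(node["level"], node["title"])
               for node in nodes
               if node["type"] == "title" and node["level"] <= max_depth]
    contents = {"type": "contents", "entries": entries}
    return [contents] + list(nodes)
```

The writer never needs to know whether the contents node came from a
directive in the text or from a transform like this one.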

> [Tony]
> > I assume that you are thus hinting that the output of the designer
> > should be "pure" DPS nodes - that is, using only DPS tree nodes that
> > are defined in dps/nodes.py, so that *any* DPS writer can be slotted
> > in.
>
> Yes, exactly. A generic document is produced and handed over to the
> filer & writer.

I've got the refactoring bug now, I'm afraid, 'cos that idea is becoming
increasingly attractive. I may give in and make the next thing I work on
a refactoring of pydps so that it *does* generate a "pure" DPS node tree
as the input to the writer. It would be a useful clarification of
concept, I think. And interesting to see how easy it is for me to still
generate my, erm, colourful HTML.

> > The example I would use to think about the flow through the system
> > would be that of a simple table in the quick reference, which could
> > use a directive::
> >
> >     .. quickreftable:: Directives
> >        :link: http://link-to-text
> >        ::
> >
> >          For instance:
> >
> >            .. graphic:: images/ball1.gif
>
> I would rewrite that as::
>
>     .. quickreftable:: Directives (http://link-to-text)
>
>        For instance:
>
>        .. image:: images/ball1.gif

Ah - but the above doesn't fail as gracefully (imagine if I want to be
able to have invalid constructs in my example, and the poor person
trying to format my text doesn't have the right plugin). Also, whilst I
thought about folding the link in as you've done, I didn't, to allow for
a link in parentheses to appear in the title, if I so wished (yep, being
awkward again). (In fact, my *first* writing of the directive was
essentially identical to yours.)

> > Now, given I want the table header to be in pale blue, with the word
> > "Directives" in strong italics, and I want the table body to be
> > split 50/50 between the two columns, with a pale yellow background,
> > *if* I'm outputting to HTML - how do I do that? Bearing in mind that
> > if I'm outputting to PDF, I want an entirely different set of
> > details.
>
> Style sheets would be useful for that. HTML has them, and a PDF
> generator might too. Or writers might have their own collections of
> style modules.

I thought I mentioned CSS, although there is still a serious requirement
to be able to cope with older browsers that don't support it. I had
imagined that one might have writers having a variety of options on how
to treat styles - some producing CSS directives, others embedded HTML,
others being very simple for use by the visually impaired, etc.
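
Something along these lines, perhaps. It is purely a sketch: the class,
the ``style`` option, the CSS class name and the colour value are all
invented for illustration.

```python
from html import escape


class HTMLWriter:
    """Hypothetical writer with a choice of how to treat styles."""

    STYLES = ("css", "embedded", "plain")

    def __init__(self, style="css"):
        if style not in self.STYLES:
            raise ValueError("unknown style: %r" % (style,))
        self.style = style

    def table_header(self, text):
        """Render one table header cell according to the chosen style."""
        text = escape(text)
        if self.style == "css":
            # Leave all presentation to an external style sheet.
            return '<th class="quickref-header">%s</th>' % text
        if self.style == "embedded":
            # For older browsers: pale blue, strong italics, inline.
            return ('<th bgcolor="#ccccff"><strong><em>%s</em></strong></th>'
                    % text)
        # "plain": no styling at all.
        return "<th>%s</th>" % text
```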

> [Garth]
> > Is the intent something along the lines of the following? ::
> >
> >     Writer.write(Transformer.transform(Parser.parse(Reader())))
>
> Perhaps more like::
>
>     Filer.file(Reader.read(inputref, Parser, Synthesist), Writer)

You're both *way* too concise! What's wrong with some well-named
intermediate variables, and some comments?

Given I still don't see the need for a separate filer (so I'll ignore it
for now), I would see that as being shown to the masses as more like::

    # Reader takes an input stream. An alternative
    # might be FileReader, which takes a filename...
    reader = docutils.pysource.Reader()
    parser = docutils.parser.reST()
    # As you say above, the synthesiser might be
    # "assumed" by the reader in the default case...
    synthesiser = docutils.pysource.Synthesiser()

    reader.language = "en"
    synthesiser.style = "fancy"
    synthesiser.use_tables = 0

    instream = open("c:/reST/example.rtxt")
    try:
        document = reader(instream, parser, synthesiser)
    finally:
        instream.close()

    writer = docutils.writer.HTML()
    # Output as a single file.
    writer(document, file="c:/HTML/example.html")

    # Output as a directory structure.
    # Split out pages at header level 2.
    # (a similar facility in a TeX writer would
    # allow us to do slides...)
    writer(document, directory="c:/HTML/example/",
           index="index.htm", splitlevel=2)

Ah - I've worked out why I don't see the need for Filer now.

Looking at your four examples, the first two (single file or multi-file)
are likely candidates for HTML output (for instance) - but if so, the
HTML writer needs to know what it is doing, it can't be left up to an
external "body" to do it (since the HTML writer needs to insert
appropriate links, generate an index file, etc., in the multiple file
case).
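
To illustrate why the splitting belongs inside the writer, here is a
minimal sketch of the multiple file case. The function names, the page
file naming and the dictionary-based nodes are assumptions of mine, not
an existing API.

```python
def split_pages(nodes, splitlevel=2):
    """Group nodes into pages, starting a new page at each title
    whose level equals splitlevel. Empty pages are dropped."""
    pages = [{"title": None, "nodes": []}]
    for node in nodes:
        if node["type"] == "title" and node["level"] == splitlevel:
            pages.append({"title": node["title"], "nodes": []})
        pages[-1]["nodes"].append(node)
    return [page for page in pages if page["nodes"]]


def index_links(pages):
    """Return (filename, link text) pairs for the index file."""
    return [("page%d.html" % i, page["title"] or "Front matter")
            for i, page in enumerate(pages)]
```

Only the code that decides where the page breaks fall can know which
file names the index (and any cross-page links) should point at.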

Similarly, it seems to me that the production of tree structures in
memory would be the result of a specific writer (for instance, a DOM
writer - hmm, odd concept). In fact, I'm not even sure I'd call that a
"Writer" - I think I'd think of it as an output synthesiser(!).

Anyway, regardless of details, I still think we're making Big Steps in
understanding the problem (and you might convince me about Filers yet, I
suppose!).

Tibs

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
Give a pedant an inch and they'll take 25.4mm
(once they've established you're talking a post-1959 inch, of course)
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)