[Doc-SIG] reStructuredText-to-HTML XSLT

David Goodger goodger@users.sourceforge.net
Wed, 17 Oct 2001 00:14:02 -0400


Thanks Alan. I'll put your XSLT file into the sandbox shortly. I'm falling
behind lately: I also have Paul Wright's XSLT file and Ueli Schlaepfer's
code to add to the projects. Sorry; I'll try to catch up quickly.

Alan Jaffray wrote:
> It does a pretty good job with the exception of tables (which I haven't
> attempted to handle) and some lingering gratuitous whitespace (which
> is carried over from rST's output, and can't be safely stripped).

I'm going to experiment with xml.dom.minidom to try to get better XML output
with respect to whitespace. The trouble is that newlines are interpreted as
spaces in mixed content, so I've got to figure out a way to avoid newlines
before and after inline markup. We may also have to give up newlines between
body elements, though.
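
To illustrate the mixed-content problem, here is a minimal sketch using
xml.dom.minidom. The element names ("paragraph", "emphasis") and the helper
text_of() are assumptions made up for the example, not the actual output
vocabulary.

```python
from xml.dom import minidom

# Hypothetical element names, chosen only for illustration.
SOURCE = "<paragraph>Some <emphasis>inline</emphasis> markup.</paragraph>"


def text_of(node):
    """Concatenate all text nodes below `node` (hypothetical helper)."""
    if node.nodeType == node.TEXT_NODE:
        return node.data
    return "".join(text_of(child) for child in node.childNodes)


root = minidom.parseString(SOURCE).documentElement

# toxml() writes the tree without adding any whitespace of its own.
compact = root.toxml()

# toprettyxml() puts each child of a mixed-content element on its own
# line, so newlines and indentation end up next to the inline markup.
pretty = root.toprettyxml(indent="  ")

# Re-parse both serializations and compare the character data.
compact_text = text_of(minidom.parseString(compact).documentElement)
pretty_text = text_of(minidom.parseString(pretty).documentElement)

print(repr(compact_text))
print(repr(pretty_text))
```

The pretty-printed form no longer round-trips to the same text, which is
why the indented output can't simply be cleaned up after the fact.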

> (It also leaves footnotes in the middle of the document; do we want them
> all at the end?  I suppose so.  Hadn't thought about it.)

I think there will be several choices, but the tree transforms should take
care of it.
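
As a rough sketch of one such choice, a transform could collect the
footnotes and re-append them at the end of the document. This uses
xml.dom.minidom only for illustration; the element names and the function
move_footnotes_to_end() are hypothetical, and a real tree transform may
look quite different.

```python
from xml.dom import minidom

# Hypothetical document structure, for illustration only.
SOURCE = (
    "<document>"
    "<paragraph>One</paragraph>"
    "<footnote>Note 1</footnote>"
    "<paragraph>Two</paragraph>"
    "<footnote>Note 2</footnote>"
    "</document>"
)


def move_footnotes_to_end(doc):
    """Move every <footnote> element to the end of the root element."""
    root = doc.documentElement
    # Copy the node list first, since the tree is modified while looping.
    footnotes = list(root.getElementsByTagName("footnote"))
    for footnote in footnotes:
        footnote.parentNode.removeChild(footnote)
        root.appendChild(footnote)  # document order of footnotes is kept
    return doc


doc = move_footnotes_to_end(minidom.parseString(SOURCE))
order = [child.tagName for child in doc.documentElement.childNodes]
print(order)
```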

> I'm not sure about the desired behavior for text which contains characters
> which have to be escaped in the XML.  Leave them escaped,

Yes.

> or unescape them
> so the user can intersperse HTML markup with rST markup?

That way lies markup-dependence.

> I chose the worst
> of both worlds by co-opting the "interpreted" tag to mean "contains HTML
> markup, should not be escaped when output to HTML".  Ideas welcome.

Tony J Ibbs (Tibs) wrote:
> The *sole* case I might give in on would be if the user had access to an
> appropriate role - for instance::
> 
>     :html:`<hr>`
>     :xml:`<myfavouritetag andits="data" />`

Tony's suggestion is the way to go, for arbitrary markup.
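
A minimal sketch of how that could work on the output side: text is escaped
by default, and passed through untouched only when the author has asked for
it with an explicit role matching the output format. The function
render_inline() and its parameters are hypothetical, and escaping text whose
role names a different format is just one possible choice.

```python
from xml.sax.saxutils import escape


def render_inline(text, role=None, output_format="html"):
    """Hypothetical writer hook for inline text.

    Text is escaped unless `role` explicitly names the current output
    format, in which case it is emitted as raw markup.
    """
    if role == output_format:
        return text  # author opted in, e.g. :html:`<hr>`
    return escape(text)  # default: leave special characters escaped


plain = render_inline("<hr> & more")
raw = render_inline("<hr>", role="html")
other = render_inline('<myfavouritetag andits="data" />', role="xml")

print(plain)
print(raw)
print(other)
```

The point of the design is that the format dependence is visible in the
source: a reader of the document can see exactly which spans are tied to
HTML output.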

> However. We have discussed *modes* for DPS/reST (was that the term?),
> where obviously "Python" is one such, and "book" might be another. I
> have suggested in the past that maybe "HTML" would be useful,

Might I suggest "web page" instead of "HTML" for the name of the Reader?
("Reader" is my currently favoured term for "input mode", although it may
not fit this case.)

> So maybe in that one case
> one might want to relax the rules to allow "interpreted" text to work as
> you do. However, I think one would probably want a directive in the
> document to state this (David - is that right?) so that people would
> know that this document was not a "general" document, but targeted at a
> specific output form.

Yes, a directive would be the way to change processing conditions, if
necessary. Such "pragma" directives shouldn't be overused though. If a
particular pragma directive becomes common, it's probably time to rethink
the approach.

> This *only* works[1]_

A small nit here -- you need a space before the footnote reference.
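
To show the effect, here is a deliberately simplified check. It assumes a
footnote reference is only recognized at the start of the text or after
whitespace; the pattern FOOTNOTE_REF is made up for this example and is not
the parser's actual rule.

```python
import re

# Simplified assumption: "[label]_" counts as a footnote reference only
# when it is not immediately preceded by a non-whitespace character.
FOOTNOTE_REF = re.compile(r"(?<!\S)\[(\w+)\]_")

without_space = FOOTNOTE_REF.findall("This *only* works[1]_")
with_space = FOOTNOTE_REF.findall("This *only* works [1]_")

print(without_space)
print(with_space)
```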

-- 
David Goodger    goodger@users.sourceforge.net    Open-source projects:
 - Python Docstring Processing System: http://docstring.sourceforge.net
 - reStructuredText: http://structuredtext.sourceforge.net
 - The Go Tools Project: http://gotools.sourceforge.net