In response to a request for a new interpreted text role, I recently wrote, "I don't want to let in all kinds of inline elements with marginal uses." I think this needs further discussion.

In `The Chicago Manual of Style`, the section "Distinctive Treatment of Words" (6.62 through 6.91 in 14th ed.) lists many cases:

* emphasis
* foreign words
* special terminology & technical terms
* words used as words
* letters as letters
* musical dynamics (pianissimo as italic "pp" etc.)
* letters indicating rhyme schemes (as "aabba" for a limerick)

All of these are mapped to italics. Should we have roles for each of them? Even if we combine closely-related cases (words as words & letters as letters; musical dynamics as a case of foreign words), we have 4 or 5 cases here. DocBook has dozens more inline elements. How far should we go?

Apart from the purely-functional markup (hyperlink-related, substitutions), we have 4 types of inline markup:

    ``inline literals``
    *emphasis*
    **strong**
    `interpreted text`

*Emphasis* and **strong** are probably the most common inline markup used; they're also the most vaguely-defined. They're typically (but not always!) mapped as emphasis -> italic, and strong -> boldface.

Should we have a slew of inline elements, with interpreted text roles mapping to them? The advantage is that the Docutils doc model becomes very rich, allowing fine distinctions of nuance. The disadvantage is code bloat: a lot more elements the Writers have to handle. If we want to set a limit, where?

The to-do list has this item: add a runtime setting (directive and/or command-line option) to set the default role of interpreted text. I.e., map "`" to something. Should we have a directive to map other inline markup (i.e., "*" & "**", maybe even "``") to arbitrary inline element types?

There are two sides to be considered: the reStructuredText markup, and the Docutils document model. Currently they can be considered two aspects of one system, but in the future there may be more markup languages supported. The reStructuredText markup is merely an interface to the document model, and the document model shouldn't pander to the markup too much.

Comments?

-- David Goodger http://starship.python.net/~goodger
Programmer/sysadmin for hire: http://starship.python.net/~goodger/cv
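
The to-do item about a default role for interpreted text can be pictured with a minimal Python sketch. It is not Docutils code: the regular expression, the function name `find_interpreted`, and the default value `"emphasis"` are all assumptions made for illustration, and only the suffix form used in this thread (`text`:role:) is handled.

```python
import re

# Hypothetical pattern: a single-backquoted span, optionally followed by
# a ":role:" suffix. Double backquotes (inline literals) are skipped.
INTERPRETED = re.compile(
    r"(?<!`)`(?P<text>[^`]+)`(?!`)(?::(?P<role>[a-zA-Z0-9_.-]+):)?"
)


def find_interpreted(source, default_role="emphasis"):
    """Return (text, role) pairs; a bare `text` gets default_role.

    default_role stands in for the proposed runtime setting; its
    default value here is an assumption, not a Docutils default.
    """
    return [
        (m.group("text"), m.group("role") or default_role)
        for m in INTERPRETED.finditer(source)
    ]


pairs = find_interpreted("This is the `de facto`:f: standard and `plain` text.")
```

Changing `default_role` changes only what a bare backquoted span means; spans with an explicit role are unaffected.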
David Goodger wrote:
> In `The Chicago Manual of Style`, the section "Distinctive Treatment of Words" (6.62 through 6.91 in 14th ed.) lists many cases:
>
> * emphasis
> * foreign words
> * special terminology & technical terms
> * words used as words
> * letters as letters
> * musical dynamics (pianissimo as italic "pp" etc.)
> * letters indicating rhyme schemes (as "aabba" for a limerick)
>
> All of these are mapped to italics. Should we have roles for each of them? Even if we combine closely-related cases (words as words & letters as letters; musical dynamics as a case of foreign words), we have 4 or 5 cases here. DocBook has dozens more inline elements. How far should we go?

I'm thinking that following HTML's model may make sense: in addition to *emphasis* and **strong**, have `italics`:i: and `bold`:b:. So if there is no semantic class matching what you need, you just use one of these two.

Another option would be being able to declare your own interpreted text roles and assigning them a class: Then, the writers would handle all of these in the same way, and the decision on how to render them would be a stylesheet issue. E.g., ::

    .. role:: foreign f

    This is the `de facto`:f: standard.

And in a CSS stylesheet::

    .foreign { font-style: italic }

> The to-do list has this item: add a runtime setting (directive and/or command-line option) to set the default role of interpreted text. I.e., map "`" to something. Should we have a directive to map other inline markup (i.e., "*" & "**", maybe even "``") to arbitrary inline element types?

This doesn't make sense to me; when I read a reST source, I parse the asterisks and backquotes in a certain way, and if a text would arbitrarily redefine them, the benefit would be lost. -1...

- Benja
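
The "declare a role, assign it a class" suggestion in this message can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ROLE_CLASSES` and `render_html` are hypothetical names, and the sketch reads `.. role:: foreign f` as "short role name `f`, class `foreign`".

```python
from html import escape

# Hypothetical table built from ".. role:: foreign f":
# short role name -> class name emitted in the output.
ROLE_CLASSES = {"f": "foreign"}


def render_html(text, role):
    """Render every declared role the same way; styling is left to CSS."""
    css_class = ROLE_CLASSES[role]
    return '<span class="%s">%s</span>' % (
        escape(css_class, quote=True),
        escape(text),
    )


html_fragment = render_html("de facto", "f")
```

The writer needs no per-role code here; a rule such as `.foreign { font-style: italic }` decides the appearance.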
On 2003-02-03, Benja Fallenstein wrote:
> David Goodger wrote:
>> The to-do list has this item: add a runtime setting (directive and/or command-line option) to set the default role of interpreted text. I.e., map "`" to something. Should we have a directive to map other inline markup (i.e., "*" & "**", maybe even "``") to arbitrary inline element types?
>
> This doesn't make sense to me; when I read a reST source, I parse the asterisks and backquotes in a certain way, and if a text would arbitrarily redefine them, the benefit would be lost. -1...

I don't think that "*", "**", "``" should be re-definable, but they should boil down to fixed interpreted text roles. That simplifies the internal document model. [If that's already the case, excuse my ignorance :-].

-- Beni Cherniavsky <cben@tx.technion.ac.il>
Gigga incognita :-)
Benja Fallenstein wrote:
> I'm thinking that following HTML's model may make sense: in addition to *emphasis* and **strong**, have `italics`:i: and `bold`:b:. So if there is no semantic class matching what you need, you just use one of these two.

I'm reluctant to enable explicit italic & bold roles. Even in HTML, their use is discouraged (although not deprecated... yet).

> Another option would be being able to declare your own interpreted text roles and assigning them a class: Then, the writers would handle all of these in the same way, and the decision on how to render them would be a stylesheet issue. E.g., ::
>
>     .. role:: foreign f
>
>     This is the `de facto`:f: standard.
>
> And in a CSS stylesheet::
>
>     .foreign { font-style: italic }

That might be OK for HTML, but wouldn't apply to other formats. And how would it be represented in the intermediate data structure (the doctree)? As ``<phrase class="foreign">``? That's the equivalent of ``<interpreted role="foreign">``, which was dropped in favour of concrete elements (e.g., ``<foreign>``).

It might be feasible to declare a new role which is a modification of an existing role with a "class" attribute. Something like::

    .. role:: foreign f
       :base: emphasis
       :class: foreign

This would result in ``<emphasis class="foreign">`` when used. It would need a good use case though. And it would still be problematic for non-HTML output.

-- David Goodger
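
The "modification of an existing role" idea can be illustrated with a small Python sketch that builds the ``<emphasis class="foreign">`` element mentioned above. It uses only the standard library, not the Docutils node classes; `BASE_ELEMENTS`, `define_role`, and `build_node` are hypothetical names, and the set of base elements is an assumption.

```python
import xml.etree.ElementTree as ET

# Assumed set of existing inline elements a new role may derive from.
BASE_ELEMENTS = {"emphasis", "strong", "literal"}


def define_role(registry, name, base, css_class):
    """Register `name` as a variant of an existing base element."""
    if base not in BASE_ELEMENTS:
        raise ValueError("unknown base role: %r" % base)
    registry[name] = (base, css_class)


def build_node(registry, role, text):
    """Create the doctree-like element for one use of a derived role."""
    base, css_class = registry[role]
    node = ET.Element(base, {"class": css_class})
    node.text = text
    return node


roles = {}
define_role(roles, "f", base="emphasis", css_class="foreign")
xml_text = ET.tostring(build_node(roles, "f", "de facto"), encoding="unicode")
```

A writer that ignores the `class` attribute would still render the text as ordinary emphasis, which is one way the non-HTML concern could degrade gracefully.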
On 2003-02-04, David Goodger wrote:
> Benja Fallenstein wrote:
>> I'm thinking that following HTML's model may make sense: in addition to *emphasis* and **strong**, have `italics`:i: and `bold`:b:. So if there is no semantic class matching what you need, you just use one of these two.
>
> I'm reluctant to enable explicit italic & bold roles. Even in HTML, their use is discouraged (although not deprecated... yet).

IIRC, XHTML 2 does this bold step. No physical formatting provided (CSS can do it all anyway).

-- Beni Cherniavsky <cben@tx.technion.ac.il>
If somebody builds a time machine he can gateway the Internet to itself with a time offset. I wonder what implications that would have...
Beni Cherniavsky writes:
> IIRC, XHTML 2 does this bold step. No physical formatting provided (CSS can do it all anyway).

Yes, XHTML 2 looks like an improvement (finally). Unfortunately, the weight of the installed base of browsers is enough that some of us have pretty much given up on anything derived from HTML.

-Fred

-- Fred L. Drake, Jr. <fdrake at acm.org>
PythonLabs at Zope Corporation
On Monday, February 3, 2003, at 02:07 PM, David Goodger wrote:
> * emphasis
> * foreign words
> * special terminology & technical terms
> * words used as words
> * letters as letters
> * musical dynamics (pianissimo as italic "pp" etc.)
> * letters indicating rhyme schemes (as "aabba" for a limerick)
>
> All of these are mapped to italics. Should we have roles for each of them? Even if we combine closely-related cases (words as words & letters as letters; musical dynamics as a case of foreign words), we have 4 or 5 cases here. DocBook has dozens more inline elements. How far should we go?

I think there should be some plan in place to add extra types, but only add them as people request them. The richer the formatting, the less likely different people's markup will match each other -- in one place someone might use `pp`:music:, where another just uses *pp*, etc. The lack of semantic markup in the current version (I certainly think of *emphasis* and **strong** more in terms of their rendering and physical look than any semantics) makes this less of a problem, because semantics are highly ambiguous while rendered look is concrete.

But if `something`:type: is valid for any "type", then I suppose it doesn't matter, so long as the output format has some way of identifying the proper styling. As I think about it though, it's non-trivial to effect any output but HTML.

Anyway, in summary: just because you *can* identify a semantic classification doesn't mean you should. I seldom see the benefit, and before introducing more complexity into the system there should be a concrete reason someone wants to do so. E.g., they want to mark glossary terms for later compilation -- a very concrete desire.

But if it is more work to restrict the kinds of semantic inline markups than to allow arbitrary semantics, then perhaps arbitrary semantics make more sense. In which case perhaps there should be a directive to give rendering hints (and hopefully definition hints!) in the document itself, as otherwise the document won't be portable.

Ian
Ian Bicking wrote:
> I think there should be some plan in place to add extra types, but only add them as people request them.
> ...
> Anyway, in summary: just because you *can* identify a semantic classification doesn't mean you should. I seldom see the benefit, and before introducing more complexity into the system there should be a concrete reason someone wants to do so. E.g., they want to mark glossary terms for later compilation -- a very concrete desire.

That's reasonable. But what I'm trying to establish is where to draw the line? How much demand is enough to allow a new role in? It's been up to my judgement so far. Unless I hear some compelling arguments otherwise, I suppose it will remain that way.

> But if `something`:type: is valid for any "type",

It's not. "type" has to be one of a pre-defined set of roles for which there is parser and doctree support. Each role will have an associated method or function that understands the role's semantics.

> then I suppose it doesn't matter, so long as the output format has some way of identifying the proper styling. As I think about it though, it's non-trivial to effect any output but HTML.

("Effect" or "affect"? Completely changes the meaning of the last sentence.)

> But if it is more work to restrict the kinds of semantic inline markups than to allow arbitrary semantics, then perhaps arbitrary semantics make more sense. In which case perhaps there should be a directive to give rendering hints (and hopefully definition hints!) in the document itself, as otherwise the document won't be portable.

There won't be arbitrary semantics, and there's no need or room for rendering hints in the markup. That's really basic: keep the style separate from the structure.

-- David Goodger
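
The "pre-defined set of roles, each with an associated function" model described in this message can be sketched in Python as a lookup table. This is a rough illustration, not the Docutils implementation: the handler names, the tuple return shape, and `UnknownRoleError` are all hypothetical.

```python
# Hypothetical role functions; each one "understands" its own role.
def emphasis_role(text):
    return ("emphasis", text)


def strong_role(text):
    return ("strong", text)


# The pre-defined set: only names listed here are valid roles.
ROLE_FUNCTIONS = {
    "emphasis": emphasis_role,
    "strong": strong_role,
}


class UnknownRoleError(ValueError):
    """Raised when markup names a role that has no registered function."""


def interpret(text, role):
    """Dispatch interpreted text to the function registered for its role."""
    try:
        handler = ROLE_FUNCTIONS[role]
    except KeyError:
        raise UnknownRoleError("no such interpreted text role: %r" % role) from None
    return handler(text)


node = interpret("pp", "emphasis")
```

Under this model `pp`:music: is an error until someone registers a `music` handler, which is the point of drawing a line.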
David Goodger wrote:
>> But if `something`:type: is valid for any "type",
>
> It's not. "type" has to be one of a pre-defined set of roles for which there is parser and doctree support. Each role will have an associated method or function that understands the role's semantics.

I'm confused. There seems to be parser and doctree support, since it winds up in the pseudo-XML as::

    <interpreted position="prefix" role="role">

Can't the role be any string consisting of [a-zA-Z0-9_.-]+ ?

--Mark
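
The character class quoted in this message can be checked directly with Python's `re` module. The sketch below only tests whether a role name is well-formed under that pattern; the function name `is_well_formed_role` is hypothetical, and the check says nothing about whether a parser actually supports the role.

```python
import re

# The pattern from the message, applied to the whole role name.
ROLE_NAME = re.compile(r"[a-zA-Z0-9_.-]+")


def is_well_formed_role(name):
    """True if `name` consists only of the permitted characters."""
    return ROLE_NAME.fullmatch(name) is not None


checks = {name: is_well_formed_role(name)
          for name in ("foreign", "pep-reference", "two words", "")}
```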
Mark Nodine wrote:
> David Goodger wrote:
>> It's not. "type" has to be one of a pre-defined set of roles for which there is parser and doctree support. Each role will have an associated method or function that understands the role's semantics.
>
> I'm confused. There seems to be parser and doctree support, since it winds up in the pseudo-XML as::
>
>     <interpreted position="prefix" role="role">
>
> Can't the role be any string consisting of [a-zA-Z0-9_.-]+ ?

You must be looking at old code. That was never intended to be the final implementation, just a temporary placeholder. The spec & code were revised about a month ago. See my Jan 9 post, 'Docutils update: "interpreted text" reimplemented'.

-- David Goodger
On 2003-02-04, David Goodger wrote:
> Ian Bicking wrote:
>> I think there should be some plan in place to add extra types, but only add them as people request them.
>> ...
>> Anyway, in summary: just because you *can* identify a semantic classification doesn't mean you should. I seldom see the benefit, and before introducing more complexity into the system there should be a concrete reason someone wants to do so. E.g., they want to mark glossary terms for later compilation -- a very concrete desire.
>
> That's reasonable. But what I'm trying to establish is where to draw the line? How much demand is enough to allow a new role in? It's been up to my judgement so far. Unless I hear some compelling arguments otherwise, I suppose it will remain that way.

You can draw quite a good line for the commonly needed roles. The ones you throw out are not appropriate for the common implementation, but specific people will want them for themselves. Any big document or group of documents has roles specific to it.

>> But if it is more work to restrict the kinds of semantic inline markups than to allow arbitrary semantics, then perhaps arbitrary semantics make more sense. In which case perhaps there should be a directive to give rendering hints (and hopefully definition hints!) in the document itself, as otherwise the document won't be portable.
>
> There won't be arbitrary semantics, and there's no need or room for rendering hints in the markup. That's really basic: keep the style separate from the structure.

Great idea. But to do that, you must be able to define your own classes of structure. Think of it: would CSS be useful without classes and ids?

-- Beni Cherniavsky
David Goodger <goodger@python.org> writes:
> * emphasis
> * foreign words
> * special terminology & technical terms
> * words used as words
> * letters as letters
> * musical dynamics (pianissimo as italic "pp" etc.)
> * letters indicating rhyme schemes (as "aabba" for a limerick)
>
> All of these are mapped to italics. Should we have roles for each of them? Even if we combine closely-related cases (words as words & letters as letters; musical dynamics as a case of foreign words), we have 4 or 5 cases here. DocBook has dozens more inline elements. How far should we go?

-1

Of all the possible roles above only terminology (and/or acronyms) would seem to potentially profit from being specifically semantically tagged as such. Who is going to write a reST document in which he needs to process "musical dynamics" or "letters indicating rhyme schemes" separately from emphasized text? If the need really arises then the user can always specify his own role tags.

> The to-do list has this item: add a runtime setting (directive and/or command-line option) to set the default role of interpreted text. I.e., map "`" to something. Should we have a directive to map other inline markup (i.e., "*" & "**", maybe even "``") to arbitrary inline element types?

+1

Obviously these should only be remapped to related element types (i.e. self-defined ones for special applications), but since the number of available lightweight markup elements is severely limited and readability of reST documents is an important goal, allowing the user the option to leverage these for his own purposes seems like a good thing to me.

alex
participants (7):

- Alexander Schmolck
- Beni Cherniavsky
- Benja Fallenstein
- David Goodger
- Fred L. Drake, Jr.
- Ian Bicking
- Mark Nodine