[Doc-SIG] using the same delimiter on the left and right..

Tony J Ibbs (Tibs) tony@lsl.co.uk
Thu, 29 Mar 2001 10:16:48 +0100

Edward D. Loper wrote:
> Tibs seems to have this strange notion that "real" markup
> languages don't use the same characters for left and right
> delimiters.. :)

Erm, yes, now you point it out I clearly had my stupid hat on (I wish I
could lose that damn thing, it's so embarrassing).

> I think that what makes delimiters
> in ST seem like not-real-markup is that they are context-
> dependant.  E.g., "'" in the middle of a word is different
> from "'" at the beginning of a word.

Well, personally (despite anything I might have said before now) I'm
going to start declaring that the ST family *is* real markup (thus
defining "real" appropriately, of course). I've begun to think that
otherwise it sounds silly.

(some of you may want to skip the rant that follows to see more
interesting stuff - I'll put a '****** BACK TO NORMAL ******' delimiter
at the end so you can scan down...)

What I'm clearly striving after is some way of describing what makes ST,
etc., different.

Thinking about this, we have:

* The SGML/XML family
* The TeX family
* The Runoff family (including things like
* The Pod family

OK. The SGML family originated in a need to mark up data as to its
*meaning*, pure and simple. This later got spread to trying to use the
meaning of a term to decide how to present it (which gives us HTML, sort
of), and that becomes a slippery slope.

The TeX family originates in the need to drive the precise typesetting of
particular parts of the text, whilst producing good general, predictable
typesetting for the rest of the text. It is important to remember that
when using a TeX-related tool, the *intention* is that if it doesn't
look good when formatted, then it should be rewritten (and indeed, that
may mean writing different words to say the same thing). Because the
meaning of a term often drives how it is to be typeset (especially in
maths, its original target), the use of TeX for semantic markup arises.

The Runoff family was a simpler variant on the TeX idea, which wanted to
produce computer manuals, and so on. There's generally less control over
meaning, more interest in presentation. It's not clear to me if troff
and so on belong to the TeX family or the Runoff family.

The Pod family is, maybe, if it exists, the family of marking up
docstrings. Edward Welbourne has talked about this in an earlier email.
Basically, the aim is to produce something more useful than plain text
(but not of a quality to stop a technical documentor wincing), leaving
the original, marked up, text still useful *as such*. Eddy also comments
that if someone using the markup (ST, in his comment) is spending too much time
worrying about markup, then they're not spending enough time working on
more important things.

Both the TeX family and the SGML family care about formalisms, a lot.
They each have their own elegances which they are striving for. The
Runoff family hasn't *heard* of elegance. And the Pod family are after

I think *we* are *not* in the TeX or SGML families. We are in the
"pragmatic solution to a specific problem" space, and if formalism helps
with that, then that's a Good Thing, but we shouldn't strain after
theoretical purity lest we stray from practical usefulness (heh, I've
been pulled up on the list in the past for exactly that).

Sorry - back to the normal argument again...

****** BACK TO NORMAL ******

> Let's make all of our delimiters into real delimiters,
> that can only be used for delimiting (or maybe also for
> bullets, in the case of '*').  We could switch our "literal"
> delimiter to "`".  So then we would have the following
> reserved characters, that may not appear in text without
> being quoted somehow:
>     '<'    left delimiter for URLs
>     '>'    right delimiter for URLs
>     '#'    delimiter for inlines
>     '`'    delimiter for literals
>     '*'    delimiter for emph, maybe for strong.
>     '::'   marker for literal regions

I hadn't thought of using backtick as a literal delimiter. In the
context of docstrings, I can't see why it wouldn't work - hmm, this is a
`literal` - yep, that works for me (does the resonance with Python
backtick work?). It frees up both sorts of "normal" quote, which is
good, and only inconveniences people like Eddy who insist on typing
`both sorts of single quote' (TeX users, the lot of them). And it means I
can type "'cos" without worrying (or 'plane or 'phone if I want to
appear old-fashioned).
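
As a rough illustration of why the backtick frees up the ordinary
quotes, here is a minimal Python sketch. It assumes (my assumption, not
anything agreed on the list) that a literal is simply any run of
non-backtick characters between two backticks; the names LITERAL and
find_literals are made up for the example.

```python
import re

# Assumption: a literal is one or more non-backtick characters
# enclosed in a pair of backticks. Apostrophes and double quotes
# are ordinary text under this rule.
LITERAL = re.compile(r"`([^`]+)`")

def find_literals(text):
    """Return the contents of every `...` span found in text."""
    return LITERAL.findall(text)

sample = "this is a `literal`, and 'cos I can, here is a 'phone"
print(find_literals(sample))
```

With this rule the leading apostrophes in 'cos and 'phone are never
mistaken for delimiters, because only the backtick opens or closes a
literal.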

And if those *are* the delimiters, then it *would* work to expect them
to be quoted when they occurred - neat. Just goes to prove why we keep
Edward around on the list (please add a <wink> here).
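
To make "quoted when they occurred" concrete, here is a small Python
sketch. Edward only said "quoted somehow", so the backslash used here
is purely my assumption, as are the names RESERVED, ESCAPE,
quote_reserved and unquote. It covers only the single-character
delimiters from the list above and ignores the '::' marker.

```python
RESERVED = "<>#`*"   # single-character delimiters from the list above
ESCAPE = "\\"        # assumption: backslash is the quoting character

def quote_reserved(text):
    """Put ESCAPE in front of every reserved character (and every ESCAPE)."""
    out = []
    for ch in text:
        if ch in RESERVED or ch == ESCAPE:
            out.append(ESCAPE)
        out.append(ch)
    return "".join(out)

def unquote(text):
    """Undo quote_reserved: drop each ESCAPE and keep the character after it."""
    out = []
    chars = iter(text)
    for ch in chars:
        if ch == ESCAPE:
            # A trailing ESCAPE with nothing after it is kept as itself.
            ch = next(chars, ESCAPE)
        out.append(ch)
    return "".join(out)

print(quote_reserved("a < b * c"))
```

Quoting the escape character itself is what makes the round trip safe:
unquote(quote_reserved(s)) gives back s for any string s.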

Does that mean we allow things like "there is a hard-space` `here"? It
would be quite a neat thing to allow...

> Then the only context-dependant characters that remain would
> be start-list-item characters..  And if we wanted to, we could
> use '* ' at the beginning of any list item, since it's
> reserved anyway... something like:
>     * this is an unordered list item
>     *1. this is an ordered list item
> Well.. I'm not sure whether we'd want to do that or not..

As I say elsewhere, this was considered in an earlier round, and in the
end dropped. Personally, I think we're doing OK with the list forms we
already had.

> Does this sound like a reasonable direction to go?

Well, I like it.


Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
"How fleeting are all human passions compared with the massive
continuity of ducks." - Dorothy L. Sayers, "Gaudy Night"
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)