[Doc-SIG] Alternative inline markup

Alan Jaffray jaffray@pobox.com
Wed, 7 Nov 2001 14:05:25 -0500 (EST)

On Tue, 6 Nov 2001, David Goodger wrote:
> Thanks for your input. I wish it had come earlier though!

Sorry.  Combination of illness and much-needed vacation.

> > 1) Inline markup can be nested::
> ...
> >    Ambiguity is resolved with close tags first, then maximal munch.
> I don't follow. Clarify please?

Forget it, bad idea. :)

> The easiest way I see to implement this is to first identify the outer
> inline markup as we do now, then recursively scan for nested inline
> markup. It won't work in the general case though, as explained below.
> Changing the parse algorithm to be fully nested-inline-markup-friendly
> could be difficult and/or ambiguous. I'm not sure the general case
> *can* be done, without a lot of exceptions and special rules, which
> complexity I'm not willing to add to reStructuredText.

Is "markup can't be delimited with the same character as its parent"
too complicated?
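
A minimal sketch of how that one rule could remove the ambiguity for symmetric delimiters, assuming single-character delimiters only. The function name ``parse_inline`` and the tree shape are hypothetical, and the sketch ignores the whitespace and context rules that real reStructuredText inline markup recognition uses:

```python
def parse_inline(text, delimiters="*`"):
    """Parse nested inline markup into a tree.

    The tree is a list whose items are plain strings or
    (delimiter, children) tuples.  Rule applied: markup can't be
    delimited with the same character as its parent, so a delimiter
    equal to the innermost open one always closes it, and any other
    delimiter always opens a new span.
    """
    root = []
    stack = [(None, root)]  # (delimiter, children) of each open span
    buf = []
    for ch in text:
        if ch not in delimiters:
            buf.append(ch)
            continue
        delim, children = stack[-1]
        if buf:
            children.append("".join(buf))
            buf = []
        if ch == delim:
            # Same character as the enclosing span: it must be a close.
            stack.pop()
            stack[-1][1].append((ch, children))
        else:
            # Different character: it opens a nested span.
            stack.append((ch, []))
    if buf:
        stack[-1][1].append("".join(buf))
    if len(stack) != 1:
        raise ValueError("unclosed inline markup")
    return root
```

Because a delimiter character never has to be classified as "open or close" by lookahead, no close-tags-first or maximal-munch tie-breaking is needed in this simplified setting.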

> > 2) An underscore suffix currently modifies the preceding text by
> >    making it a link. This notion is extended - the suffix indicates
> >    that the text is to be tagged in some way, indicated by a
> >    directive or destination URL in the target::
> > 
> >       I had lunch with Jonathan_ today.  We talked about Zope_.
> > 
> >       .. _Jonathan: lj [user=jhl]
> >       .. _Zope: http://www.zope.org/
> Interesting idea, putting arbitrary constructs in the link target.
> However, for consistency that depends on two things:
> 1. The link text remains behind, untouched except for being
>    "activated" in some way.
> 2. There must *be* a link target. Corollary: the reference must *be*
>    a reference.

I agree with (2) but not (1).  Here's the principle I'm going on:
A reStructuredText-to-plaintext converter should modify the
non-directive parts of the document as little as possible.
The marked-up text should "read" like non-marked-up text.

> What will "Jonathan" become?

``<tag name="lj" args="[user=jhl]">Jonathan</tag>`` or some such.
After that, it's an output format issue.

For the given application I would expect the default text output to
be ``Jonathan`` and the default HTML output to be::

    <a href="/userinfo/jhl"><img src="/icons/jhl"></a>
    <a href="/users/jhl"><b>Jonathan</b></a>

but I believe the HTML can be customized through the style engine.
It has also been suggested that the lj-user tags could be used to
track "who am I talking about" or "who's talking about whom".
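
The split between the generic tag and the output format could be sketched roughly as follows. Everything here is an assumption made for illustration: the ``Tag`` class, the renderer registry, and the way ``[user=jhl]`` is parsed are hypothetical, and only the URL layout is taken from the HTML shown above:

```python
import re
from dataclasses import dataclass
from html import escape


@dataclass
class Tag:
    name: str  # directive name from the target, e.g. "lj"
    args: str  # raw argument string, e.g. "[user=jhl]"
    text: str  # the tagged text, e.g. "Jonathan"


def render_text(tag):
    # Plaintext output leaves the tagged text untouched.
    return tag.text


def render_lj_html(tag):
    # Hypothetical renderer for the "lj" tag: icon link plus user link.
    match = re.search(r"user=(\w+)", tag.args)
    if match is None:
        return escape(tag.text)
    user = match.group(1)
    return (
        f'<a href="/userinfo/{user}"><img src="/icons/{user}"></a>\n'
        f'<a href="/users/{user}"><b>{escape(tag.text)}</b></a>'
    )


# Per-format registry; a style engine could replace entries here.
HTML_RENDERERS = {"lj": render_lj_html}


def render_html(tag):
    renderer = HTML_RENDERERS.get(tag.name)
    # Unknown tags fall back to the bare (escaped) text.
    return renderer(tag) if renderer else escape(tag.text)
```

The parser only has to produce the ``Tag``; what each output format does with it stays a per-format decision.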

> For example, say I want Jonathan's user
> icon to appear in my paragraph::
>     I had lunch with [Jonathan's icon here] today.
> How do I do this *without* having a hyperlink at the same time?

The way you'd write this paragraph in plaintext is::

    I had lunch with Jonathan today.

This implies that the reStructuredText paragraph should be::

    I had lunch with Jonathan_ today.

Then follow it with::

    .. _Jonathan: lj-icon jhl

or the like.  If you're really referring to the icon itself, rather than
referring to Jonathan but using his icon in graphical output, then you'd
say something like::

    Jonathan has a goofy icon: `Jonathan's icon`__

    __ lj-icon jhl

> On the other hand, we could say that the trailing-underscore syntax
> doesn't signify a hyperlink reference, but only indicates a "tagging
> reference".

> A tagging reference becomes a hyperlink reference if the contents of
> the "tag" resolve to a hyperlink.

Or, rather, a hyperlink *is* a type of tag, and ``__ http://python.org``
is just sugar for ``__ link http://python.org``.  We're not adding a
construct.  We're replacing a construct with a more general one.
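
As a rough sketch of that desugaring, assuming the set of directive names is known up front (the ``parse_target`` name and the example directive set are hypothetical, not part of any existing parser):

```python
def parse_target(body, directives=frozenset({"link", "lj", "lj-icon"})):
    """Normalize a target body to a (directive, arguments) pair.

    A body whose first word is not a known directive is treated as a
    plain hyperlink, i.e. as if it had been written "link <body>".
    """
    body = body.strip()
    # A backquoted body is always a link target, even if it happens
    # to spell a directive name.
    if len(body) >= 2 and body[0] == body[-1] == "`":
        return ("link", body[1:-1])
    head, _, rest = body.partition(" ")
    if head in directives:
        return (head, rest.strip())
    return ("link", body)
```

With this, ``http://python.org`` and ``link http://python.org`` normalize to the same pair, so downstream code only ever sees tags.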

> >    Link targets which are also legal directive names must be
> >    enclosed in backquotes.
> The frequency of link targets would far outweigh directives, so
> markup would suffer from extra syntax on targets.

Anything with a slash or an at-sign doesn't need to be quoted.
This is the vast majority of cases.  In all the reStructuredText
documentation, you don't have a single target that would need
quoting.  The fraction of links on my site that would require
quoting is also tiny.
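
One way to check that claim against an existing document is a small audit like the following. It is only a sketch: the regular expression covers named explicit targets of the form ``.. _name: body`` and nothing else, the function name is made up, and it over-approximates by reporting every link target that contains neither a slash nor an at-sign:

```python
import re

# Matches named explicit targets such as ".. _Zope: http://www.zope.org/".
TARGET_RE = re.compile(r"^\.\. _(?P<name>[^:]+):\s*(?P<body>.*)$")


def quoting_candidates(lines):
    """Return (name, body) pairs for targets that might need quoting."""
    candidates = []
    for line in lines:
        match = TARGET_RE.match(line.strip())
        if match is None:
            continue
        body = match.group("body").strip()
        if not body or body.startswith("`"):
            continue  # internal target, or already quoted
        if "/" in body or "@" in body:
            continue  # cannot be mistaken for a directive name
        candidates.append((match.group("name"), body))
    return candidates
```

Run over a document that predates the proposal, where every target is a link, the returned list is an upper bound on the targets that would have to gain backquotes.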

> > 4) Inside markup delimited by backquotes or curly braces, curly
> >    braces may be used as delimiters equivalent to backquotes::
> ...
> >    This is because backquotes don't nest.
> There's no difference between backquotes and asterisks with regard to
> nesting.

True.  Asterisks don't nest either. :-)  I guess I wasn't clear.
I should have said "inside tagged content, curly braces may be
used to delimit tagged content".  I'm referring solely to::

    `Putting {a tag}_ inside another tag`_

> Why the fixation on curly braces? :>

We only have four pairs of nesting characters on the keyboard, and
everyone wants them.  I think braces are more available than the
other three.  (And I say this despite wanting to use reST for
programming languages other than Python; Python is relatively light
on braces.)

> > 5) Roles can go away.  We don't need them.  Optionally if we want
> >    the ability to put short directive names inline, we could
> >    declare ::
> > 
> >       `foo:: bar bar bar`
> Similar syntax has already been considered and rejected. See
> http://structuredtext.sf.net/spec/alternatives.txt, "Interpreted Text
> 'Roles'" alternative 1.

Alternative 1 is more ambiguous than what I'm suggesting, and does
not have the benefit of consistency with out-of-line directives.

However, which syntax to use for simple inline directives is a minor
side issue, and I shouldn't have combined it with this proposal.  More
important is whether "roles" and "directives" and "tags" should all be
unified.  I think they should.  It adds both simplicity and power.

> > Summary:
> > 
> > - We gain nesting.
> Not without significant work, though. If it's even possible
> unambiguously, it can be added independently later.

I don't mind writing code, but I'd rather not fork to do it.

> > - We gain arbitrary extensibility of inline markup.
> > - We gain substitutions.
> But at the expense of complicating hyperlinks.

> > - We lose by occasionally having to escape a curly brace inside
> >   backquotes, or quote a hyperlink target with no ``/`` or ``#``
> >   characters to distinguish it from a directive name.
> That last one is a significant loss.

I really don't think so.  Cases where you have to quote the target
are few and far between.