[Doc-SIG] Alternative inline markup
Alan Jaffray
jaffray@pobox.com
Wed, 7 Nov 2001 14:05:25 -0500 (EST)
On Tue, 6 Nov 2001, David Goodger wrote:
> Thanks for your input. I wish it had come earlier though!
Sorry. Combination of illness and much-needed vacation.
> > 1) Inline markup can be nested::
> ...
> > Ambiguity is resolved with close tags first, then maximal munch.
>
> I don't follow. Clarify please?
Forget it, bad idea. :)
> The easiest way I see to implement this is to first identify the outer
> inline markup as we do now, then recursively scan for nested inline
> markup. It won't work in the general case though, as explained below.
> Changing the parse algorithm to be fully nested-inline-markup-friendly
> could be difficult and/or ambiguous. I'm not sure the general case
> *can* be done, without a lot of exceptions and special rules, which
> complexity I'm not willing to add to reStructuredText.
Is "markup can't be delimited with the same character as its parent"
too complicated?
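To make that rule concrete, here is a minimal sketch of a recursive scanner that applies it. Everything named here is an assumption for illustration: the two-entry delimiter table, the tuple node shape, and the names ``_scan`` and ``parse_inline``. It is not the reStructuredText parser, and it ignores the real start-string/end-string recognition rules.

```python
# Hypothetical delimiter table; real reStructuredText has more markup types.
DELIMS = {"*": "emphasis", "`": "interpreted"}


def _scan(text, pos, parent):
    """Scan from pos until the parent's delimiter closes the span.

    Returns (nodes, next_pos, closed). Inside a span, the parent's
    delimiter always means "close", so a child can never reuse it.
    """
    nodes, buf = [], []
    while pos < len(text):
        ch = text[pos]
        if ch == parent:
            # Close tag wins: end the current span here.
            if buf:
                nodes.append("".join(buf))
            return nodes, pos + 1, True
        if ch in DELIMS:
            # Try to open a child span delimited by a different character.
            children, end, closed = _scan(text, pos + 1, ch)
            if closed:
                if buf:
                    nodes.append("".join(buf))
                    buf = []
                nodes.append((DELIMS[ch], children))
                pos = end
                continue
            # No matching close found: fall through, keep it as literal text.
        buf.append(ch)
        pos += 1
    if buf:
        nodes.append("".join(buf))
    # Running off the end is only acceptable at the top level.
    return nodes, pos, parent is None


def parse_inline(text):
    """Return a nested list of strings and (kind, children) tuples."""
    nodes, _, _ = _scan(text, 0, None)
    return nodes
```

With this sketch, ``parse_inline("a *b `c` d* e")`` nests the interpreted text inside the emphasis, while ``*a *b* c*`` yields two separate emphasis spans because the second asterisk is read as a close tag.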
> > 2) An underscore suffix currently modifies the preceding text by
> > making it a link. This notion is extended - the suffix indicates
> > that the text is to be tagged in some way, indicated by a
> > directive or destination URL in the target::
> >
> > I had lunch with Jonathan_ today. We talked about Zope_.
> >
> > .. _Jonathan: lj [user=jhl]
> > .. _Zope: http://www.zope.org/
>
> Interesting idea, putting arbitrary constructs in the link target.
> However, for consistency that depends on two things:
>
> 1. The link text remains behind, untouched except for being
> "activated" in some way.
> 2. There must *be* a link target. Corollary: the reference must *be*
> a reference.
I agree with (2) but not (1). Here's the principle I'm going on:
A reStructuredText-to-plaintext converter should modify the
non-directive parts of the document as little as possible.
The marked-up text should "read" like non-marked-up text.
> What will "Jonathan" become?
``<tag name="lj" args="[user=jhl]">Jonathan</tag>`` or some such.
After that, it's an output format issue.
For the given application I would expect the default text output to
be ``Jonathan`` and the default HTML output to be::
    <a href="/userinfo/jhl"><img src="/icons/jhl"></a>
    <a href="/users/jhl"><b>Jonathan</b></a>
but I believe the HTML can be customized through the style engine.
It has also been suggested that the lj-user tags could be used to
track "who am I talking about" or "who's talking about whom".
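As a rough sketch of "after that, it's an output format issue": the tagged node can be held in one small structure and handed to per-format renderers. The ``Tag`` class and both ``render_*`` functions are hypothetical names, not part of any existing tool; the HTML simply reproduces the two lines shown above.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Tag:
    """Hypothetical in-memory form of <tag name=... args=...>content</tag>."""
    name: str
    args: dict
    content: str


def render_text(tag):
    # Plain-text output leaves the tagged text untouched.
    return tag.content


def render_html(tag):
    # Only the "lj" tag is handled specially in this sketch.
    if tag.name == "lj":
        user = escape(tag.args["user"], quote=True)
        return "\n".join([
            f'<a href="/userinfo/{user}"><img src="/icons/{user}"></a>',
            f'<a href="/users/{user}"><b>{escape(tag.content)}</b></a>',
        ])
    # Unknown tags degrade to their (escaped) text content.
    return escape(tag.content)
```

For example, ``render_text(Tag("lj", {"user": "jhl"}, "Jonathan"))`` gives back ``Jonathan`` unchanged, which is the point of the principle stated earlier.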
> For example, say I want Jonathan's user
> icon to appear in my paragraph::
>
> I had lunch with [Jonathan's icon here] today.
>
> How do I do this *without* having a hyperlink at the same time?
The way you'd write this paragraph in plaintext is::
    I had lunch with Jonathan today.
This implies that the reStructuredText paragraph should be::
    I had lunch with Jonathan_ today.
Then follow it with::
    .. _Jonathan: lj-icon jhl
or the like. If you're really referring to the icon itself, rather than
referring to Jonathan but using his icon in graphical output, then you'd
say something like::
    Jonathan has a goofy icon: `Jonathan's icon`__
    __ lj-icon jhl
> On the other hand, we could say that the trailing-underscore syntax
> doesn't signify a hyperlink reference, but only indicates a "tagging
> reference".
Yes.
> A tagging reference becomes a hyperlink reference if the contents of
> the "tag" resolve to a hyperlink.
Or, rather, a hyperlink *is* a type of tag, and ``__ http://python.org``
is just sugar for ``__ link http://python.org``. We're not adding a
construct. We're replacing a construct with a more general one.
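A small sketch of what "sugar" means here, assuming a registry of known directive names. The ``DIRECTIVES`` set and the ``parse_target`` function are illustrative inventions; the proposal does not specify how directive names are recognized.

```python
# Hypothetical registry of directive names that may lead a target.
DIRECTIVES = {"link", "lj", "lj-icon"}


def parse_target(target):
    """Split a target's text into (directive_name, arguments).

    A target that does not start with a known directive name is
    treated as shorthand for the "link" directive.
    """
    target = target.strip()
    head, _, rest = target.partition(" ")
    if head in DIRECTIVES:
        return head, rest.strip()
    return "link", target
```

Under these assumptions, ``parse_target("http://python.org")`` and ``parse_target("link http://python.org")`` return the same pair, while ``parse_target("lj [user=jhl]")`` yields ``("lj", "[user=jhl]")``.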
> > Link targets which are also legal directive names must be
> > enclosed in backquotes.
>
> The frequency of link targets would far outweigh directives, so
> markup would suffer from extra syntax on targets.
Anything with a slash or an at-sign doesn't need to be escaped.
This is the vast majority of cases. In all the reStructuredText
documentation, you don't have a single target that would need
quoting. The fraction of links on my site that would require
quoting is also tiny.
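The check an author (or a lint tool) would apply is short. This is a sketch under two assumptions: that directive names are known from a registry passed in as ``directive_names``, and that only the first word of the target is compared against it. ``needs_backquotes`` is a made-up name.

```python
def needs_backquotes(target, directive_names):
    """Return True if a link target must be quoted to avoid being
    read as a directive.

    Targets containing a slash or an at-sign are never ambiguous.
    """
    target = target.strip()
    if "/" in target or "@" in target:
        return False
    first_word = target.split(" ", 1)[0]
    return first_word in directive_names
```

With ``directive_names = {"link", "lj", "lj-icon"}``, a URL such as ``http://www.zope.org/`` or an address such as ``jaffray@pobox.com`` passes without quoting; only a bare relative target that happens to be spelled ``lj`` would need backquotes.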
> > 4) Inside markup delimited by backquotes or curly braces, curly
> > braces may be used as delimiters equivalent to backquotes::
> ...
> > This is because backquotes don't nest.
>
> There's no difference between backquotes and asterisks with regard to
> nesting.
True. Asterisks don't nest either. :-) I guess I wasn't clear.
I should have said "inside tagged content, curly braces may be
used to delimit tagged content". I'm referring solely to::
    `Putting {a tag}_ inside another tag`_
> Why the fixation on curly braces? :>
We only have four pairs of nesting characters on the keyboard, and
everyone wants them. I think braces are more available than the
other three. (And I say this despite wanting to use reST for other
programming languages besides Python; Python is relatively light
on braces.)
> > 5) Roles can go away. We don't need them. Optionally if we want
> > the ability to put short directive names inline, we could
> > declare ::
> >
> > `foo:: bar bar bar`
>
> Similar syntax has already been considered and rejected. See
> http://structuredtext.sf.net/spec/alternatives.txt, "Interpreted Text
> 'Roles'" alternative 1.
Alternative 1 is more ambiguous than what I'm suggesting, and does
not have the benefit of consistency with out-of-line directives.
However, which syntax to use for simple inline directives is a minor
side issue, and I shouldn't have combined it with this proposal. More
important is whether "roles" and "directives" and "tags" should all be
unified. I think they should. It adds both simplicity and power.
> > Summary:
> >
> > - We gain nesting.
>
> Not without significant work, though. If it's even possible
> unambiguously, it can be added independently later.
I don't mind writing code, but I'd rather not fork to do it.
> > - We gain arbitrary extensibility of inline markup.
> > - We gain substitutions.
>
> But at the expense of complicating hyperlinks.
> > - We lose by occasionally having to escape a curly brace inside
> > backquotes, or quote a hyperlink target with no ``/`` or ``#``
> > characters to distinguish it from a directive name.
>
> That last one is a significant loss.
I really don't think so. Cases where you have to quote the target
are few and far between.
Alan