[Doc-SIG] rST hyperlink syntax

Alan Jaffray jaffray@pobox.com
Tue, 9 Oct 2001 21:00:51 -0400 (EDT)

Thanks for the thoughtful reply. :)

> > Third, it's error-prone.  The above example is broken,
> > because the hyperlink points to "...archives" and the target is
> > "..archive".
> This error will be caught and reported by the hyperlink-resolution code
> (coming soon to a CVS tree near you).

It's good to catch mistakes, but it's even better to avoid them in 
the first place. :)

> > It's an error that's easy to make, and easy to miss,
> > and longer repeated text makes for greater likelihood of error.
> True. The author is by no means forced to use long hyperlink phrases though.

I disagree here.  Link text should make sense without the immediate
context - to do otherwise violates `W3C Accessibility Guidelines`_,
and limits the effectiveness of various tools which scan documents
for links - and this often requires that it be long.

.. _W3C Accessibility Guidelines:

> Your example appears to jog my memory (but it could just be deja vu). I seem
> to recall thinking about (& possibly discussing here) or at least reading
> about this kind of construct::
>     .. _Python DOC-SIG mailing list archive:
>     .. _archive:
>     .. _Doc-SIG: http://mail.python.org/pipermail/doc-sig/

That would help, but I still think we need separate name and text.

Suppose I'm writing a 2000-line document and refer to the reStructuredText
specification 27 times.  I'll want to use different link texts depending
on the English context.  Some of these (like "spec") will be too general
to define globally without ambiguity, but I don't want to repeat the URL
in every section where I use it.  Nor do I want to have to remember that
I've already defined "reStructuredText specification", only to find that
writing "reStructuredText draft specification" sends me down to the
bottom to define yet another synonym.

So I'd like the link text to go along with a name which is in turn
associated with the hyperlink, rather than the link text going to
the hyperlink directly.  But you're right that inlining the name is
intrusive.  Maybe we could instead have the link point to the name?

    reStructuredText is cool, see the `reStructuredText specification`_
    to see how cool it is.

    .. _reStructuredText specification: rtxt-spec_

    reStructuredText this, reStructuredText that, reStructuredText some
    other thing, the spec_ says reStructuredText does foo, reStructuredText.

    .. _spec: rtxt-spec_

    [800 lines later]

    .. _rtxt-spec:

Not sure what to do about ambiguity with URLs ending with an underscore.
Quote one or the other?  Use a different syntax?
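
To make the indirection concrete, here is a rough Python sketch of how a
resolver might follow name-to-name targets like the ones above.  Everything
in it is an assumption for illustration: the function name, the dict-based
target table, the placeholder URL, and above all the heuristic that a value
ending in "_" with no "://" in it is a reference to another name.  That
heuristic is just one possible answer to the ambiguity question, not a
settled rule.

```python
def resolve_target(name, targets):
    """Follow indirect targets until something that looks like a URL is found.

    Assumption (not settled syntax): a value that ends in "_" and contains
    no "://" names another target; any other value is taken to be a URL.
    """
    seen = set()
    while True:
        if name in seen:
            raise ValueError("circular hyperlink target: %r" % name)
        seen.add(name)
        if name not in targets:
            raise KeyError("undefined hyperlink target: %r" % name)
        value = targets[name]
        if value.endswith("_") and "://" not in value:
            name = value[:-1]  # indirect target: strip the trailing underscore
        else:
            return value


# Hypothetical target table mirroring the example; the URL is a placeholder.
targets = {
    "reStructuredText specification": "rtxt-spec_",
    "spec": "rtxt-spec_",
    "rtxt-spec": "http://example.org/rst-spec.html",
}

print(resolve_target("spec", targets))
```

With a table like this, both link texts resolve to the one URL defined at
"rtxt-spec", and a mistyped or circular name raises an error instead of
silently producing a dead link.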

> Interesting idea: "anonymous hyperlinks". A convenience worth consideration.
> I'd want to avoid the extra syntax if possible though [#]_. The
> ramifications need to be thought through, for ambiguity or conflicts with
> existing syntax.

Even if we don't allow name/text separation, anonymous links would be an
adequate solution for the problem of annoyingly long link text in most
cases.  It'd also be useful for very short or "disposable" documents.
(One-line comments, message board posts, etc.)

> .. [#] Syntax idea: double underscores::
>        For more info, search the `Python DOC-SIG mailing list archives`__.
>        .. __: http://mail.python.org/pipermail/doc-sig/

Nice idea.  I think this would be used often enough in practice that the
design principles demand a simple syntax.  Could we make it even simpler? ::

    For more info, search the `Python DOC-SIG mailing list archives`__.

    .. __http://mail.python.org/pipermail/doc-sig/

The ".. __" incantation is already punctuation-dense.  Extending it
to ".. __: " before getting around to actual content seems, to me,
to be stretching into the realm of the silly: unnecessarily long
and tedious for both author and reader.  It's already unambiguous
without the colon-space, assuming link targets starting with an
underscore have to be quoted.
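
As a rough check that the colon-less form is no harder to recognize than
the colon form, here is a small Python sketch that accepts both.  The
regular expressions and the function name are made up for illustration,
and pairing references with targets strictly in order of appearance is an
assumption about how anonymous links would be matched up.  It also only
handles link text that sits on a single line.

```python
import re

# Matches both ".. __: URL" (colon form) and ".. __URL" (colon-less form).
ANON_TARGET = re.compile(r"^\.\. __(?:: +)?(?P<url>\S+)\s*$")
# Matches `link text`__ in running text (single-line link text only).
ANON_REF = re.compile(r"`(?P<text>[^`]+)`__")


def pair_anonymous_links(text):
    """Pair each `text`__ reference with an anonymous target, in order."""
    refs = []
    urls = []
    for line in text.splitlines():
        match = ANON_TARGET.match(line.strip())
        if match:
            urls.append(match.group("url"))
        else:
            refs.extend(m.group("text") for m in ANON_REF.finditer(line))
    if len(refs) != len(urls):
        raise ValueError(
            "%d anonymous references but %d targets" % (len(refs), len(urls))
        )
    return list(zip(refs, urls))


sample = (
    "For more info, search the `Python DOC-SIG mailing list archives`__.\n"
    "\n"
    ".. __http://mail.python.org/pipermail/doc-sig/\n"
)
print(pair_anonymous_links(sample))
```

A mismatch between the number of references and the number of targets is
reported as an error, which is about the only consistency check available
when the targets carry no name at all.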

>    Similarly to auto-numbered footnotes' "autonumber labels", the final
>    word or words of the hyperlink phrase could optionally be used for the
>    target::
>        .. __archives: http://mail.python.org/pipermail/doc-sig/
>    But why the *final* word(s)? Works in this case, but not all cases.
>    Initial *or* final word(s)? Arbitrary subphrase? Getting complicated...

I'd go with arbitrary subphrase, zero or more full words, as an optional
extension to the colon-less syntax above.  It seems more natural to me -
more like what a human would normally write - than either substring or
first/last word.  It could also help catch maintenance errors where
links and targets get mixed up.
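
A minimal Python sketch of the check I have in mind, with assumptions
stated up front: "subphrase" is read as a contiguous run of whole words,
the comparison ignores case, and the function name is hypothetical.

```python
def label_matches(link_text, label):
    """Return True if label is a run of zero or more full words of link_text.

    Assumptions: the words must be contiguous and in the same order, and
    the comparison is case-insensitive.
    """
    words = link_text.lower().split()
    wanted = label.lower().split()
    n = len(wanted)
    # An empty label (n == 0) matches trivially: words[i:i] == [].
    return any(words[i:i + n] == wanted for i in range(len(words) - n + 1))


text = "Python DOC-SIG mailing list archives"
print(label_matches(text, "archives"))
print(label_matches(text, "archive"))
```

Under these assumptions "archives" and "mailing list" are accepted as
labels for that link text, while "archive" is rejected because it is only
part of a word, which is exactly the kind of slip the label is meant to
catch.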

However, if it's considered important to only have one way to specify
an anonymous-hyperlink target, I'd ditch subphrase-labels and use the
colon-less form exclusively.