Hi,

I'm new to this list but read the recent discussion about embedded hyperlinks in the archives. I find this issue quite important for plain-text readability: While for long URIs, putting them after a paragraph is good, for short URIs, it seems quite unnatural. (Short URIs include references to the root of a web site as well as most relative URIs-- both common cases for hyperlinks.)

To give an example [see example.html], I often refer to web sites like this [http://example.com/]. Giving the example__ like this__ seems much less natural, and actually sometimes makes me not put in hyperlinks where I would normally put them (or if they're absolute, put them in like above, which reST renders by showing the URI).

__ example.html
__ http://example.com/

Of course, things like this_@_http://example.com would not help me with my problem in the least ;-)

The 'most obvious' syntax extension to reST would, for me, be having the URI-in-square-brackets trap the last word, or last `backquoted phrase`, like the underscore put right after the word/phrase does. Obviously this doesn't work, since reST already interprets these differently. I have come up with an only slightly different possible syntax that doesn't seem to have been discussed before.

The first goes like this:

    This is an example [-> http://www.example.com] of `embedded links` [-> http://www.example.com/links/].

I find it especially comfortable with relative URIs:

    To learn more about Foo [-> foo.html], go to the `class documentation` [-> class-foo.html]. Or go back to the index [-> ../index.html].

I really like the plain-text readability of this syntax. It does not use the underscores that usually indicate links within reST and is therefore inconsistent; I don't see this as a major problem, because I find the syntax quite indicative of its meaning, but if this is seen as a problem, another possibility is:

    This is an example [__ http://www.example.com].

However, I find that less readable.
Both syntaxes share the problem that the markup that makes a word/a backquoted phrase into a hyperlink is not directly connected to the word/phrase itself. However, they are still unambiguous, and especially the first jumps clearly out to my eye. My personal feeling is that some conceptual purity should be sacrificed for plain-text readability here.

Thanks for listening!

- Benja
[Benja Fallenstein]
`embedded links` [-> http://www.example.com/links/].
I find this very readable, although his point that it breaks the __ hyperlink connection is true.
This is an example [__ http://www.example.com]. However, I find that less readable.
This works for me, too. -Brett
Thanks for giving this so much attention. Notating hyper-links is a feature that will get a lot of use in venues outside pure documentation: message boards, wikis, etc. Replacing the simplicity of "text":http://link isn't an easy job, especially while trying to overcome its faults.

- Dean

--- Brett Cannon <bac@OCF.Berkeley.EDU> wrote:
[Benja Fallenstein]
`embedded links` [->
http://www.example.com/links/].
I find this very readable, although his point that it breaks the __ hyperlink connection is true.
This is an example [__
However, I find that less readable.
This works for me, too.
-Brett
Brett Cannon wrote:
[Benja Fallenstein]
`embedded links` [-> http://www.example.com/links/].
I find this very readable, although his point that it breaks the __ hyperlink connection is true.
I've come up with a third variation that doesn't break the _ convention as much as the other two:

    An `example hyperlink` <http://example.com>_.

Here, everything from the first backquote to the closing angle bracket can be seen as being made a hyperlink by the underscore. As long as there's only a single underscore, I find this still quite readable in plaintext; I read the backquotes as 'link markers' and the angle brackets as the specification where the link goes, and I ignore the final underscore. :-) (It *is* necessary to distinguish from XML/SGML tags.)

I find this variation even a little less obtrusive than the square bracket one. Depends on the context. Relative links are nice this way:

    The `specification` <spec.html>_ is explicit about this: no `identifier` <terms/identifier.html>_ may appear outside the `correct context` <#context>_.

- Benja
Benja Fallenstein wrote:
I'm new to this list but read the recent discussion about embedded hyperlinks in the archives. I find this issue quite important for plain-text readability: While for long URIs, putting them after a paragraph is good, for short URIs, it seems quite unnatural. (Short URIs include references to the root of a web site as well as most relative URIs-- both common cases for hyperlinks.) To give an example [see example.html], I often refer to web sites like this [http://example.com/]. Giving the example__ like this__ seems much less natural, and actually sometimes makes me not put in hyperlinks where I would normally put them (or if they're absolute, put them in like above, which reST renders by showing the URI).
__ example.html __ http://example.com/
Well put.
The 'most obvious' syntax extension to reST would, for me, be having the URI-in-square-brackets trap the last word, or last `backquoted phrase`, like the underscore put right after the word/phrase does. Obviously this doesn't work, since reST already interprets these differently. I have come up with an only slightly different possible syntax that doesn't seem to have been discussed before.
The first goes like this:
This is an example [-> http://www.example.com] of `embedded links` [-> http://www.example.com/links/].
There are several problems with this syntax. First, there's no indication that the word "example" is a reference until we see the target (as you mention further on). The "reference" role magically jumps back from the target to the word. Second, there's already a meaning associated with plain-backquoted words & phrases (no underscores): interpreted text. (I know it doesn't actually do anything yet; consider it reserved syntax.) The proposed syntax would require back-tracking in the parser, which I have no desire to implement. Adding trailing underscores solves both problems, resulting in::

    This is an example__ [-> http://www.example.com] of `embedded links`__ [-> http://www.example.com/links/].
I really like the plain-text readability of this syntax.
I agree, it *is* quite obvious at first glance what's going on. But the syntax is quite noisy and doesn't fit well with the rest of reStructuredText. A debatable point. Read on for a showstopper.
It does not use the underscores that usually indicate links within reST and is therefore inconsistent; I don't see this as a major problem, because I find the syntax quite indicative of its meaning, but if this is seen as a problem, another possibility is:
This is an example [__ http://www.example.com].
Same problem with back-tracking. The underscores are not in a useful place.

Unfortunately, there's a final, showstopper problem with this syntax: RFC 2732 ("Format for IPv6 Literal Addresses in URL's") adds the "[" and "]" characters to the set of possible URI characters. This means we can't surround URIs with "[]" with the current parser, which is intentionally limited in its inline markup parsing ability (uses regexps). Here's an example::

    http://[3ffe:2a00:100:7031::1]/

In fact, because IPv6 literal addresses end with "]", the parser specifically allows a "]" at the end of a URI. So if you put [http://example.com] in your text, the final bracket would mistakenly be included in the URI. I think that it's fair to say that a bracketed URI is much more likely than a standalone IPv6 URI, so I fixed the regexp to favor the former.
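To make the bracket problem concrete, here is a minimal Python sketch. The URI pattern and both function names are made up for illustration and are much simpler than the real Docutils regexp; the "favoring" rule shown (drop a trailing "]" unless it closes a "[" inside the URI) is only one possible way to prefer the bracketed reading, not necessarily the fix described above.

```python
import re

# Hypothetical, simplified URI pattern: NOT the actual Docutils regexp.
# "[" and "]" are allowed because RFC 2732 uses them for IPv6 literals.
URI = re.compile(r"https?://[A-Za-z0-9\[\]:./_~%-]+")


def find_uri_naive(text):
    """Return the first URI-looking substring, brackets and all."""
    m = URI.search(text)
    return m.group(0) if m else None


def find_uri_favoring_brackets(text):
    """Like find_uri_naive, but drop an unbalanced trailing "]".

    Assumption for this sketch: a "]" that does not close a "[" inside
    the URI belongs to the surrounding text, e.g. "[http://example.com]".
    """
    uri = find_uri_naive(text)
    if uri and uri.endswith("]") and uri.count("[") < uri.count("]"):
        uri = uri[:-1]
    return uri
```

With the naive matcher, the closing bracket of "[http://example.com]" ends up inside the URI; the second function strips it while leaving an IPv6 literal such as http://[3ffe:2a00:100:7031::1]/ untouched.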
Both syntaxes share the problem that the markup that makes a word/a backquoted phrase into a hyperlink is not directly connected to the word/phrase itself. However, they are still unambiguous, and especially the first jumps clearly out to my eye.
Unfortunately not unambiguous. The human eye/brain combination is much more flexible and forgiving of exceptions than program code and IETF specs ;) [in a follow-up:]
I've come up with a third variation that doesn't break the _ convention as much as the other two:
An `example hyperlink` <http://example.com>_.
From there it's a *very* short step back to::
An `example hyperlink <http://example.com>`_.

One underscore means "named", two means "anonymous", same as in the rest of the cases.
Thanks for listening!
Thanks for writing. While the proposed syntax variations didn't win me over, your initial rationale has provided the final nudge to convince me that such a construct is a worthwhile addition to reStructuredText. -- David Goodger <goodger@python.org> Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/
Hi David! David Goodger wrote:
Unfortunately, there's a final, showstopper problem with this syntax: RFC 2732 ("Format for IPv6 Literal Addresses in URL's") adds the "[" and "]" characters to the set of possible URI characters. This means we can't surround URIs with "[]" with the current parser, which is intentionally limited in its inline markup parsing ability (uses regexps). Here's an example::
Wow. Oops. Ok, point well taken; I've always missed that update to RFC 2396, so far, and have assumed that [] are still reserved URI chars.
[in a follow-up:]
I've come up with a third variation that doesn't break the _ convention as much as the other two:
An `example hyperlink` <http://example.com>_.
From there it's a *very* short step back to::
An `example hyperlink <http://example.com>`_.
One underscore means "named", two means "anonymous", same as in the rest of the cases.
Well, yes, it can be argued that it is just one backquote moving a little forwards. Ultimately, this is a question of taste, but I still find that the first version is quite a bit more readable; there, I'm able to parse the backquotes as a marker for the extent of the link (as in `example hyperlink`_), and the angle bracketed text as an annotation to the link-- my interpretation of the syntax is, `example hyperlink`_ with an interspersed annotation that gives the URI inline.

With `example hyperlink <http://example.com>`_, on the other hand, I find it harder to ignore the URI when reading: my eyes search for the corresponding closing marker to the first backquote, which in the context of reST I interpret as an opening marker (like an opening bracket). What happens is that the URI jumps into the foreground (because it's immediately before the closing backquotes my eyes are searching for) and no longer looks like the annotation I'm used to from plain text.

Now, I can understand that you don't want to implement backtracking in the parser for this, but I don't actually see why that's necessary (then again, I'm still trying to grasp how the parser works, so if I'm misinterpreting here, I'd be glad to be corrected). As far as I can see, in ``parsers/rst/states.py``, you already distinguish between inline literals and single-backquoted text; then at a later point I think you further distinguish between single-backquoted phrase refs (underscore at end) and single-backquoted domain-specific text (no underscore at end).

How about simply introducing another case, inline hyperlinks? The opening marker would be a single backquote (i.e., a backquote not preceded or followed by another backquote, as currently). The closing marker would be identified by the following regular expression::

    r'`\s*<' + uri + r'>_'

(Can be improved by allowing for a second underscore at the end and checking that whitespace or punctuation follows.)
Possibly we'd have to do a little more parsing to get the URI out of the angle brackets, but that won't be hard. -- Ok, maybe this isn't extremely beautiful, but from what I understand now it could work without implementing backtracking. Again, it's a matter of taste to decide whether this is worth the effort; because of the reasons above, in my humble opinion, it is ;-) - Benja
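As a rough illustration of the proposal above, the closing-marker expression can be tried out in a few lines of Python. This is only a sketch: the `uri` placeholder pattern, the `inline_link` pattern and the `find_links` helper are assumptions made for this example and are not taken from ``states.py``.

```python
import re

# Placeholder URI pattern (an assumption): any run of characters
# that contains no whitespace and no angle brackets.
uri = r"[^\s<>]+"

# Closing marker from the proposal, with the URI captured in a group.
closing = re.compile(r"`\s*<(" + uri + r")>_")

# A single opening backquote (not adjacent to another backquote),
# the link text, then the closing marker.
inline_link = re.compile(r"(?<!`)`(?!`)([^`]+)" + closing.pattern)


def find_links(text):
    """Return (link text, URI) pairs for `text` <uri>_ constructs."""
    return [(m.group(1), m.group(2)) for m in inline_link.finditer(text)]
```

For example, `find_links("The `specification` <spec.html>_ is explicit")` yields the pair ("specification", "spec.html"), while backquoted text followed by angle brackets but no underscore is left alone. The scan runs left to right without backtracking in the parser itself; whether the construct is conceptually sound is a separate question.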
[Benja]
I've come up with a third variation that doesn't break the _ convention as much as the other two:
An `example hyperlink` <http://example.com>_.
[David]
From there it's a *very* short step back to::
An `example hyperlink <http://example.com>`_.
[Benja]
Well, yes, it can be argued that it is just one backquote moving a little forwards. Ultimately, this is a question of taste,
No, it's not just taste. The `text in single backquotes` syntax is already used for the interpreted text construct. That's not going to be compromised.
but I still find that the first version is quite a bit more readable
I can understand that, and sympathize. But the realities of the markup mean that "`this <url>`_" is feasible, whereas "`this` <url>_" isn't. The latter would complicate the code and the markup model more than the feature is worth.
With `example hyperlink <http://example.com>`_, on the other hand, I find it harder to ignore the URI when reading
Putting it bluntly: so what? It's irrelevant. From the recently revised spec:

.. Caution::

   This construct offers easy authoring and maintenance of hyperlinks at the expense of general readability. Inline URIs, especially long ones, inevitably interrupt the natural flow of text. For documents meant to be read in source form, the use of independent block-level `hyperlink targets`_ is **strongly** recommended. The embedded URI construct is most suited to documents intended *only* to be read in processed form.
How about simply introducing another case, inline hyperlinks? The opening marker would be a single backquote (i.e., a backquote not preceded or followed by another backquote, as currently). The closing marker would be identified by the following regular expression::
r'`\s*<' + uri + r'>_'
That's just an end-run around the fact that "`this` <url>_" is two separate things, and the first has an independent meaning. Such an overloading would be a huge wart. How could I possibly explain it?
Ok, maybe this isn't extremely beautiful, but from what I understand now it could work without implementing backtracking.
The backtracking issue is minor compared to the interpreted text issue, which is a show-stopper. The construct has been implemented, and "`this <url>`_" is the syntax. Be glad that it's been implemented at all! It's an ugly convenience though. The ugliness was enough to delay implementation for 5 months, and almost enough to prevent implementation altogether.

-- David Goodger
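For readers who want to see the accepted "`this <url>`_" form taken apart, here is a small Python sketch. The `EMBEDDED` pattern and the `parse` helper are illustrative assumptions only; the actual Docutils implementation differs.

```python
import re

# Illustrative pattern only; not the Docutils implementation.
# One trailing underscore = named reference, two = anonymous.
EMBEDDED = re.compile(
    r"`(?P<text>[^`<]+?)\s*<(?P<uri>[^<>`\s]+)>`(?P<marker>__?)"
)


def parse(text):
    """Return (link text, URI, is_anonymous) for each embedded-URI link."""
    return [
        (m.group("text"), m.group("uri"), m.group("marker") == "__")
        for m in EMBEDDED.finditer(text)
    ]
```

Here the backquotes enclose both the text and the URI, so the underscore stays immediately adjacent to the closing backquote, exactly as in an ordinary `phrase reference`_.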
David Goodger wrote:
The construct has been implemented, and "`this <url>`_" is the syntax. Be glad that it's been implemented at all! It's an ugly convenience though. The ugliness was enough to delay implementation for 5 months, and almost enough to prevent implementation altogether.
It is implemented? Ok, I am glad. :-) I accept your decision, of course, but I'll reply to your other points below, anyway, because I do not agree with your reasoning: I believe that my proposed construct does not conflict with interpreted text (I would not have proposed it otherwise). It can still be seen as ugly, and if you choose to reject it, I don't argue-- the point of this post (as well as the last) is just to get the facts straight.
[Benja]
Well, yes, it can be argued that it is just one backquote moving a little forwards. Ultimately, this is a question of taste,
No, it's not just taste. The `text in single backquotes` syntax is already used for the interpreted text construct. That's not going to be compromised.
In my mind, `example` <example.html>_ is a single construct, just as `example`_ is; it is as well distinguished from `example` as `example`_ is. This is true to the human eye (which recognizes the angle brackets and URI), to the parser (which distinguishes the two constructs by the regexps), and to the syntax (which defines backquoted text to be a link if followed by an underscore, *with an optional whitespace-angle-url-angle construct in between*, and as interpreted text otherwise).

In other words, on all three levels, the difference between the three constructs is clear; but while the inner logic in the syntax is that the underscore is the signifier, and the angle-bracketed URI is inserted between the phrase and the underscore, the eye uses the URI as the signifier that the quoted text is a link. (It's quite hard to come up with an example where the author didn't put an underscore because they intend the interpreted text + angle-bracketed text interpretation, while a reader could reasonably assume that a link is meant; thus, I don't see this as a problem.)

The ugly thing is that this allows whitespace inside a construct, of course. This is what I see as the main tradeoff here: More readability vs. no whitespace in constructs.
but I still find that the first version is quite a bit more readable
I can understand that, and sympathize. But the realities of the markup mean that "`this <url>`_" is feasible, whereas "`this` <url>_" isn't. The latter would complicate the code and the markup model more than the feature is worth.
Hm, looking at the parser, I think it should be possible to reduce the code complication to some additional regexp trickery. I can give it a try if states.py's complexity is what you're worried about. OTOH, about the complication in the markup model (whitespace in constructs) there is no arguing.
With `example hyperlink <http://example.com>`_, on the other hand, I find it harder to ignore the URI when reading
Putting it bluntly: so what? It's irrelevant. From the recently revised spec:
.. Caution::
This construct offers easy authoring and maintenance of hyperlinks at the expense of general readability. Inline URIs, especially long ones, inevitably interrupt the natural flow of text. For documents meant to be read in source form, the use of independent block-level `hyperlink targets`_ is **strongly** recommended. The embedded URI construct is most suited to documents intended *only* to be read in processed form.
Now this I don't agree with at all. As I said in my first post, I'm not interested in long inline URIs (the current mechanism handles longer URIs better), but I find short inline URIs very common in plain text. I use them all the time, absolute URIs for references to web sites in e-mails (though I don't include the http:// there) and relative URIs for references to other files in documentation. In plain text, I put the URI/file name right next to the reference, so that readers can see what I'm talking about, to go to that site/file or maybe copy&paste it into a browser. In HTML, I want a clickable link made out of that. I don't mind if you use block-level constructs even for really short URIs, but I don't agree at all that short URIs interrupt the natural flow-- au contraire. (Wouldn't be discussing this otherwise :-) )
How about simply introducing another case, inline hyperlinks? The opening marker would be a single backquote (i.e., a backquote not preceded or followed by another backquote, as currently). The closing marker would be identified by the following regular expression::
r'`\s*<' + uri + r'>_'
That's just an end-run around the fact that "`this` <url>_" is two separate things, and the first has an independent meaning. Such an overloading would be a huge wart. How could I possibly explain it?
Hm, I guess I've explained my take on that above (ask if it's still unclear). That's what you get when you quote e-mail out of order :-) Just to reiterate: I accept your decision, I just want to be sure that you understand my proposal the way I meant it and base the decision on that. - Benja
Benja Fallenstein wrote:
In my mind, `example` <example.html>_ is a single construct, just as `example`_ is; it is as well distinguished from `example` as `example`_ is. This is true to the human eye
Not to my eye, not to my mind. The underscore has to be *immediately* adjacent to the backquote to provide an alternate meaning. After the URL is just too far away: action from a distance. To me, the `reftext <url>`_ construct says "a reference with text 'reftext' (and oh, by the way, it's a reference directly to this URL)". The angle brackets serve to parenthesize the URL within the reference, and "`...`_" encloses the whole.
(It's quite hard to come up with an example where the author didn't put an underscore because they intend the interpreted text + angle-bracketed text interpretation, while a reader could reasonably assume that a link is meant; thus, I don't see this as a problem.)
Not so hard. For this example, assume the default role of interpreted text is to indicate index entries: The `HTML element` <a> is used for hyperlinks.
The ugly thing is that this allows whitespace inside a construct, of course. This is what I see as the main tradeoff here: More readability vs. no whitespace in constructs.
It's not just whitespace; it's one thing vs. two, and the first thing isn't what it normally is because of the end of the second thing.
Hm, looking at the parser, I think it should be possible to reduce the code complication to some additional regexp trickery. I can give it a try if states.py's complexity is what you're worried about.
I'm not worried about the code. Much more important is the conceptual complication to the markup model.
I don't mind if you use block-level constructs even for really short URIs, but I don't agree at all that short URIs interrupt the natural flow-- au contraire. (Wouldn't be discussing this otherwise :-) )
We agree to disagree :)
Just to reiterate: I accept your decision, I just want to be sure that you understand my proposal the way I meant it and base the decision on that.
I understand it, and I thank you for your input, but it's just one data point in a series. I've been sitting on this proposal for 5 months, giving it *lots* of thought. Many people have chimed in, and I'm tired of the debate. My job is to make the best decision for the project as a whole, and I believe that has been done. Let's move on.

-- David Goodger
David-- David Goodger wrote:
Benja Fallenstein wrote:
Just to reiterate: I accept your decision, I just want to be sure that you understand my proposal the way I meant it and base the decision on that.
I understand it, and I thank you for your input, but it's just one data point in a series.
Of course.
I've been sitting on this proposal for 5 months, giving it *lots* of thought. Many people have chimed in, and I'm tired of the debate. My job is to make the best decision for the project as a whole, and I believe that has been done. Let's move on.
Ok. Thanks for listening, and thanks for developing ReST! - Benja
participants (4)
- Benja Fallenstein
- Brett Cannon
- David Goodger
- Dean Goodmanson