[Doc-SIG] Idea: make double-space between sentences meaningful
Beni Cherniavsky
cben at users.sf.net
Sat May 15 18:57:57 EDT 2004
Some formats (notably LaTeX) support the typographical convention (of
some languages, e.g. English but not French IIRC) of putting a bigger
space after the end of a sentence than between words. LaTeX tries to
guess intelligently but can fail. Its guessing can be explicitly
overridden [1]_.
Currently, reST provides no way to convey this information to the output
format. Producing high-quality output requires this information. There
already exists an obvious convention supported by programs (e.g. Emacs)
for representing it in plain text: just use a double space after the end
of a sentence. I propose to make this official for reStructuredText:
more than one space between words after punctuation [2]_ signifies a
sentence end [3]_.
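
To make the rule concrete, here is a minimal sketch in Python (not part of any existing reST parser) that splits a single line of text at sentence ends. The function name ``split_sentences`` and the default punctuation set are assumptions made for illustration only; the proposal deliberately leaves "punctuation" open to a wider definition.

```python
import re

def split_sentences(text, punct=".!?"):
    """Split `text` wherever punctuation is followed by two or more spaces.

    `punct` is a hypothetical, deliberately narrow set of sentence-final
    characters; a real implementation would probably accept more.
    """
    # Lookbehind: the gap must directly follow a punctuation character.
    # Lookahead: the gap must be followed by more text on the same line.
    gap = re.compile("(?<=[" + re.escape(punct) + r"]) {2,}(?=\S)")
    return gap.split(text)

sentences = split_sentences("It works.  Mr. Smith agrees.  Done.")
print(sentences)
```

Note that the single space after "Mr." is not treated as a sentence end, which is exactly the kind of case a guessing heuristic tends to get wrong.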
Backward compatibility: at worst, it will force all sentence ends to
single spaces in existing documents that don't use the double-space
convention in the reST source. It's a good bet that anybody who cares
about it in his LaTeX output also cares about his source, but it's a
good idea to make this a parser option (defaulting off?)...
It is even possible, if desired, to support this in HTML output, using
some hack (``&nbsp;`` won't do because we *want* it to be breakable -
it's even better there; perhaps ``<span class="sentence-end"> </span>``
with appropriate CSS?).
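
A rough sketch of what such an HTML writer step might look like, in Python. The function name ``sentence_gaps_to_html`` is hypothetical, the punctuation set is an assumption, and this is not how any existing writer works; it only shows the span substitution suggested above.

```python
import html
import re

# Assumed, narrow definition of sentence-final punctuation.
_GAP = re.compile(r"(?<=[.!?]) {2,}(?=\S)")

# The span wraps an ordinary (breakable) space, so line wrapping still works.
SENTENCE_SPACE = '<span class="sentence-end"> </span>'

def sentence_gaps_to_html(text):
    """Escape `text` for HTML and mark double-space sentence gaps."""
    escaped = html.escape(text, quote=False)
    return _GAP.sub(SENTENCE_SPACE, escaped)

print(sentence_gaps_to_html("A < B.  So what?"))
```

A stylesheet could then give the ``sentence-end`` class some extra width; which CSS property suits that best is left open here.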
.. [1] By using ``\@.`` for a sentence end and ``.\ `` for a sentence
non-end. See `The Not So Short Introduction to LaTeX 2ε`__,
section 2.6.
__ http://www.ctan.org/tex-archive/info/lshort/english/lshort.pdf
.. [2] Punctuation should be taken in a wide sense of the word. E.g.
many people end a sentence with a smiley without putting a period
after it ;-).
.. [3] A period at end-of-line should be considered a sentence end per
Emacs conventions (it actually avoids putting non-sentence-end
periods at end-of-line when refilling paragraphs!). However, if
there is a trailing whitespace, it should be used to decide (in
the style of RFC 2646 - wrap lines *after* the whitespace - which
is the only unambiguous way to retain spacing info at line ends;
some editors (pico/nano) use this only when there is more than
one space - this algorithm will support them all).
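
One possible reading of the end-of-line rule in footnote 3, sketched in Python. The function name ``line_ends_sentence`` and the punctuation set are assumptions, and so is the interpretation that a single trailing space means "not a sentence end" while two or more mean "sentence end".

```python
def line_ends_sentence(line, punct=".!?"):
    """Decide whether `line` (without its newline) ends a sentence.

    Assumed interpretation of the footnote:
    - no trailing spaces: punctuation at end of line ends a sentence
      (the Emacs convention);
    - trailing spaces present: they decide, as between words, so one
      space is an ordinary gap and two or more mark a sentence end.
    """
    stripped = line.rstrip(" ")
    if not stripped or stripped[-1] not in punct:
        return False  # no punctuation before the line end
    trailing = len(line) - len(stripped)
    if trailing == 0:
        return True
    return trailing >= 2

for sample in ["The end.", "See e.g. ", "The end.  ", "no punctuation"]:
    print(repr(sample), line_ends_sentence(sample))
```

Under this reading, editors that always keep the wrap-point whitespace and editors that keep it only when it is more than one space produce the same result.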
--
Beni Cherniavsky <cben at users.sf.net>
Note: I can only read email on week-ends...