[Python-Dev] Dash

Steven D'Aprano steve at pearwood.info
Fri Jul 19 06:51:39 CEST 2013


On 19/07/13 04:46, Serhiy Storchaka wrote:
> 18.07.13 20:48, Guido van Rossum wrote:
>> I believe there are only a few places where en-dashes should be used,
>> for most things you should use either em-dash or hyphen. Consult your
>> trusted typography source (for US English, please, punctuation
>> preferences vary by locale). E.g. Google for "em dash en dash".
>
> Currently the Python documentation uses en-dashes in most cases. Should we replace them with em-dashes? Should we remove spaces around dashes?
>
> Or should we replace the half-dozen em-dashes found in the Python documentation with en-dashes?
>
> I believe all hyphens used in place of a dash should be replaced with a dash (but an en- or em-dash?) in any case.

It depends on the context, and I don't believe you could completely automate the process (at least not without using something that understands natural language, like NLTK, and probably not even then). I think it will require a human reader to review them, like any other style and grammar edit.

Wikipedia has a good overview which mostly agrees with my typesetting and style books:

https://en.wikipedia.org/wiki/Dash

Hyphens are commonly used for compound words, although in practice hyphenated words gradually lose the hyphen. E.g. we usually write "inbox" rather than "in-box", although we still write "in-tray". Hyphens are also used at the end of the line to break a word to the next line.

En-dashes are used for durations and ranges (sometimes with a thin space on either side, otherwise a regular space can be used). E.g. "October–December".

En-dash is also used when making a compound word from words which themselves are compound words, e.g. "The pre–World War II economy" joins "pre-" with "World War II", not just "World".

Em-dashes are used for parenthetical asides, or to indicate a break in speech or thought. They are often used when a comma is too weak and a period is too strong—a bit like a colon.

Different sources give different recommendations regarding spaces around dashes. The Chicago Manual of Style agrees with most British sources that em-dashes should never have spaces around them, but the New York Times style guide sets hair-spaces around them. Unusually for me, I tend to agree with the NY Times on this one. A regular space is usually too wide.

Many of these conventions are style conventions, rather than strictly grammatical. For example, although English grammar says we can use an en-dash to make ranges of numbers, the SI standard recommends against expressions like "10–50 volts" since it can be mistaken for subtraction, and recommends "10 to 50 volts".

Optimistically, I think it would probably be safe[1] to replace " -- " or " --- " in text with "\N{THIN SPACE}\N{EM DASH}\N{THIN SPACE}" (or \N{HAIR SPACE} if you prefer) without human review, but for any other changes, I wouldn't even try to automate it.
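
As a rough illustration of that one replacement, here is a minimal sketch. The names SPACED_EM_DASH and replace_ascii_dashes are made up for this example, and requiring a non-space character on each side of the match is an extra assumption of mine, added so that a signature delimiter or a stray dash at the start of a line is left alone.

```python
import re

# Thin space, em dash, thin space, as suggested above.
# Swap in "\N{HAIR SPACE}" here if you prefer hair spaces.
SPACED_EM_DASH = "\N{THIN SPACE}\N{EM DASH}\N{THIN SPACE}"

# Exactly two or three hyphens with one space on each side, sitting
# between two non-space characters. Runs of four or more hyphens and
# unspaced "--" do not match.
_ASCII_DASH = re.compile(r"(?<=\S) -{2,3} (?=\S)")


def replace_ascii_dashes(text: str) -> str:
    """Replace ' -- ' and ' --- ' with a thin-spaced em dash."""
    return _ASCII_DASH.sub(SPACED_EM_DASH, text)


if __name__ == "__main__":
    sample = "Dashes are hard -- really hard --- to get right."
    # ascii() makes the otherwise invisible thin spaces visible.
    print(ascii(replace_ascii_dashes(sample)))
```

Note that this sketch knows nothing about markup: it would also rewrite " -- " inside code samples or literal blocks (a shell command such as "cmd -- file", say), so those would have to be excluded before running it over real documentation.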






[1] Famous last words.


-- 
Steven

