[docs] typos etc at /howto/unicode.htm
chdb at blueyonder.co.uk
Sun Apr 22 18:17:42 CEST 2012
I was reading http://docs.python.org/howto/unicode.html
It's interesting and informative, but I encountered some slightly peculiar or
even misleading remarks.
In general, I try not to be pedantic, but I find that some technical topics,
such as this one on characters and encodings, are so tricky and so inherently
liable to be confusing, that the slightest peculiarity easily becomes a
For example, in the first paragraph, which I reproduce below (the bold face is mine):
History of Character Codes
. . .
(Actually the missing accents matter for English, too, which contains words
such as naïve and café, and some publications have house styles which
require spellings such as coöperate.)
For a while people just wrote programs that didn't display accents. I
remember looking at Apple ][ BASIC programs, published in French-language
publications in the mid-1980s, that had lines like these:
PRINT "FICHIER EST COMPLETE."
PRINT "CARACTERE NON ACCEPTE."
Those messages should contain accents, and they just look wrong to someone
who can read French.
I would like to make the following points:
1. "some publications have house styles which require spellings such
as coöperate." This example was unrecognisable and very surprising to me.
After a little research I found that it is so vanishingly rare that
the writer can only be referring to the New Yorker magazine, which is almost
unique in insisting on this and similar spellings. Why not use
straightforward examples that are actually recognisable across the
English-speaking and Python-using world, like the names André, Zoë, Noël or
Renée? After all, only a very small proportion of readers of Python
documentation are also readers of the New Yorker, but all have surely encountered
such names - for example among unfortunate but unavoidable celebrities.
2. "Apple ][ BASIC": I think this must be a typo for "Apple II BASIC".
3. Although I am not French and have only a very basic knowledge of the
language, I think the writer has again chosen poor examples - so poor that
the last sentence is simply not true. No - these examples would be perfectly
acceptable to French readers (especially in the mid-1980s). This is because
in French it is still perfectly normal to leave out the accents on upper-case
letters, and this is why such messages were nearly always capitalised
in the mid-1980s. Therefore, for the sake of the argument (as opposed to
any historical accuracy of the anecdote), you need to quote these examples
like this -
PRINT "Fichier est complete."
PRINT "Caractere non accepte."
4. The writer sometimes uses the term "base-16" (with hyphen) and
sometimes "base 16" (no hyphen). The difference is trivial enough, but I was
momentarily a little thrown because I am used to the more usual term
"hexadecimal". Is the term "base-16" the normal usage in the Python
community, or is the writer being a little obscure?
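
To make points 3 and 4 a little more concrete, here is a rough Python 3 sketch of my own (it is not taken from the HOWTO). The helper name strip_accents is hypothetical, and the accented spelling "Caractère non accepté." is my assumption about what the second message would look like with its accents restored. The first part shows how the accented and unaccented forms relate; the second shows that Python's own built-ins use both "hex" and a base of 16 for the same notation.

```python
import unicodedata

def strip_accents(text):
    # Hypothetical helper: decompose each character (NFD), then drop
    # the combining marks, leaving only the unaccented base letters.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

# Point 3: assumed accented spelling of the second message.
accented = "Caractère non accepté."
print(strip_accents(accented))          # Caractere non accepte.
print(strip_accents(accented.upper()))  # CARACTERE NON ACCEPTE.

# Point 4: "hexadecimal" and "base 16" name the same notation.
print(hex(255))       # 0xff
print(int("ff", 16))  # 255
```

The lower-case result is the form that looks wrong to a French reader, whereas the upper-case result matches the capitalised style quoted in the HOWTO.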