[docs] [issue20906] Issues in Unicode HOWTO
report at bugs.python.org
Sun Mar 16 18:35:54 CET 2014
Antoine Pitrou added the comment:
Do you want to provide a patch?
> In a narrative such as the current article, a code point value is usually written in hexadecimal.
I find the use of the word "narrative" intimidating in the context of technical documentation.
In general, I find it disappointing that the Unicode HOWTO only gives hexadecimal representations of non-ASCII characters and (almost) never represents them in their true form. This makes things more abstract than necessary.
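To illustrate, here is a minimal sketch (assuming Python 3 and a terminal that can display the characters) that shows each character in its true form next to the hexadecimal notation. The helper name `describe` is hypothetical and is not taken from the HOWTO.

```python
import unicodedata

def describe(text):
    """Pair each character with its code point (hex) and its Unicode name."""
    return [(ch, "U+%04X" % ord(ch), unicodedata.name(ch)) for ch in text]

# Print the character itself first, then the more abstract representations.
rows = describe("é€")
for ch, code_point, name in rows:
    print(ch, code_point, name)
```

Putting the glyph alongside the code point lets the reader connect the hexadecimal value to something they recognize.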
> This is a vague claim. Probably what was intended was: "Many Internet standards define protocols in which the data must contain no zero bytes, or zero bytes have special meaning." Is this actually true? Are there "many" such standards?
I think it actually means that Internet protocols assume an ASCII-compatible encoding (which UTF-8 is, but not UTF-16 or UTF-32 - nor EBCDIC :-)).
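A short sketch of what ASCII compatibility means in practice, assuming Python 3. The sample string and variable names are only for illustration.

```python
text = "abc"

utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")
utf32 = text.encode("utf-32-le")

# UTF-8 encodes ASCII characters as the very same bytes ASCII uses.
print(utf8 == text.encode("ascii"))  # True

# UTF-16 and UTF-32 pad each ASCII character with zero bytes.
print(utf16)                         # b'a\x00b\x00c\x00'
print(len(utf32))                    # 12

# Only the UTF-8 form is free of zero bytes.
print(b"\x00" in utf8, b"\x00" in utf16, b"\x00" in utf32)
```

A protocol that treats its data as ASCII can therefore pass UTF-8 text through unchanged, while the UTF-16 and UTF-32 forms contain zero bytes that such a protocol may misinterpret.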
> --> "Non-Unicode code systems usually don't handle all of the characters to be found in Unicode."
The term *encoding* is used pervasively when dealing with the transformation of Unicode to/from bytes, so I find it confusing to introduce another term here ("code systems"). I prefer the original sentence.
Python tracker <report at bugs.python.org>