14 Sep 2004, 1:33 a.m.
Terry Reedy wrote:
usually shorter in languages with many ideographs (my non-scientific tests indicate that Chinese text uses about a quarter as many symbols as English; I'm sure someone can dig up better figures).
This is why I am not especially enamored of Unicode and the prospect of Python becoming married to it. It is heavily weighted in favor of efficiently representing Chinese and inefficiently representing English.
Don't confuse Unicode with its UCS-2 and UCS-4 encodings. On a conceptual level, good old 7-bit ASCII and 8-bit ISO-Latin-1 are both Unicode. </F>
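
To make the distinction between the character set and its encodings concrete, here is a minimal sketch in Python 3. The two sample strings are illustrative assumptions, not text from this thread, and `encoded_sizes` is a hypothetical helper. It only shows that the byte cost depends on the encoding chosen, and that ASCII and ISO-Latin-1 byte values coincide with the first 128 and 256 Unicode code points.

```python
# Sample strings are illustrative assumptions, not taken from the thread.
english = "Hello, world"
chinese = "你好，世界"


def encoded_sizes(text):
    """Return the byte length of `text` in a few Unicode encodings.

    The "-le" codec variants are used so that no byte-order mark
    is added to the counts.
    """
    return {
        name: len(text.encode(name))
        for name in ("utf-8", "utf-16-le", "utf-32-le")
    }


print(len(english), encoded_sizes(english))
print(len(chinese), encoded_sizes(chinese))

# Every ASCII byte decodes to the code point with the same number ...
assert all(bytes([i]).decode("ascii") == chr(i) for i in range(128))
# ... and the same holds for all 256 ISO-Latin-1 byte values.
assert all(bytes([i]).decode("latin-1") == chr(i) for i in range(256))
```

With these samples, the English string costs one byte per character in UTF-8, while the fixed-width 16-bit and 32-bit forms double and quadruple that. The cost is a property of the encoding, not of Unicode as a character set.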