Guido van Rossum [email@example.com] wrote:
I've heard a few people claim that strings should always be considered to contain "characters" and that there should be one character per string element. I've also heard a clamoring that there should only be one string type. You folks have never used Asian encodings. In countries like Japan, China and Korea, encodings are a fact of life, and the most popular encodings are ASCII supersets that use a variable number of bytes per character, just like UTF-8. Each country or language uses different encodings, even though their characters look mostly the same to western eyes. UTF-8 and Unicode are having a hard time getting adopted in these countries because most software that people use deals only with the local encodings. (Sound familiar?)
Actually, a bigger concern that we hear from our customers in Japan is that Unicode has *serious* problems with Asian languages. They chose to "unify" the Chinese and Japanese characters rather than encode both sets, and therefore cannot represent lots of phrases quite right. I can have someone write up a better description, but I was told by several Japanese people that they wouldn't use Unicode come hell or high water, basically.
Basically it's JIS, Shift-JIS or nothing for most Japanese companies. This was my experience working with Konica a few years ago as well.