[Python-Dev] Unicode debate
Christopher Petrilli
petrilli@amber.org
Thu, 27 Apr 2000 12:48:16 -0400
Guido van Rossum [guido@python.org] wrote:
> I've heard a few people claim that strings should always be considered
> to contain "characters" and that there should be one character per
> string element. I've also heard a clamoring that there should only be
> one string type. You folks have never used Asian encodings. In
> countries like Japan, China and Korea, encodings are a fact of life,
> and the most popular encodings are ASCII supersets that use a variable
> number of bytes per character, just like UTF-8. Each country or
> language uses different encodings, even though their characters look
> mostly the same to western eyes. UTF-8 and Unicode is having a hard
> time getting adopted in these countries because most software that
> people use deals only with the local encodings. (Sounds familiar?)
Actually, a bigger concern that we hear from our customers in Japan is
that Unicode has *serious* problems with Asian languages. They
"unified" the Chinese and Japanese characters rather than encoding
both, and therefore cannot represent lots of phrases quite right. I can have
someone write up a better description, but I was told by several
Japanese people that they wouldn't use Unicode come hell or high
water, basically.
Basically it's JIS, Shift-JIS or nothing for most Japanese
companies. This was my experience working with Konica a few years ago
as well.
Chris
--
| Christopher Petrilli
| petrilli@amber.org