Re: [Python-Dev] Unicode debate

27 Apr 2000

      Guido van Rossum [guido@python.org] wrote:
...
I've heard a few people claim that strings should always be considered
to contain "characters" and that there should be one character per
string element.  I've also heard a clamoring that there should only be
one string type.  You folks have never used Asian encodings.  In
countries like Japan, China and Korea, encodings are a fact of life,
and the most popular encodings are ASCII supersets that use a variable
number of bytes per character, just like UTF-8.  Each country or
language uses different encodings, even though their characters look
mostly the same to western eyes.  UTF-8 and Unicode is having a hard
time getting adopted in these countries because most software that
people use deals only with the local encodings.  (Sounds familiar?)
Actually a bigger concern that we hear from our customers in Japan is
that Unicode has *serious* problems in Asian languages.  They chose
the "unification" of Chinese and Japanese characters rather than
keeping both, and therefore cannot represent lots of phrases quite
right.  I can have someone write up a better description, but I was
told by several
Japanese people that they wouldn't use Unicode come hell or high
water, basically.

Basically it's JIS, Shift-JIS or nothing for most Japanese
companies.  This was my experience working with Konica a few years ago 
as well.

Chris
-- 
| Christopher Petrilli
| petrilli@amber.org