[I18n-sig] Re: [Python-Dev] Unicode debate

Sin Hang Kin kentsin@poboxes.com
Sat, 29 Apr 2000 11:07:12 +0800


I have not quite followed the discussion, but I am interested in
Unicode-ifying Python:

Python should be able to serve as a native language for speakers of any
language, giving all nations a fair footing in computer programming. The
currently English-oriented Python syntax should be easy to port to other
languages, and Python programs written in any language should be convertible
to another one automatically. For example, a French-speaking child could use
French command words to write Python code, and that code could then be
converted to English, Chinese, ...

Backward compatibility is a must. The current implementation of Unicode
strings might break some code. The ability to convert to and from Unicode is
not enough. For example, a search engine might collect many texts in
different encodings, and I have seen mixed encodings within a single text. I
ran into this once with a Chinese application: I received a combined text
file that someone had collected partly from mainland China in GB encoding
and partly locally in Big-5 encoding. The person who collected the texts did
not read them carefully, and he had a powerful environment (richwin) which
automatically recognizes the encoding and adapts to it, so he just pasted
all the texts together. With such a mixed text, no conversion to or from
Unicode is able to handle it. Think of running a mailing list like this one,
with people quoting each other's messages and writing in their native
encodings: you will get a funny text collection in different encodings. The
same can happen to the digest of such a mailing list: you may try it now by
writing in all encodings :)
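
To make the mixed-encoding problem concrete, here is a rough sketch in
present-day Python. The sample strings and the helper try_decode are made up
for illustration; they are not taken from the file I received. It only shows
that decoding the whole pasted file with one codec does not give back the
original text, while decoding each part separately does, provided you
already know where each encoding begins and ends.

```python
# Illustrative sample text standing in for the pasted-together file.
simplified = "简体中文"    # as it might arrive from mainland China (GB)
traditional = "繁體中文"   # as it might be written locally (Big-5)

gb_part = simplified.encode("gb2312")
big5_part = traditional.encode("big5")
mixed = gb_part + b"\n" + big5_part          # one file, two encodings
expected = simplified + "\n" + traditional


def try_decode(data, encoding):
    """Hypothetical helper: decoded text, or None if the bytes are invalid."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# Decoding the whole file with a single codec: neither choice recovers
# the original text (the result is either an error or wrong characters).
as_gb = try_decode(mixed, "gb2312")
as_big5 = try_decode(mixed, "big5")

# Decoding part by part works, but only because the boundaries are known.
recovered = gb_part.decode("gb2312") + "\n" + big5_part.decode("big5")
```

In a real collected file nobody has recorded those boundaries, which is the
whole difficulty.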

So I prefer to let people choose their encoding. Setting a flag inside a
program would switch the internal handling between UTF-8 and 8-bit code. As
time passes we may drop that, but for now we cannot abandon the 8-bit code.
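
As a very rough sketch of what I mean by a flag, again in present-day Python:
the name use_utf8 and the function read_text are hypothetical and only stand
in for whatever switch the language might offer. This is one possible shape
of the idea, not a proposal for the actual mechanism.

```python
def read_text(data: bytes, use_utf8: bool):
    """Hypothetical helper; `use_utf8` stands in for the proposed flag."""
    if use_utf8:
        # Treat the data as UTF-8 and hand back a Unicode string.
        return data.decode("utf-8")
    # Otherwise leave the 8-bit data untouched, as old code expects.
    return data


as_unicode = read_text("é".encode("utf-8"), use_utf8=True)
as_bytes = read_text(b"\xa4\xa4", use_utf8=False)
```

Old programs would keep the flag off and see the same 8-bit strings as
before; new programs could turn it on.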

Rgs,

Kent Sin