Want to modify Python to support BIG5 characters in string

Chris Tavares christophertavares at earthlink.net
Sun Sep 23 16:43:22 EDT 2001


"Tiberius Teng" <tiberius at ms28.hinet.net> wrote in message
news:6c92a8bc.0109222241.1ca97971 at posting.google.com...
> I want to do some modification to make Python support BIG5 (Traditional
> Chinese) characters in string, because I don't use Unicode and Python
> doesn't like those '\' (backslashes) in string literals.
>
> BIG5 characters have two bytes, the first byte is from 0xA1 to 0xF9, and
> the second is from 0x40 to 0x7E and 0xA1 to 0xFE, which includes the '\'
> backslash character. So my plan is add a few line to make Python ignore
> the following backslashes if it encounters the first byte in BIG5.
>
> I have Visual C++ 6 installed, and downloaded Python 2.1.1 source tree. I
> can compile it out-of-box so modify/compiling should not have problem. I
> just can't find out where should I modify ... Can somebody help me ?
>
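
To make the problem concrete, here is a minimal sketch of the byte ranges
you describe. The name is_big5_pair is made up for illustration, and the
ranges are copied from your message rather than from an official BIG5
table, so treat it as a rough check only.

```python
def is_big5_pair(lead, trail):
    """Return True if (lead, trail) fits the BIG5 ranges quoted above."""
    lead_ok = 0xA1 <= lead <= 0xF9
    trail_ok = 0x40 <= trail <= 0x7E or 0xA1 <= trail <= 0xFE
    return lead_ok and trail_ok

BACKSLASH = ord("\\")  # 0x5C

# 0x5C falls inside the 0x40-0x7E trail range, so a backslash byte
# can be the second half of a BIG5 character ...
backslash_can_be_trail = is_big5_pair(0xA5, BACKSLASH)

# ... while an ASCII lead byte never starts a BIG5 pair.
ascii_lead_is_pair = is_big5_pair(0x41, BACKSLASH)
```

This is why a byte-oriented tokenizer cannot tell a real backslash from the
tail of a two-byte character without tracking lead bytes.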

Why don't you want to use Unicode? Dealing with multibyte character
encodings is a real pain in the behind. Usually the best solution is to
write an encoder/decoder to convert BIG5 to and from Unicode. Then you can
read in your BIG5 files, process them internally in Unicode, and then write
them back out in BIG5.
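
A rough sketch of that round trip, assuming a Python whose standard library
provides a "big5" codec (if yours does not, you would need to write or
install one as described above). The sample bytes and the replace() call
are only placeholders for real data and real processing.

```python
# Assumption: a codec named "big5" is available in this Python.
# One BIG5 character whose trail byte is 0x5C, followed by a real backslash.
raw = b"\xa5\x5c" + b"\\"

text = raw.decode("big5")              # BIG5 bytes -> Unicode string

# At the byte level there are two 0x5C bytes; after decoding, only the
# genuine backslash remains a backslash character.
byte_backslashes = raw.count(b"\\")
char_backslashes = text.count("\\")

processed = text.replace("\\", "/")    # stand-in for real processing
out = processed.encode("big5")         # Unicode string -> BIG5 bytes
```

Once the data is decoded, string operations work on whole characters, so
the trail-byte backslash is never mistaken for an escape or a separator.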

-Chris






