Can Python be used for UTF/Double Bytes/Asian Characters?

Martin von Loewis loewis at informatik.hu-berlin.de
Tue Jun 5 04:15:49 EDT 2001


David LeBlanc <whisper at oz.nospamnet> writes:


> I don't know about DBCS or Asian characters, but Unicode is a standard
> feature in Python 2.1 (and maybe also Python 2.0, but I'm only sure about
> 2.1).

To process DBCS or Asian characters, you need a codec that converts
your encoding into Unicode; you then do all further processing on
Unicode strings.  Python 2.0 does not include any DBCS codecs (unless
you also count UTF-16 as DBCS), but a few codecs are available in the
CVS repository of http://sourceforge.net/projects/python-codecs.
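
A minimal sketch of that decode, process, re-encode flow. It assumes a
modern Python 3, where a "shift_jis" codec ships with the standard
library (unlike the Python 2.0 discussed here); the function name and
the sample string are made up for illustration.

```python
# Hypothetical helper: names and sample data are illustrative only.
def uppercase_dbcs(raw: bytes, encoding: str = "shift_jis") -> bytes:
    """Decode DBCS bytes to Unicode, process the text, re-encode."""
    text = raw.decode(encoding)        # bytes -> Unicode string
    processed = text.upper()           # do the real work on Unicode
    return processed.encode(encoding)  # Unicode string -> bytes


# Three kanji (two bytes each in Shift_JIS) followed by " abc".
sample = "\u65e5\u672c\u8a9e abc".encode("shift_jis")
result = uppercase_dbcs(sample)

# Byte length and character length differ for DBCS data.
print(len(sample), len(sample.decode("shift_jis")))
```

The point of the sketch is only that the processing step sees
characters, not bytes, so it cannot split a double-byte character.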

> I don't know if the regular expression stuff has been made unicode 
> aware or not. 

It has been Unicode-aware since Python 1.6.
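
As a small illustration, assuming a modern Python 3 (where str patterns
match Unicode by default; in the 2.x releases of this era one would
pass Unicode strings and the re.UNICODE flag instead), \w matches
non-ASCII word characters as well. The sample text is made up.

```python
import re

# Three kanji followed by some ASCII words (illustrative data).
text = "\u65e5\u672c\u8a9e and English"

# \w+ matches runs of Unicode word characters, not just [A-Za-z0-9_].
words = re.findall(r"\w+", text)
print(words)
```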

Regards,
Martin


