International characters and Python's string functions on Linux

John Max Skaller skaller at maxtal.com.au
Thu Aug 5 17:39:07 EDT 1999


On Thu, 5 Aug 1999 15:37:13 +0200, earlybird at mop.no (Alexander Staubo) wrote:

>I'm running Python on Slackware Linux, and it does not seem to provide a 
>full international character set in the string.uppercase and 
>string.lowercase constants: only A-Z, which isn't enough.
>
>Having studied the string module code a bit, I realize that Python is at 
>the mercy of the C API and its string/locale functions, so I'm thinking I 
>must probably configure Linux for a specific locale. Right? If so, how?

	Wrong. Python doesn't use C locale functions.
Also, it doesn't support ISO-10646 directly. If you wish to support
any internationalisation, the recommended approach is to conform
to ISO-10646, which is an International Standard.

	You can write your own code now, or you can borrow
mine: see

	http://www.triode.net.au/~skaller/unicode/index.html

for documented sources. 

	In Python today, you can support
internationalisation using the UTF-8 encoding, as my 
literate programming tool interscript does. UTF-8 is,
in my opinion, the best option in Python today, since it
is ASCII compatible and works with the 8-bit strings 
Python already has.
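
To illustrate the ASCII compatibility claim, here is a minimal sketch.
It assumes a present-day Python 3 interpreter (not the Python discussed
in this post, which had only 8-bit strings), and the sample word is
purely hypothetical.

```python
# Sketch only: assumes modern Python 3, where text (str) and 8-bit
# data (bytes) are separate types.
text = "Bl\u00e5b\u00e6rsyltet\u00f8y"   # hypothetical sample with Norwegian letters

utf8 = text.encode("utf-8")              # an ordinary 8-bit byte string

# Pure ASCII text is byte-for-byte identical in UTF-8.
ascii_only = "Python"
same_bytes = ascii_only.encode("utf-8") == ascii_only.encode("ascii")

# Every byte belonging to a non-ASCII character has its high bit set,
# so it can never be mistaken for an ASCII character.
non_ascii_bytes = [b for b in utf8 if b >= 0x80]

# Decoding recovers the original text unchanged.
round_trip = utf8.decode("utf-8")
```

The three non-ASCII letters take two bytes each, so the byte string is
longer than the text, while the ASCII letters keep their usual values.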

	The LAST thing you should do is give any
internal support to ISO-8859-x encodings such as Latin-1,
which are NOT UTF-8 compatible, nor are
they compatible with each other. In other words,
do NOT use any single (or double) byte international 
character set encodings. Use UTF-8, or use arrays
of integers representing the 31-bit UCS-4 
(direct) encoding of ISO-10646.
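
The incompatibility described above can be shown with a single byte.
This is a rough sketch assuming modern Python 3 and its standard
"iso-8859-1" and "iso-8859-5" codecs; the byte value is chosen only for
illustration.

```python
# Sketch only: assumes modern Python 3 and its bundled codecs.
raw = b"\xe9"                            # one byte from some ISO-8859-x file

as_latin1 = raw.decode("iso-8859-1")     # read as Latin-1
as_cyrillic = raw.decode("iso-8859-5")   # the same byte read as ISO-8859-5

# The lone byte is not valid UTF-8: it announces a multi-byte
# sequence that never arrives.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# The "array of integers" alternative: one code point per character,
# independent of any byte encoding.
code_points = [ord(ch) for ch in as_latin1 + as_cyrillic]
```

The same byte yields two different characters depending on which
ISO-8859-x table is assumed, whereas the code point list is unambiguous.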

	By the way, UTF-8 is the official
Linux encoding. Try 

	man utf-8

and replace the incorrect word 'unicode' with
ISO-10646 in the man page: UTF-8 provides
a full encoding of all 2^31 codepoints.
[Unicode is the first 2^16 of them]
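
As a sketch of how UTF-8 can cover all 2^31 codepoints, the function
below implements the original 1-to-6-byte scheme by hand. It assumes
modern Python 3, and the name utf8_encode_31bit is hypothetical. Note
that the current UTF-8 definition (RFC 3629) stops at U+10FFFF and four
bytes, so built-in codecs will not produce the 5- and 6-byte forms.

```python
def utf8_encode_31bit(cp):
    """Encode one code point (0 <= cp < 2**31) using the original
    1-to-6-byte UTF-8 scheme. Hypothetical helper, for illustration."""
    if not 0 <= cp < 2**31:
        raise ValueError("code point out of 31-bit range")
    if cp < 0x80:
        return bytes([cp])               # ASCII passes through unchanged
    # (exclusive upper limit, total bytes, lead-byte marker)
    for limit, length, lead in (
        (0x800, 2, 0xC0),
        (0x10000, 3, 0xE0),
        (0x200000, 4, 0xF0),
        (0x4000000, 5, 0xF8),
        (0x80000000, 6, 0xFC),
    ):
        if cp < limit:
            break
    out = []
    # Emit continuation bytes (10xxxxxx), lowest 6 bits first.
    for _ in range(length - 1):
        out.append(0x80 | (cp & 0x3F))
        cp >>= 6
    out.append(lead | cp)                # remaining bits go in the lead byte
    return bytes(reversed(out))


largest = utf8_encode_31bit(0x7FFFFFFF)  # the top of the 31-bit range
```

For code points up to U+10FFFF the result should agree with the
built-in "utf-8" codec; above that, only this hand-written form applies.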

John Max Skaller                ph:61-2-96600850              
mailto:skaller at maxtal.com.au       10/1 Toxteth Rd 
http://www.maxtal.com.au/~skaller  Glebe 2037 NSW AUSTRALIA



