[Python-Dev] Re: other "magic strings" issues

"Martin v. Löwis" martin at v.loewis.de
Tue Nov 11 14:25:53 EST 2003


Guido van Rossum wrote:
>>The locale module has some things in this direction -- strxfrm and
>>strcoll, maybe? -- but I don't know what they do with unicode & doubt
>>they even exist on OS X.
> 
> 
> IMO, locale and Unicode shouldn't be mentioned in the same sentence.
> At least the part of the locale that defines properties of characters
> is subsumed in Unicode in a way that doesn't require you to specify
> the locale.  (Of course the locale is still important in defining
> things like conventions for formatting numbers and dates.)

Locale also matters for collation, though. The desire to collate
Unicode strings properly is reasonable, but you need to know which
locale's rules to apply. With Python's current locale model, one would
convert the Unicode string to the locale's encoding, and then perform
collation on the resulting byte string.
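
As a rough illustration of locale-based collation, here is a minimal
sketch for present-day Python 3, where locale.strxfrm takes a str
directly, so the explicit encoding step described above is not written
out. The helper name sort_with_locale and the sample words are
assumptions made for this example only.

```python
import locale


def sort_with_locale(words, locale_name="C"):
    """Return words sorted under the collation rules of locale_name.

    Hypothetical helper for illustration. The collation locale is
    process-wide state, so the previous setting is restored afterwards.
    """
    saved = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
        # strxfrm maps each string to a key whose ordinary comparison
        # matches the locale's collation order.
        return sorted(words, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)


# In the "C" locale, ordering follows code points, so uppercase sorts first.
print(sort_with_locale(["banana", "Apple", "cherry"]))
```

Passing a name such as "de_DE.UTF-8" should give that locale's ordering,
but only if the locale is installed on the system; otherwise setlocale
raises locale.Error. Because the setting is global to the process, only
one collation locale is in effect at any time.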

Of course, with an ICU wrapper, you could have multiple simultaneous
locales, and collate Unicode strings without converting them into byte
strings first.

http://cvs.sourceforge.net/viewcvs.py/python-codecs/picu/

Regards,
Martin



