[Python-Dev] Re: other "magic strings" issues

Guido van Rossum guido at python.org
Tue Nov 11 14:56:35 EST 2003


> >>The locale module has some things in this direction -- strxfrm and
> >>strcoll, maybe? -- but I don't know what they do with unicode & doubt
> >>they even exist on OS X.
> > 
> Guido van Rossum wrote:
> > IMO, locale and Unicode shouldn't be mentioned in the same sentence.
> > At least the part of the locale that defines properties of characters
> > is subsumed in Unicode in a way that doesn't require you to specify
> > the locale.  (Of course the locale is still important in defining
> > things like conventions for formatting numbers and dates.)

[MvL]
> In particular, locale also matters for collation. So the desire to
> collate Unicode strings properly is reasonable, but you need to know
> what locale to use for collation. With Python's current locale model,
> one would convert the Unicode string to the locale's encoding, and
> then perform collation.

Ouch.  Seems you're right.
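
For illustration only, here is a minimal sketch of locale-based
collation as described above.  It is written for present-day Python 3,
where locale.strxfrm accepts str directly, so the explicit conversion to
the locale's encoding mentioned in the quoted text is not needed.  The
helper name collate and its "C" default are assumptions made for this
example, not part of the locale module.

```python
import locale


def collate(words, locale_name="C"):
    """Return words sorted by the collation rules of locale_name.

    Hypothetical helper for illustration. Raises locale.Error if the
    requested locale is not installed on the system.
    """
    # Remember the current collation locale so it can be restored.
    saved = locale.setlocale(locale.LC_COLLATE)
    try:
        # The locale is process-wide state: only one can be active at a time.
        locale.setlocale(locale.LC_COLLATE, locale_name)
        # strxfrm maps each string to a key whose ordinary comparison
        # matches locale-aware comparison (what strcoll would give).
        return sorted(words, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)
```

Calling collate(words, "de_DE.UTF-8") would sort by German rules if
that locale happens to be installed; the name is platform-dependent.
Because setlocale changes global state, this approach cannot hold
several locales at once.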

> Of course, with an ICU wrapper, you could have multiple simultaneous
> locales, and collate Unicode strings without converting them into byte
> strings first.
> 
> http://cvs.sourceforge.net/viewcvs.py/python-codecs/picu/

Is that something we could move into the std lib?

--Guido van Rossum (home page: http://www.python.org/~guido/)


