[Python-Dev] Re: other "magic strings" issues

Martin v. Löwis martin at v.loewis.de
Wed Nov 12 18:09:51 EST 2003


David Eppstein <eppstein at ics.uci.edu> writes:

> It does?

Sure:

Python 2.3 (#26, Aug  1 2003, 09:50:29)
[GCC 3.3 20030226 (prerelease) (SuSE Linux)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.setlocale(locale.LC_ALL,"")
'LC_CTYPE=de_DE@euro;LC_NUMERIC=de_DE@euro;LC_TIME=de_DE@euro;LC_COLLATE=C;LC_MONETARY=de_DE@euro;LC_MESSAGES=de_DE@euro;LC_PAPER=de_DE@euro;LC_NAME=de_DE@euro;LC_ADDRESS=de_DE@euro;LC_TELEPHONE=de_DE@euro;LC_MEASUREMENT=de_DE@euro;LC_IDENTIFICATION=de_DE@euro'
>>> locale.strcoll(u"universit\xe4t",u"University")
32
>>> locale.setlocale(locale.LC_ALL,"en_US")
'en_US'
>>> locale.strcoll(u"universit\xe4t",u"University")
-24

> Even if locale would allow me to set a locale, which locale should I
> set, in order to allow all unicodes (not just e.g. iso-8859-1, but all
> of them) to be collated in some reasonable order?

Define "reasonable order". There is no "reasonable order" independent
of the language. In German text, Japanese characters are simply not
expected. Most Germans cannot tell Katakana from Hiragana, so it
just does not matter to them how those collate. Likewise, I guess most
Japanese won't see a difference between an umlaut and a circumflex.
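
To make the point concrete, here is a minimal sketch of sorting the same
words under different collation locales. It assumes Python 3 (the session
above is Python 2.3), and the helper name sort_for_locale and the locale
names in the loop are illustrative assumptions, not part of the original
discussion. Which locales exist depends on the system, so the helper
returns None when one cannot be set.

```python
import locale


def sort_for_locale(words, locale_name):
    """Sort words by the collation rules of locale_name.

    Hypothetical helper for illustration: returns None if the locale
    is not installed on this system.
    """
    # Calling setlocale with only a category queries the current setting.
    saved = locale.setlocale(locale.LC_COLLATE)
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None
    try:
        # strxfrm maps each string to a key whose ordinary comparison
        # matches what strcoll would report for the active locale.
        return sorted(words, key=locale.strxfrm)
    finally:
        # Restore the previous collation setting.
        locale.setlocale(locale.LC_COLLATE, saved)


words = ["University", "universit\xe4t", "Zebra", "apple"]

# The locale names below are assumptions; adjust them to what is installed.
for name in ("C", "de_DE.UTF-8", "en_US.UTF-8"):
    result = sort_for_locale(words, name)
    if result is None:
        print(name, "-> not available on this system")
    else:
        print(name, "->", result)
```

In the "C" locale the order follows raw code points, so all uppercase
ASCII letters sort before lowercase ones; language-specific locales
typically interleave the cases and place the umlaut near "a". The
setting is process-wide, which is why the helper restores it afterwards.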

Regards,
Martin


