[EuroPython] i18n and related issues
M.-A. Lemburg
mal@lemburg.com
Tue, 29 Apr 2003 18:15:18 +0200
Magnus Lyckå wrote:
> A subject which is typically ignored in python books, and very relevant
> in a python business perspective is localization and internationalization.
>
> Honestly, I haven't tried very hard, but this seems a bit confusing for
> me in Python. (Not that I can say that it's better in other languages.)
>=20
> Is it just me, or do others feel the same?
>
> How important is this to people, and what could be done about it? Something
> to discuss at Europython 2003? (Are there plans for any BoFs etc.)
We can discuss many things, but would that make any difference?
Getting this right is a lot of work and I don't see any funding
or interest from volunteers to get any of it done.
> This field contains a number of issues, from translation of messages to
> input and output of data formatted according to locale, and issues like
> Unicode and right-to-left text etc. It seems to me that this is far from
> ideal today. At least in Windows 2000, locale seems buggy:
>
> >>> locale.getdefaultlocale()
> ('sv_SE', 'cp1252')
> >>> locale.setlocale(locale.LC_ALL, '')
> 'Swedish_Sweden.1252'
> >>> locale.getlocale()
> ['Swedish_Sweden', '1252']
>
> You see, they aren't the same! This leads to:
You're mixing character sets with locales here.
> >>> locale.resetlocale()
> Error: locale setting not supported
>
> Also, the example from the manual breaks:
> >>> locale.setlocale(locale.LC_ALL, 'de')
> Error: locale setting not supported
That's probably because your Windows version doesn't support the
German locale. No surprise here :-) It should work on Linux which
usually comes with all sorts of locale information.
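When a locale name may be missing on a given platform, one defensive pattern is to try several names and keep the first that is accepted. The following is only a minimal sketch: the helper `set_first_supported` and the candidate names are hypothetical, and which names actually exist depends on the operating system.

```python
import locale

def set_first_supported(candidates, category=locale.LC_ALL):
    """Try each locale name in turn; return the first one the platform accepts."""
    for name in candidates:
        try:
            locale.setlocale(category, name)
            return name
        except locale.Error:
            continue  # this platform does not know that name
    return None  # nothing matched; the locale is left unchanged

# Hypothetical candidates (assumption): adjust to the platforms you target.
chosen = set_first_supported(["de_DE.UTF-8", "de_DE", "German", "C"])
print(chosen)
```

Ending the list with "C" guarantees that some locale is set, at the cost of silently losing the German conventions.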
> Also note, that "some string".decode(locale.getlocale()[1]) won't work
> in Windows, but "some string".decode(locale.getdefaultlocale()[1])
> will work. There is no code page called just '1252'. Certainly confusing
> and non-obvious to me. Not fun if we try to use non-default settings.
The codec registry only knows about "cp1252" because that's
the standard name. We can't go about and add all possible
aliases for each and every encoding out there.
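Application code can bridge the gap itself by normalising whatever the locale module reports before handing it to the codec machinery. A minimal sketch, assuming a hypothetical helper named `codec_name`; it makes no assumption about whether a particular Python version already registers the bare number as an alias:

```python
import codecs

def codec_name(encoding):
    """Return the canonical codec name for a locale-reported encoding.

    Tries the name as given, then with a 'cp' prefix, so that a bare
    code-page number such as '1252' resolves to 'cp1252'.
    """
    for candidate in (encoding, "cp" + encoding):
        try:
            return codecs.lookup(candidate).name
        except LookupError:
            continue  # not registered under this spelling
    raise LookupError("no codec found for %r" % encoding)

# Decode some cp1252 bytes using the bare number a Windows locale may report.
print(b"sm\xf6rg\xe5s".decode(codec_name("1252")))
```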
> How do we display dates and times according to locale? Does the new
> date module handle that? locale.atof and locale.format can at least
> display floats right. (I think.)
>
> To the extent that the code is there, it's just briefly described in the
> docs, and very little in all the Python books out there. Is this really
> such a peripheral issue?
Probably not too interesting to the US folks :-) Everybody
else seems to be using their own little tool sets for this.
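For the float and date question quoted above, the standard library pieces can be combined roughly as follows. This is a sketch only: it assumes a Python that provides `locale.format_string`, the helper names `format_amount` and `format_date` are hypothetical, and the output depends entirely on which locale is active.

```python
import locale
import time

def format_amount(value):
    """Format a float using the current LC_NUMERIC conventions."""
    return locale.format_string("%.2f", value, grouping=True)

def format_date(moment):
    """Format a struct_time using the current LC_TIME date representation."""
    return time.strftime("%x", moment)

# "C" is used here only because every platform has it; pass "" instead to
# pick up the user's own settings.
locale.setlocale(locale.LC_ALL, "C")

text = format_amount(1234567.891)
print(text, locale.atof(text))   # atof() parses what format_string() wrote
print(format_date(time.gmtime(0)))
```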
> What about unicode and locale. They don't seem to get along extremely
> well today... For instance it seems x.sort(locale.strcoll) can't handle
> Unicode strings.
Right, collation support is still missing from the Unicode
implementation.
> Ok, in case of doubt, refuse the temptation to guess. No
> one locale defines collation for all of unicode, but there will be more
> and more cases where we want to sort names from all over the world, with
> at least accents in place. How?
Using the collation support defined in the Unicode
standard (provided that someone writes the support code
needed for the Python implementation).
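Until such support exists, a rough stand-in is to sort on the base letters with the accents stripped. To be clear, the sketch below is not the Unicode collation algorithm, only a crude approximation built on `unicodedata`: it ignores language-specific rules (Swedish, for example, sorts Å after Z), and the helper name `rough_sort_key` is hypothetical.

```python
import unicodedata

def rough_sort_key(text):
    """Primary key: letters with combining marks removed, case-folded.

    The original string is the tie-breaker, so the order is deterministic.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return (base.casefold(), text)

names = ["Zoë", "Ångström", "Émile", "eve", "Zoe", "Adam"]
print(sorted(names, key=rough_sort_key))
```

This keeps accented names next to their unaccented neighbours instead of pushing them to the end of the list, which is what a plain code-point sort does.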
> locale.nl_langinfo is only available on some platforms... Etc etc.
-- 
Marc-Andre Lemburg
eGenix.com
Professional Python Software directly from the Source (#1, Apr 29 2003)
>>> Python/Zope Products & Consulting ... http://www.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
________________________________________________________________________
EuroPython 2003, Charleroi, Belgium: 56 days left