[issue9335] LC_CTYPE system setting not respected by setlocale()

STINNER Victor report at bugs.python.org
Mon Jul 26 01:27:21 CEST 2010


STINNER Victor <victor.stinner at haypocalc.com> added the comment:

> Victor, This looks like your cup of tea.

Unicode is my cup of tea, but not programs that treat bytes as characters.

<a byte string>.isalpha() doesn't mean anything to me :-)

This issue is more a question about the C library than about Python :-) So try the attached program "isalpha.c" if you would like to test your libc.

Results on my Linux box (Debian Sid, eglibc 2.11.2):
----------------
$ ./isalpha C
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz (52)

$ ./isalpha fr_FR.UTF-8
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz (52)

$ ./isalpha fr_FR.iso88591
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz\xaa\xb5\xba\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff (117)

$ ./isalpha fr_FR.iso885915@euro
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz\xa6\xa8\xaa\xb4\xb5\xb8\xba\xbc\xbd\xbe\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff (124)
----------------

If your libc considers \xff to be a valid UTF-8 character, you should change your OS for a better one :-)
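
For anyone who cannot build the attachment, here is a rough Python 3 sketch of the same kind of test. The source of isalpha.c is not reproduced in this message, so this is only my guess at an equivalent: it assumes a POSIX system where ctypes.CDLL(None) exposes the C library, and alpha_bytes() is a name made up for this example.

```python
import ctypes
import locale

# Assumption: POSIX system, where CDLL(None) gives access to the C library.
libc = ctypes.CDLL(None)


def alpha_bytes(locale_name):
    """Return the byte values accepted by libc's isalpha() under locale_name."""
    saved = locale.setlocale(locale.LC_CTYPE)  # one argument: query only
    try:
        # Raises locale.Error if the locale is not installed.
        locale.setlocale(locale.LC_CTYPE, locale_name)
        return bytes(c for c in range(256) if libc.isalpha(c))
    finally:
        # Put LC_CTYPE back so the test does not leak into the caller.
        locale.setlocale(locale.LC_CTYPE, saved)


if __name__ == "__main__":
    for name in ("C", "fr_FR.UTF-8", "fr_FR.iso88591", "fr_FR.iso885915@euro"):
        try:
            found = alpha_bytes(name)
        except locale.Error:
            print(name, "-> locale not installed")
        else:
            print(name, found, "(%d)" % len(found))
```

The counts for the non-C locales depend entirely on the libc and on which locales are installed, which is the point of the test.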

--

> >>> len(letters)
> 117
> ...
> >>> locale.setlocale(locale.LC_CTYPE)
> 'en_US.UTF-8'

It looks like Mac OS X uses ISO-8859-1 instead of UTF-8.

--

string.letters is built as strop.lowercase + strop.uppercase, which are in turn built using the C functions islower() and isupper(). locale.setlocale() regenerates strop/string.lowercase, strop/string.uppercase and string.letters for the LC_CTYPE and LC_ALL categories.
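
To make the mechanism concrete, the regeneration can be imitated roughly like this in Python 3 (where strop and string.letters no longer exist). This is a sketch, not the actual implementation: setlocale_and_rebuild() is a hypothetical name, and it assumes a POSIX system where ctypes.CDLL(None) exposes islower() and isupper().

```python
import ctypes
import locale

# Assumption: POSIX system, where CDLL(None) gives access to the C library.
libc = ctypes.CDLL(None)


def setlocale_and_rebuild(locale_name):
    """Set LC_CTYPE, then rebuild the letter tables from libc's classification.

    Like the behaviour described above, this changes the process-wide locale
    and leaves it changed.
    """
    locale.setlocale(locale.LC_CTYPE, locale_name)
    lowercase = bytes(c for c in range(256) if libc.islower(c))
    uppercase = bytes(c for c in range(256) if libc.isupper(c))
    letters = lowercase + uppercase  # same order as string.letters
    return lowercase, uppercase, letters
```

With an 8-bit locale such as ISO-8859-1 the tables grow beyond ASCII, so len(letters) depends on what the libc reports for bytes 0x80..0xff.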

--

You don't need to run IDLE or import Tkinter to set the locale:

   import locale; locale.setlocale(locale.LC_ALL, '')

is enough.

--

A library should not change the locale (only the application).

$ python2.6
>>> import locale
>>> locale.getlocale()
(None, None)
>>> import Tkinter
>>> locale.getlocale()
('fr_FR', 'UTF8')

=> Tkinter is a horrible library! (The bug is in the C library, not in the Python wrapper)

Use a better one like Gtk or Qt ;-)

$ python
>>> import locale
>>> import pygtk
>>> locale.getlocale()
(None, None)
>>> import PyQt4
>>> locale.getlocale()
(None, None)

(IDLE is based on Tkinter)
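
The check done by hand in the sessions above can be wrapped in a small helper. This is only a sketch for Python 3: locale_changes() and snapshot() are names invented for this example, and it looks at the categories that exist on every platform.

```python
import locale

# Categories available on all platforms (LC_MESSAGES is POSIX-only, so skipped).
CATEGORIES = {
    name: getattr(locale, name)
    for name in ("LC_CTYPE", "LC_COLLATE", "LC_TIME", "LC_MONETARY", "LC_NUMERIC")
}


def snapshot():
    """Return the current setting of each category (query only, no change)."""
    return {name: locale.setlocale(cat) for name, cat in CATEGORIES.items()}


def locale_changes(action):
    """Run action() and report which locale categories it modified.

    Returns a dict mapping category name to a (before, after) pair.
    """
    before = snapshot()
    action()
    after = snapshot()
    return {
        name: (before[name], after[name])
        for name in before
        if before[name] != after[name]
    }
```

For example, locale_changes(lambda: __import__("tkinter")) reports whatever the import touched; an empty dict means the library left the locale alone. Note that Python 3 already sets LC_CTYPE from the environment at startup, so the result may differ from the Python 2.6 session shown above.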

--

I don't understand why Alexander gets different results on Python 2.6 and Python 2.7.

@belopolsky: Are both programs linked to (built with?) the same C library? (same library version)

----------
Added file: http://bugs.python.org/file18202/isalpha.c

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue9335>
_______________________________________

