[New-bugs-announce] [issue1618] locale.strxfrm can't handle non-ascii strings

Filip Salomonsson report at bugs.python.org
Thu Dec 13 22:41:48 CET 2007


New submission from Filip Salomonsson:

locale.strxfrm currently fails on non-ASCII strings; it raises a TypeError:

$ ./python
Python 3.0a2 (py3k:59482, Dec 13 2007, 21:27:14) 
[GCC 4.1.2 20070626 (Red Hat 4.1.2-14)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.setlocale(locale.LC_COLLATE, "en_US.utf8")
'en_US.utf8'
>>> locale.strxfrm("a")
'\x0c\x01\x08\x01\x02'
>>> locale.strxfrm("\N{LATIN SMALL LETTER A WITH DIAERESIS}")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: strxfrm() argument 1 must be string without null bytes, not str

The attached patch tries to fix this. With it applied, the same calls
succeed, and sorting by strxfrm keys matches sorting with strcoll:

$ ./python
Python 3.0a2 (py3k:59482M, Dec 13 2007, 21:58:09) 
[GCC 4.1.2 20070626 (Red Hat 4.1.2-14)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.setlocale(locale.LC_COLLATE, "en_US.utf8")
'en_US.utf8'
>>> locale.strxfrm("a")
'.\x01\x10\x01\x02'
>>> locale.strxfrm("\N{LATIN SMALL LETTER A WITH DIAERESIS}")
'.\x01\x19\x01\x02'
>>> alist = list("aboåäöABOÅÄÖñÑ")
>>> sorted(alist, cmp=locale.strcoll) == sorted(alist, key=locale.strxfrm)
True
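
For readers on a current Python 3, where sorted() no longer accepts a
cmp= argument, the same check can be sketched roughly as below. This is a
minimal illustration, not part of the patch: the helper name
sort_both_ways and the candidate locale names are assumptions, and the
locales actually installed vary by platform. functools.cmp_to_key wraps
locale.strcoll so it can be passed as a key function.

```python
import functools
import locale


def sort_both_ways(items, candidates=("en_US.UTF-8", "en_US.utf8", "")):
    """Sort items by strxfrm keys and by strcoll comparisons.

    Returns (by_key, by_cmp). The candidate locale names are assumptions;
    the first one the platform accepts is used for LC_COLLATE. If none is
    accepted, the current LC_COLLATE setting is left in effect.
    """
    saved = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        for name in candidates:
            try:
                locale.setlocale(locale.LC_COLLATE, name)
                break
            except locale.Error:
                continue  # locale not available here; try the next one
        # Key-based sort: each string is transformed once.
        by_key = sorted(items, key=locale.strxfrm)
        # Comparison-based sort: strcoll is called pairwise.
        by_cmp = sorted(items, key=functools.cmp_to_key(locale.strcoll))
        return by_key, by_cmp
    finally:
        # Restore whatever collation locale was active before the call.
        locale.setlocale(locale.LC_COLLATE, saved)


if __name__ == "__main__":
    by_key, by_cmp = sort_both_ways(list("aboåäöABOÅÄÖñÑ"))
    print(by_key)
    print(by_key == by_cmp)
```

On a platform whose strxfrm and strcoll implementations are consistent
with each other, the two lists should be equal; the actual ordering
depends on the locale that ends up selected.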


The patch does not include the changes needed to define HAVE_WCSXFRM,
since I don't know how to do that properly. To compile it, I edited
'configure' and 'pyconfig.h.in' by hand.

----------
files: strxfrm-unicode.diff
messages: 58592
nosy: filips
severity: normal
status: open
title: locale.strxfrm can't handle non-ascii strings
type: behavior
versions: Python 3.0
Added file: http://bugs.python.org/file8946/strxfrm-unicode.diff

__________________________________
Tracker <report at bugs.python.org>
<http://bugs.python.org/issue1618>
__________________________________
-------------- next part --------------
A non-text attachment was scrubbed...
Name: strxfrm-unicode.diff
Type: text/x-patch
Size: 1589 bytes
Desc: not available
Url : http://mail.python.org/pipermail/new-bugs-announce/attachments/20071213/a8d10ae2/attachment-0001.bin 

