Undeterministic strxfrm?

Gabriel Genellina gagsl-py2 at yahoo.com.ar
Tue Sep 4 15:13:17 EDT 2007


On Tue, 04 Sep 2007 07:34:54 -0300, Tuomas <tuomas.vesterinen at pp.inet.fi>
wrote:

> Python 2.4.3 (#3, Jun  4 2006, 09:19:30)
> [GCC 4.0.0 20050519 (Red Hat 4.0.0-8)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
>  >>> import locale
>  >>> def key(s):
> ...     locale.setlocale(locale.LC_COLLATE, 'en_US.utf8')
> ...     return locale.strxfrm(s.encode('utf8'))
> ...
>  >>> first=key(u'maupassant guy')
>  >>> first==key(u'maupassant guy')
> False
>  >>> first
> '\x18\x0c \x1b\x0c\x1e\x1e\x0c\x19\x1f\x12
> $\x01\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x01\x02\x02\x02\x02\x02\x02\x02\x02\x02\x02\x02\x02\x02\x01\xf5\xb79'
>  >>> key(u'maupassant guy')
> '\x18\x0c \x1b\x0c\x1e\x1e\x0c\x19\x1f\x12
> $\x01\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x01\x02\x02\x02\x02\x02\x02\x02\x02\x02\x02\x02\x02\x02\x01\xb5'
>  >>>
>
> Maybe this is enough for a sort order, but I need to be able to catch
> equals too. Any hints/explanations?

I can't use the same locale as you, but with my own locale settings I get
consistent results:

Python 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
py> import locale
py> locale.setlocale(locale.LC_COLLATE, 'Spanish_Argentina')
'Spanish_Argentina.1252'
py> def key(s):
...   return locale.strxfrm(s.encode('utf8'))
...
py> first=key(u'maupassant guy')
py> print repr(first)
'\x0eQ\x0e\x02\x0e\x9f\x0e~\x0e\x02\x0e\x91\x0e\x91\x0e\x02\x0ep\x0e\x99\x07\x02
\x0e%\x0e\x9f\x0e\xa7\x01\x01\x01\x01'
py> print repr(key(u'maupassant guy'))
'\x0eQ\x0e\x02\x0e\x9f\x0e~\x0e\x02\x0e\x91\x0e\x91\x0e\x02\x0ep\x0e\x99\x07\x02
\x0e%\x0e\x9f\x0e\xa7\x01\x01\x01\x01'
py> print first==key(u'maupassant guy')
True

I get the same results with Python 2.4.4.
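
If you want to repeat the check on another machine, something like the
following might help. It is only a rough sketch, written for Python 3 (where
locale.strxfrm takes a str directly, so no encode() is needed). The helper
names set_collation, keys_are_stable and collate_equal are made up for this
example, and the fallback to the "C" locale is my assumption for systems that
lack en_US.UTF-8. The locale is set once rather than inside the key function.

```python
import locale


def set_collation(preferred=("en_US.UTF-8", "en_US.utf8")):
    """Try each candidate locale in turn; fall back to the portable "C" locale.

    The candidate names are assumptions; which ones exist depends on the system.
    """
    for name in preferred:
        try:
            return locale.setlocale(locale.LC_COLLATE, name)
        except locale.Error:
            continue
    return locale.setlocale(locale.LC_COLLATE, "C")


def keys_are_stable(text, repeats=100):
    """Return True if strxfrm gives the same key on every repeated call."""
    first = locale.strxfrm(text)
    return all(locale.strxfrm(text) == first for _ in range(repeats))


def collate_equal(a, b):
    """Compare two strings for equality under the current LC_COLLATE rules."""
    return locale.strcoll(a, b) == 0


active = set_collation()
print("collation locale:", active)
print("stable keys:", keys_are_stable("maupassant guy"))
print("equal:", collate_equal("maupassant guy", "maupassant guy"))
```

If keys_are_stable ever returns False, the transformed keys cannot be relied
on for equality tests on that installation. Comparing with locale.strcoll, as
in collate_equal, sidesteps the stored keys altogether.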

-- 
Gabriel Genellina



