[Python-Dev] unicode hell/mixing str and unicode as dictionary keys

M.-A. Lemburg mal at egenix.com
Thu Aug 3 23:47:10 CEST 2006


Jim Jewett wrote:
> http://mail.python.org/pipermail/python-dev/2006-August/067934.html
> M.-A. Lemburg mal at egenix.com
> 
>> Ralf Schmitt wrote:
>>> Still trying to port our software. here's another thing I noticed:
> 
>>> d = {}
>>> d[u'm\xe1s'] = 1
>>> d['m\xe1s'] = 1
>>> print d
> 
> (a 2-element dictionary, because they are not equal)
> 
>>> With python 2.5 I get: [ a traceback ending in ]
> 
>>> UnicodeDecodeError: 'ascii' codec can't decode byte 0xe1 in position 1:
>>> ordinal not in range(128)
> 
>> Let's put it this way: Python 2.5 uncovered a bug in your
>> application that has always been there.
> 
> No; the application would only have a bug if he expected those two
> objects to compare equal.  Trying to stick something hashable into a
> dictionary should not raise an Exception just because there is already
> a similar key (regardless of whether or not the other key is equal or
> identical).

Hmm, you have a point there...

>>> d = {}

# Two different objects
>>> x = 'a'
>>> y = hash(x)
>>> x
'a'
>>> y
12416037344

# ... with the same hash value
>>> hash(x)
12416037344
>>> hash(y)
12416037344

# Put them in the dictionary, causing a hash collision ...
>>> d[x] = 1
>>> d[y] = 2

# ... which is resolved by comparing the two for equality
# and assigning them to two different slots:
>>> d
{'a': 1, 12416037344: 2}

Since Python 2.5 propagates the exception raised by that equality
comparison, you get the UnicodeDecodeError. Python 2.4 silenced the
exception and treated the two keys as unequal.
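
A minimal sketch of the same mechanism with no Unicode involved. It
assumes a hypothetical key class, CollidingKey (not from the original
report), whose instances all share one hash value and whose equality
test always raises. On Python 2.5 and later the second insertion should
propagate the comparison error rather than quietly use a second slot:

```python
class CollidingKey(object):
    """Hypothetical key: every instance hashes alike, and == always fails."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        # Same hash for every instance, so any two keys collide.
        return 42

    def __eq__(self, other):
        # Stands in for the str/unicode comparison that fails to decode.
        raise ValueError("cannot compare %r" % (self.name,))


def try_insert(d, key, value):
    """Insert key into d and report whether the comparison error escaped."""
    try:
        d[key] = value
    except ValueError as exc:
        return "raised: %s" % exc
    return "ok"


d = {}
first = try_insert(d, CollidingKey("x"), 1)   # empty dict: nothing to compare
second = try_insert(d, CollidingKey("y"), 2)  # collision forces an == call
```

The first insertion needs no comparison because the dictionary is
empty; only the second one has to resolve the collision by calling
__eq__, and that is where the error surfaces.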

> The only way this error could be the right thing is if you were trying
> to suggest that he shouldn't mix unicode and bytestrings at all.

Good question. I wonder whether that's a reasonable approach for
Python 2.x (I'd say it is for Py3k).

Currently you can't safely mix non-ASCII byte strings with Unicode
keys in the same dictionary.

Perhaps we ought to add an exception to the dict lookup mechanism
and continue to silence UnicodeErrors ?!
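
For the mixing problem itself, one possible application-side workaround
is to normalize every key to a single text type before it reaches the
dictionary. This is only a sketch under assumptions: the helper name
to_text is hypothetical, and the choice of latin-1 as the encoding of
the byte strings is a guess that the application would have to replace
with whatever encoding its data really uses.

```python
def to_text(key, encoding="latin-1"):
    """Return key as text, decoding byte strings with the given encoding.

    The encoding default is an assumption for illustration only.
    """
    if isinstance(key, bytes):
        return key.decode(encoding)
    return key


d = {}
d[to_text(u"m\xe1s")] = 1   # already text, stored unchanged
d[to_text(b"m\xe1s")] = 2   # decoded first, so it lands on the same key
```

With the keys normalized this way, both assignments refer to one
entry, and the dictionary never has to compare a non-ASCII byte string
with a Unicode key.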

-- 
Marc-Andre Lemburg
eGenix.com
