[Python-Dev] unicode hell/mixing str and unicode as dictionary keys

Ralf Schmitt ralf at brainbot.com
Thu Aug 3 17:40:57 CEST 2006


Still trying to port our software. Here's another thing I noticed:

d = {}
d[u'm\xe1s'] = 1
d['m\xe1s'] = 1
print d

With Python 2.4 I can add both keys to the dictionary and get:
$ python2.4 t2.py
{u'm\xe1s': 1, 'm\xe1s': 1}

With Python 2.5 I get:

$ python2.5 t2.py
Traceback (most recent call last):
   File "t2.py", line 3, in <module>
     d['m\xe1s'] = 1
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe1 in position 1: ordinal not in range(128)

Is this intended behaviour? I suspect it will break a lot of programs,
and the way Python 2.4 works looks right to me.
I think it should be possible to mix str and unicode keys in a dict, with
non-ASCII str objects simply comparing not-equal to any unicode string.
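
A minimal sketch of the comparison rule proposed above. It is not how either interpreter implements dict lookup; keys_equal is a hypothetical helper introduced only for illustration. The byte string is built with .encode('latin-1') so the snippet does not depend on how a given interpreter spells byte-string literals.

```python
def keys_equal(byte_key, text_key):
    """Hypothetical helper: compare a byte string with a unicode string.

    Assumption: byte strings are interpreted as ASCII, matching the
    'ascii' codec named in the traceback above.
    """
    try:
        return byte_key.decode('ascii') == text_key
    except UnicodeDecodeError:
        # Non-ASCII bytes cannot be decoded, so report "not equal"
        # instead of letting the exception escape.
        return False


byte_key = u'm\xe1s'.encode('latin-1')  # the three bytes 'm', 0xe1, 's'
text_key = u'm\xe1s'

mixed_result = keys_equal(byte_key, text_key)                # non-ASCII bytes
ascii_result = keys_equal(u'mas'.encode('ascii'), u'mas')    # pure ASCII
```

Under this rule the two keys from the example would count as distinct, while plain ASCII byte strings would still match their unicode counterparts.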

At the very least, it should be documented prominently in the "What's New" document.

- Ralf



