[Python-Dev] unicode hell/mixing str and unicode as dictionary keys

Bob Ippolito bob at redivi.com
Fri Aug 4 12:50:30 CEST 2006


On Aug 3, 2006, at 9:34 PM, Josiah Carlson wrote:

>
> Bob Ippolito <bob at redivi.com> wrote:
>> On Aug 3, 2006, at 6:51 PM, Greg Ewing wrote:
>>
>>> M.-A. Lemburg wrote:
>>>
>>>> Perhaps we ought to add an exception to the dict lookup mechanism
>>>> and continue to silence UnicodeErrors ?!
>>>
>>> Seems to me that comparison of unicode and non-unicode
>>> strings for equality shouldn't raise exceptions in the
>>> first place.
>>
>> Seems like a slightly better idea than having dictionaries suppress
>> exceptions. Still not ideal though because sticking non-ASCII strings
>> that are supposed to be text and unicode in the same data structures
>> is *probably* still an error.
>
> If/when 'python -U -c "import test.testall"' runs without unexpected
> error (I doubt it will happen prior to the "all strings are unicode"
> conversion), then I think that we can say that there aren't any
> use-cases for strings and unicode being in the same dictionary.
>
> As an alternate idea, rather than attempting to .decode('ascii') when
> strings and unicode compare, why not .decode('latin-1')?  We lose the
> unicode decoding error, but "the right thing" happens (in my opinion)
> when u'\xa1' and '\xa1' compare.

Well, in this case the behavior would change: u'\xa1' and '\xa1' would
compare equal, which is only right if the str really is Latin-1. It'd
just be an even more subtle error.
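
A rough sketch of what I mean, written for Python 3 with bytes standing in
for the 2.x str type. The latin1_equal helper is hypothetical, it only
mimics the proposed comparison rule and is not anything in the
interpreter. The ISO-8859-5 case is just one assumed example of a byte
string that was never Latin-1 to begin with:

```python
# bytes plays the role of the Python 2 str type in this sketch.
raw = b'\xa1'    # a single byte; nothing records which encoding produced it
text = '\xa1'    # U+00A1 INVERTED EXCLAMATION MARK


def latin1_equal(byte_string, unicode_string):
    """Hypothetical rule: decode the bytes as Latin-1, then compare."""
    return byte_string.decode('latin-1') == unicode_string


# Under the proposed rule the two objects compare equal, no exception.
print(latin1_equal(raw, text))

# If the byte was really ISO-8859-5 text, it meant U+0401, not U+00A1,
# so the "equal" answer above would be silently wrong.
print(raw.decode('iso-8859-5') == text)

# The UTF-8 encoding of the very same character does not compare equal.
print(latin1_equal(text.encode('utf-8'), text))

# The ASCII rule, by contrast, fails loudly on the same input.
try:
    raw.decode('ascii')
except UnicodeDecodeError as exc:
    print('ascii decode failed:', exc.reason)
```

With the ASCII rule you at least find out that something is wrong. With
the Latin-1 rule a dictionary lookup quietly hits or misses depending on
an encoding nobody declared.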

-bob


