[Python-Dev] decoding errors when comparing strings
Tim Peters
tim_one@email.msn.com
Wed, 26 Jul 2000 04:09:27 -0400
[Guido]
> ...
> I see the exception as a useful warning that the program isn't
> sufficiently Unicode aware to work correctly. That's a *good* thing
> in my book -- I'd rather raise an exception than silently fail.
[Fredrik Lundh]
> I assume that means you're voting for alternative 3:
>
> "a third alternative would be to keep the exception, and make
> the dictionary code exception proof."
>
> because the following isn't exactly good behaviour:
>
> >>> a = "ä"
> >>> b = unicode(a, "iso-8859-1")
> >>> d = {}
> >>> d[a] = "a"
> >>> d[b] = "b"
> >>> len(d)
> UnicodeError: ASCII decoding error: ordinal not in range(128)
> >>> len(d)
> 2
>
> (in other words, the dictionary implementation misbehaves if items
> with the same hash value cannot be successfully compared)
Hmm. That's a bug in the dict implementation that's independent of Unicode
issues, then -- and I can provoke similar behavior with classes that raise
exceptions from __cmp__, without using Unicode anywhere. So, ya, the dict
bugs have to be fixed. Nobody needs to vote on *that* part <wink>. I'll
look into it "soon", unless somebody else does first.
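
For illustration, here is a minimal sketch of the non-Unicode case. It is written for a current Python 3 interpreter, where __eq__ takes over the role that __cmp__ played in 2000, so it shows what a dict that propagates comparison errors looks like, not the 2000-era misbehaviour quoted above. The class name Touchy and the helper try_insert are made up for this example.

```python
class Touchy:
    """Hypothetical key type: every instance collides, and comparing fails."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        # Same hash for every instance, so the dict has to compare keys.
        return 1

    def __eq__(self, other):
        # Stand-in for a __cmp__ that raises, or for a failed ASCII decode.
        raise ValueError("cannot compare %s" % self.name)


def try_insert(d, key, value):
    """Return the exception raised by d[key] = value, or None if it succeeded."""
    try:
        d[key] = value
    except ValueError as exc:
        return exc
    return None


d = {}
first = try_insert(d, Touchy("a"), "a")   # empty table, nothing to compare against
second = try_insert(d, Touchy("b"), "b")  # equal hashes force a key comparison
```

On an interpreter where the dict code is exception proof, the error should surface at the second assignment itself, not at some later unrelated call such as len(d), and the failed insert should leave the dict with the single entry it already had.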