[Python-Dev] decoding errors when comparing strings

Guido van Rossum guido@beopen.com
Tue, 25 Jul 2000 22:11:05 -0500


> (revisiting an old thread on mixed string comparisons)

I think it's PEP time for this one...

> summary: the current interpreter throws an "ASCII decoding
> error" exception if you compare 8-bit and unicode strings, and
> the 8-bit string happens to contain a character in the 128-255
> range.

Doesn't bother me at all.  If I write a user-defined class that raises
an exception in __cmp__ you can get the same behavior.  The fact that
the hashes were the same is a red herring; there are plenty of values
with the same hash that aren't equal.
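
A minimal sketch of that point, written for a current Python 3, where
__cmp__ no longer exists and __eq__ plays its role. The class name
Grumpy and the helper compare_raises are hypothetical and exist only
for illustration:

```python
class Grumpy:
    """Hypothetical class whose equality check always raises."""

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        # Deliberately hash like the wrapped value, so a Grumpy and the
        # plain value share a hash without being comparable.
        return hash(self.value)

    def __eq__(self, other):
        raise ValueError("cannot compare Grumpy with %r" % (other,))


def compare_raises(a, b):
    """Return True if evaluating a == b raises ValueError."""
    try:
        a == b
    except ValueError:
        return True
    return False
```

Here hash(Grumpy("abc")) equals hash("abc"), yet comparing the two
raises instead of returning a result, so an equal hash says nothing
about whether a comparison will succeed.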

I see the exception as a useful warning that the program isn't
sufficiently Unicode aware to work correctly.  That's a *good* thing
in my book -- I'd rather raise an exception than silently fail.
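
As a rough illustration of the failure being discussed: the interpreter
in question decodes the 8-bit string implicitly, while a current
Python 3 simply treats bytes and str as unequal. The sketch below makes
the ASCII decoding explicit so that it runs today; compare_mixed is a
hypothetical helper, not an interpreter API:

```python
def compare_mixed(byte_string, text):
    """Compare an 8-bit string with a Unicode string.

    Assumption: the 8-bit string is decoded as ASCII before comparing,
    mirroring the implicit conversion described in the summary above.
    Raises UnicodeDecodeError if byte_string contains a byte >= 128.
    """
    return byte_string.decode("ascii") == text
```

Calling compare_mixed(b"abc", "abc") returns True, while an argument
such as b"\xe4bc" raises UnicodeDecodeError rather than quietly
reporting a mismatch.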

Note that it can't break old code unless you try to do new things with
the old code: the old code couldn't have supported Unicode because
Unicode didn't exist in Python 1.5.2.

--Guido van Rossum (home page: http://www.pythonlabs.com/~guido/)