[Python-Dev] Dicts are broken Was: unicode hell/mixing str and unicode as dictionary keys

"Martin v. Löwis" martin at v.loewis.de
Tue Aug 8 21:00:26 CEST 2006


M.-A. Lemburg wrote:
> If the programmer writes:
> 
> x = 'äöü'
> y = u'äöü'
> ...
> if x == y:
>     do_something()
> 
> then he clearly has had the intention to compare two character
> strings.

Programmers make all kinds of mistakes when comparing objects,
assuming that things ought to be equal that actually aren't:

py> 1.6/math.pi*math.pi == 1.6
False
py> if 10*10 is 100:
...   print "yes"
... else:
...   print "no"
...
no
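
A small sketch of the same two pitfalls in a form that runs on
current Python (the session above is Python 2). The sample values,
the variable names and the use of math.isclose are assumptions made
for illustration; none of them come from the session above.

```python
import math

# Floating-point arithmetic rounds at each step, so an identity that
# holds algebraically may fail under ==.
total = 0.1 + 0.2
exact_equal = (total == 0.3)               # False on IEEE 754 doubles
close_enough = math.isclose(total, 0.3)    # tolerance-based comparison

# "is" tests object identity, not value equality. Two equal values
# need not be the same object.
a = [1, 0, 0]
b = [1, 0, 0]
same_value = (a == b)     # True: same contents
same_object = (a is b)    # False: two distinct list objects
```

Lists are used for the identity check because the result of an
expression such as "10*10 is 100" depends on how a given interpreter
caches small integers.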

> Now, if what you were saying were true, then the above would
> simply continue to work without raising an exception, possibly
> causing the application to return wrong results.

That is correct. It is a programming mistake, hence you get a wrong
result. However, you cannot assume that every comparison between
a string and a Unicode object is always a programming mistake.
You must not raise exceptions just because of a *potential*
programming mistake; that's what PyChecker is there for.

> Note that we are not discussing changing the behavior of the
> __eq__ comparison between strings and Unicode, since this has
> always been to raise exceptions in case the automatic propagation
> fails.

Not sure what you are discussing: This is *precisely* what I'm
discussing. Making that change would solve this problem.

> The discussion is about silencing exceptions in the dict lookup
> mechanism - something which used to happen and now no longer
> is done.

No, that's not what the discussion is about. The discussion
is about the backwards incompatibility in Python 2.5 wrt.
Python 2.4. There are several ways to solve that; silencing
the exception is just one way.

I think it is the wrong way, as I think that
string-unicode-comparison should have a consistent behaviour
no matter where the comparison occurs.

> Since this behavior is an implementation detail of the
> dictionary implementation, users perceive this change as random
> exceptions occurring in their application.

That key comparison occurs is *not* an implementation detail.
It is a necessary and documented aspect of the dictionary
lookup.
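
A minimal sketch of why the comparison is part of the lookup, written
for current Python. The classes Key and Touchy are hypothetical and
exist only for this illustration. When the hash of the looked-up key
matches a stored key that is a different object, the dictionary has
to call __eq__ to decide whether the two are the same key, and an
exception raised there reaches the caller.

```python
class Key:
    """Hypothetical key type that counts calls to __eq__."""
    calls = 0

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        # Equal names hash equally, as dictionary keys require.
        return hash(self.name)

    def __eq__(self, other):
        Key.calls += 1
        return isinstance(other, Key) and self.name == other.name


class Touchy:
    """Hypothetical key type whose comparison always raises."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return hash(self.name)

    def __eq__(self, other):
        raise ValueError("cannot compare")


table = {Key("a"): 1}
# A distinct but equal object: the hashes match, so __eq__ is called.
value = table[Key("a")]
eq_calls = Key.calls

touchy_table = {Touchy("a"): 1}
try:
    touchy_table[Touchy("a")]
    propagated = False
except ValueError:
    # The exception from __eq__ is not swallowed by the lookup.
    propagated = True
```

Here value is 1, eq_calls is at least 1, and propagated is True.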

> I've suggested to go about this in a slightly more user-friendly
> way, namely by giving a warning instead of raising an exception
> in Python 2.5 and then going for the exception in Python 2.6.

Yes, and I have suggested making it even more user-friendly
by defining string-unicode-__eq__ in a sensible manner. It
is more user-friendly, because it doesn't show the inconsistency
Michael Hudson documented in

http://mail.python.org/pipermail/python-dev/2006-August/067981.html
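
One possible reading of "sensible", sketched for current Python: if
the byte string cannot be decoded, the two objects simply compare
unequal instead of raising. That reading, the helper name mixed_eq
and the ASCII default are assumptions made for illustration. Here
bytes plays the role of the Python 2 str type and str the role of
unicode.

```python
def mixed_eq(byte_string, text, encoding="ascii"):
    """Hypothetical helper: compare a byte string with a text string.

    Bytes that cannot be decoded make the two unequal rather than
    raising UnicodeDecodeError.
    """
    try:
        return byte_string.decode(encoding) == text
    except UnicodeDecodeError:
        return False


x = b"\xe4\xf6\xfc"        # Latin-1 bytes for the three umlauts
y = "\xe4\xf6\xfc"         # the same three characters as text

undecodable = mixed_eq(x, y)           # False: not valid ASCII
plain_ascii = mixed_eq(b"abc", "abc")  # True: decodes and matches
```

With such a rule the result is the same whether the comparison
happens in an if statement or inside a dictionary lookup.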

Regards,
Martin


