[Python-Dev] unicode hell/mixing str and unicode as dictionary keys

"Martin v. Löwis" martin at v.loewis.de
Tue Aug 8 02:29:48 CEST 2006


Armin Rigo wrote:
> I also seem to remember that TypeErrors should only signal ordering
> nonsense, not equality.  In this case, I'm of the opinion that unicode
> objects and completely-unrelated strings of random bytes should
> successfully compare as unequal, but I'm not enough of a unicode user to
> be sure.

I believe this was the original intent for raising TypeErrors here in
the first place: string-vs-unicode comparison predates rich comparisons,
and there is no way to implement __cmp__ meaningfully if the strings
don't convert successfully under the system encoding (if they are
unequal, you wouldn't be able to tell which one is smaller).

With rich comparisons available, I see no reason to keep raising that
exception.
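
To illustrate the difference between the two protocols, here is a rough
sketch. It is written in Python 3 syntax and only models the situation;
the names _decode_ascii, three_way_compare and equal are hypothetical and
do not correspond to anything in the interpreter source. A __cmp__-style
function has to answer "smaller, equal, or larger", so it can only raise
when the bytes do not decode, whereas an equality-only function can
simply answer "not equal".

```python
# Hypothetical sketch; none of these names exist in CPython itself.

def _decode_ascii(raw):
    """Return the text for ASCII-only bytes, or None if decoding fails."""
    try:
        return raw.decode("ascii")
    except UnicodeDecodeError:
        return None


def three_way_compare(text, raw):
    """__cmp__-style comparison: the result must be -1, 0 or 1.

    There is no return value meaning "unequal but unordered", so the
    only option for undecodable bytes is to raise.
    """
    decoded = _decode_ascii(raw)
    if decoded is None:
        raise TypeError("cannot order text against undecodable bytes")
    return (text > decoded) - (text < decoded)


def equal(text, raw):
    """Rich-comparison-style equality: "unequal" is always a valid answer."""
    decoded = _decode_ascii(raw)
    return decoded is not None and text == decoded
```

With these definitions, equal("abc", b"\xff\xfe") returns False, while
three_way_compare("abc", b"\xff\xfe") raises TypeError.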

As for unicode users: As others have said, they should avoid mixing
unicode and ascii strings. We provide a fallback for a limited case
(ascii); beyond that, Python assumes that non-ascii strings represent
uninterpreted bytes, not characters.
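
One way to follow that advice for dictionary keys is to decode byte
strings explicitly at the boundary, so that only one string type is ever
used as a key. The following is a minimal sketch in Python 3 syntax; the
helper to_text is hypothetical, and the choice of UTF-8 is an assumption
that the application knows what encoding its byte strings are in.

```python
# Hypothetical helper; the default encoding is an assumption about the data.

def to_text(key, encoding="utf-8"):
    """Decode byte-string keys explicitly; pass text keys through as-is."""
    if isinstance(key, bytes):
        return key.decode(encoding)
    return key


# Count keys that arrive as a mix of text and encoded bytes.
counts = {}
for key in ["caf\u00e9", "caf\u00e9".encode("utf-8"), b"abc", "abc"]:
    normalized = to_text(key)
    counts[normalized] = counts.get(normalized, 0) + 1
```

Because every key is converted before it reaches the dictionary, the
lookup never has to compare a byte string against a text string.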

Regards,
Martin

