[Python-Dev] Re: Unicode and comparisons
Tue, 4 Apr 2000 23:14:59 +0200 (MEST)
Guido van Rossum:
> > I always thought it is a core property of cmp that it works between
> > all objects.
> Not any more. Comparisons can raise exceptions -- this has been so
> since release 1.5. This is rarely used between standard objects, but
> not unheard of; and class instances can certainly do anything they
> want in their __cmp__.
Python 1.6a1 (#6, Apr 2 2000, 02:32:06) [GCC egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)] on linux2
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
>>> a = '1'
>>> b = 2
>>> a < b
>>> a > b # Newbies are normally baffled here
>>> a = 'ä'
>>> b = u'ä'
>>> a < b
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeError: UTF-8 decoding error: unexpected end of data
IMO we will have a *very* hard time explaining *this* behaviour.
Unicode objects are similar to normal string objects from the user's POV.
It is unintuitive that objects that are far less similar (for example,
numbers and strings) compare the way they do now, while an attempt to
compare a Unicode string with a standard string object containing the
same character raises an exception.
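
For readers who want to reproduce the decoding failure, here is a rough
sketch in present-day Python 3 rather than the 1.6a1 interpreter shown
above. It assumes the 8-bit string held 'ä' as the single Latin-1 byte
0xE4, which the session does not state; the names raw, text, error and
same are made up for illustration.

```python
# Python 3 sketch of the failure above; not the 1.6a1 code path itself.
# Assumption: the terminal sent 'ä' as the single Latin-1 byte 0xE4.
raw = b"\xe4"       # stands in for the 8-bit string 'ä'
text = "\u00e4"     # stands in for the Unicode string u'ä'

error = None
try:
    # 0xE4 is the lead byte of a three-byte UTF-8 sequence, so decoding
    # a lone 0xE4 as UTF-8 runs out of input.
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    error = exc

# Decoding with the encoding the bytes were actually written in makes
# the comparison well defined.
same = raw.decode("latin-1") == text

print(error)
print(same)
```

With an explicit decode step the two values compare equal. The exception
in the session seems to come from the implicit UTF-8 conversion, not
from the comparison as such.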
Mit freundlichen Grüßen (Regards), Peter
(BTW: using a 12-year-old US keyboard and a custom xmodmap all the time
to write umlauts and lots of other interesting chars: ÷× ± ²³ ½¼ ° µ «» ¿? ¡! ;-)