[Python-Dev] Re: Unicode and comparisons

Martin v. Loewis martin@loewis.home.cs.tu-berlin.de
Tue, 4 Apr 2000 17:44:17 +0200


> Question: is this behaviour acceptable or should I go even further
> and mask decoding errors during compares and contains tests too ?

I always thought it is a core property of cmp that it works between
all objects. Because of that,

>>> x=[u'1','aäöü']     
>>> x.sort()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeError: UTF-8 decoding error: invalid data

fails. As always in cmp, I'd expect to get a consistent outcome here
(i.e. cmp should give a total order on objects).
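
To illustrate what a consistent outcome could look like, here is a minimal
sketch. It assumes a modern Python 3 interpreter, where the plain string is
modelled as a bytes object and the Unicode string as str; it is not the
behaviour of the interpreter discussed in this thread. The helper name
total_order_key is hypothetical. It orders objects by type name first and only
then by value, so no decoding is ever attempted and the sort cannot raise.

```python
def total_order_key(obj):
    # Hypothetical helper: group by type name, then compare within the type.
    # Objects of different types are never compared by value, so no
    # UTF-8 decoding (and hence no decoding error) can occur.
    return (type(obj).__name__, obj)


# Latin-1 encoded bytes for 'aäöü'; this is not valid UTF-8.
plain = b"a\xe4\xf6\xfc"
items = ["1", plain]

# Decoding the plain string as UTF-8 is what fails in the example above.
try:
    plain.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

# Sorting with the type-aware key gives a well-defined result instead.
items.sort(key=total_order_key)
```

The price of this approach is that a plain string and a Unicode string never
compare equal, even when they hold the same ASCII text.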

OTOH, I'm not so sure why cmp between plain and unicode strings needs
to perform UTF-8 conversion? IOW, why is it desirable that

>>> 'a' == u'a'
1

Anyway, I'm not objecting to that outcome - I only think that, to get
cmp consistent, it may be necessary to drop this result. If it is not
necessary, all the better.
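
The tension can be made concrete with a small sketch, again assuming Python 3
(bytes standing in for plain strings) rather than the interpreter of this
thread. The function masked_cmp and its fallback rule are hypothetical: it
decodes plain strings as UTF-8 where possible, so that b'a' equals 'a', and
masks decoding errors by falling back to ordering by type name. Under these
assumptions the three values below form a cycle, so the result is not a total
order.

```python
def _try_decode(x):
    # Decode bytes as UTF-8 if possible; otherwise return them unchanged.
    if isinstance(x, bytes):
        try:
            return x.decode("utf-8")
        except UnicodeDecodeError:
            return x
    return x


def masked_cmp(a, b):
    # Hypothetical cmp that masks decoding errors (an assumption made for
    # illustration, not a rule proposed in the thread).
    if type(a) is type(b):
        return (a > b) - (a < b)
    da, db = _try_decode(a), _try_decode(b)
    if type(da) is type(db):
        # Both sides are now text: compare by value, so b'a' == 'a'.
        return (da > db) - (da < db)
    # Undecodable bytes versus text: fall back to ordering by type name.
    ta, tb = type(da).__name__, type(db).__name__
    return (ta > tb) - (ta < tb)


z, bad, u = b"z", b"\xe4", "a"

# z < bad (byte-wise), bad < u (type-name fallback), u < z (after decoding).
cycle = (masked_cmp(z, bad), masked_cmp(bad, u), masked_cmp(u, z))
```

A different fallback rule might avoid this particular cycle; the sketch only
shows that masking errors does not by itself make cmp consistent.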

Regards,
Martin