New subject: Unicode and comparisons

April 4, 2000

      ...
Question: is this behaviour acceptable, or should I go even further
and mask decoding errors during compares and contains tests too?
I always thought it is a core property of cmp that it works between
all objects. Because of that,
...
...
...
x=[u'1','aäöü']
x.sort()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeError: UTF-8 decoding error: invalid data
fails. As always in cmp, I'd expect to get a consistent outcome here
(i.e. cmp should give a total order on objects).
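
A minimal sketch of the total-order idea in present-day Python 3, where
bytes and str stand in for plain and Unicode strings. The helper name
total_order_key is hypothetical, and ordering by type name first is only
one possible choice, not something proposed in this thread:

```python
def total_order_key(obj):
    """Hypothetical sort key: order by type name first, then by value.

    Values of different types are never compared with each other, so
    no decoding is attempted and mixed bytes/str input cannot raise.
    """
    return (type(obj).__name__, obj)


# A list mixing text strings and byte strings, including non-ASCII bytes.
mixed = ["1", "aäöü".encode("latin-1"), "b", b"a"]

# All bytes objects sort before all str objects ("bytes" < "str");
# within each type the usual ordering applies.
ordered = sorted(mixed, key=total_order_key)
print(ordered)
```

With such a key the sort always completes, at the price that 'a' and
u'a' would no longer be treated as the same value.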

OTOH, I'm not so sure why cmp between plain and Unicode strings needs
to perform UTF-8 conversion. IOW, why is it desirable that
...
...
...
'a' == u'a'
1
Anyway, I'm not objecting to that outcome - I only think that, to get
cmp consistent, it may be necessary to drop this result. If it is not
necessary, so much the better.

Regards,
Martin
