[Python-Dev] UTF-16 code point comparison
Guido van Rossum
guido@beopen.com
Fri, 28 Jul 2000 07:26:07 -0500
> Predicting the future can be difficult, but here is my take:
> javasoft will never change the way String.compareTo works.
> String.compareTo is documented as:
> """
> Compares two strings lexicographically. The comparison is based on
> the Unicode value of each character in the strings. ...
> """
(Note that their definition of "character" is probably "a 16-bit
value of type char", which bears only a fleeting resemblance to what
the Unicode standard does or does not define as a character.)
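To make the distinction concrete, here is a minimal sketch, assuming a
present-day Python 3 interpreter whose str type compares by code point
(not the Python of this thread). The helper utf16_units and the two sample
characters are illustrative choices, not anything from the Java or Python
documentation. It shows that ordering by 16-bit unit and ordering by code
point can disagree once a character needs a surrogate pair.

```python
def utf16_units(s):
    """Return the 16-bit UTF-16 code units of s as a list of ints."""
    data = s.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]

bmp = "\uff5e"         # one code point, stored as one 16-bit unit
astral = "\U00010000"  # one code point, stored as two units (a surrogate pair)

units_bmp = utf16_units(bmp)
units_astral = utf16_units(astral)

# Code point order: U+FF5E sorts before U+10000.
by_code_point = bmp < astral

# 16-bit unit order: 0xFF5E sorts after the lead surrogate 0xD800.
by_code_unit = units_bmp < units_astral
```

Under these assumptions the two orderings give opposite answers for the
same pair of strings, which is the gap between "16-bit value of type char"
and "character".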
> Instead they will mark it as a very naive string comparison and suggest
> that users use the Collator classes for anything but the simplest cases.
Without having digested the entire discussion, this sounds like a good
solution for Python too. The "==" operator should compare strings
based on a simple-minded representation-oriented definition, and all
the other stuff gets pushed into separate methods or classes.
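As a rough illustration of that split, the sketch below assumes a
present-day Python 3 interpreter and the standard unicodedata module. The
helper name canonically_equal is hypothetical, and NFC normalization is
only one example of the "other stuff" that could live outside "==".

```python
import unicodedata

composed = "\u00e9"     # LATIN SMALL LETTER E WITH ACUTE as one code point
decomposed = "e\u0301"  # "e" followed by COMBINING ACUTE ACCENT

def canonically_equal(a, b):
    """Compare two strings after NFC normalization (hypothetical helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# "==" looks only at the stored representation, so these differ.
plain_equal = composed == decomposed

# The separate helper treats the two spellings as the same text.
canonical = canonically_equal(composed, decomposed)
```

Here "==" stays simple-minded and cheap, and anyone who wants a
linguistically smarter answer has to ask for it explicitly.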
--Guido van Rossum (home page: http://www.pythonlabs.com/~guido/)