[Python-Dev] UTF-16 code point comparison

Guido van Rossum guido@beopen.com
Fri, 28 Jul 2000 07:26:07 -0500


> Predicting the future can be difficult, but here is my take:
> javasoft will never change the way String.compareTo works.  
> String.compareTo is documented as:
> """
>   Compares two strings lexicographically. The comparison is based on 
>   the Unicode value of each character in the strings. ...
> """

(Noting that their definition of "character" is probably "a 16-bit
value of type char", and bears only a fleeting resemblance to what the
Unicode standard does or does not define as a character.)
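
A rough sketch of the gap between the two notions, assuming a
present-day Python 3 where str compares by code point (not the Python
under discussion here). The helper utf16_units is hypothetical,
introduced only to mimic a comparison over 16-bit char values:

```python
import struct


def utf16_units(s):
    """Return the UTF-16 code units of s as a tuple of ints (hypothetical helper)."""
    data = s.encode("utf-16-be")
    return struct.unpack(">%dH" % (len(data) // 2), data)


bmp = "\uff5e"          # U+FF5E, a single 16-bit code unit
astral = "\U00010000"   # U+10000, stored in UTF-16 as the pair D800 DC00

# Code point order: U+FF5E sorts before U+10000.
by_code_point = bmp < astral

# 16-bit unit order: 0xFF5E is compared against the lead surrogate 0xD800.
by_code_unit = utf16_units(bmp) < utf16_units(astral)
```

The two orderings disagree because the supplementary character is
stored as a surrogate pair whose first unit, 0xD800, is smaller than
0xFF5E.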

> Instead they will mark it as a very naive string comparison and suggest
> that users use the Collator classes for anything but the simplest cases.

Without having digested the entire discussion, this sounds like a good
solution for Python too.  The "==" operator should compare strings
based on a simple-minded representation-oriented definition, and all
the other stuff gets pushed into separate methods or classes.
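
As an illustration of that split, again assuming present-day Python 3:
"==" looks only at the stored code points, while canonical equivalence
is left to a separate function, here unicodedata.normalize from the
standard library. The variable names are made up for the example.

```python
import unicodedata

composed = "\u00e9"      # e-acute as a single code point
decomposed = "e\u0301"   # "e" followed by a combining acute accent

# Representation-oriented comparison: the code point sequences differ.
plain_equal = composed == decomposed

# Anything smarter is opt-in: normalize both sides first, then compare.
canonical_equal = (
    unicodedata.normalize("NFC", composed)
    == unicodedata.normalize("NFC", decomposed)
)
```

Here plain_equal is False and canonical_equal is True, so "==" stays
cheap and predictable while callers who need the richer notion of
equality ask for it explicitly.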

--Guido van Rossum (home page: http://www.pythonlabs.com/~guido/)