[Python-Dev] UTF-16 code point comparison
Mon, 31 Jul 2000 11:16:56 +0200
Guido van Rossum wrote:
> > Predicting the future can be difficult, but here is my take:
> > javasoft will never change the way String.compareTo works.
> > String.compareTo is documented as:
> > """
> > Compares two strings lexicographically. The comparison is based on
> > the Unicode value of each character in the strings. ...
> > """
> (Noting that their definition of "character" is probably "a 16-bit
> value of type char", and has only fleeting resemblance to what is or
> is not defined as a character by the Unicode standard.)
> > Instead they will mark it as a very naive string comparison and suggest
> > that users use the Collator classes for anything but the simplest cases.
> Without having digested the entire discussion, this sounds like a good
> solution for Python too. The "==" operator should compare strings
> based on a simple-minded representation-oriented definition, and all
> the other stuff gets pushed into separate methods or classes.
This would probably be the best way to go: we'll need
collation routines sooner or later anyway. Bill's "true UCS-4"
compare could then become part of that lib.
Should I #if 0 the current implementation of the UCS-4 compare
in CVS?
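
For reference, here is a minimal sketch of where the two orderings
disagree. It is only an illustration, not the code in CVS: the helper
names (utf16_units, compare_code_units, compare_code_points) are made up
for this example, and it assumes an interpreter whose strings can hold
non-BMP characters directly. The 16-bit view is obtained by encoding to
UTF-16.

```python
def utf16_units(s):
    """Return the list of 16-bit UTF-16 code units for the string s."""
    data = s.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def compare_code_units(a, b):
    """Naive compare: order by raw 16-bit code units (hypothetical helper)."""
    ua, ub = utf16_units(a), utf16_units(b)
    return (ua > ub) - (ua < ub)


def compare_code_points(a, b):
    """Code point compare: order by full Unicode code points (hypothetical helper)."""
    pa, pb = [ord(c) for c in a], [ord(c) for c in b]
    return (pa > pb) - (pa < pb)


# U+FF61 lies in the BMP above the surrogate range; U+10000 is stored
# in UTF-16 as the surrogate pair D800 DC00.
bmp_char = "\uff61"
astral_char = "\U00010000"

units_result = compare_code_units(bmp_char, astral_char)
points_result = compare_code_points(bmp_char, astral_char)
print(units_result, points_result)
```

The two results differ only when one string has a surrogate where the
other has a BMP character in the range U+E000..U+FFFF; for everything
else the naive 16-bit compare and the code point compare agree.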
Python Pages: http://www.lemburg.com/python/