[Python-Dev] Re: [I18n-sig] Unicode strings: an alternative
Tom Emerson
tree@basistech.com
Fri, 5 May 2000 07:46:41 -0400 (EDT)
Just van Rossum writes:
> At 10:07 AM +0100 05-05-2000, Toby Dickenson wrote:
> >One other pleasant consequence:
> >
> >- String comparisons work character by character, even if the
> > representations of those characters have different widths.
>
> Exactly. By saying "(wide) strings are not tied to Unicode" the question
> whether wide strings should or should not be sorted according to the
> Unicode spec is answered by a simple "no", instead of "hmm, maybe, but it's
> too hard anyway"...
Wait a second.
There is nothing about Unicode that would prevent you from defining
string equality as byte-level equality.
Declaring wide strings "not tied to Unicode" strikes me as the wrong
way to deal with the complex collation issues of Unicode.
It seems to me that wide strings should, by default, compare at the
byte level (i.e., '=' is a byte-level comparison). If you want a
normalized comparison, then you make an explicit function call for that.
This is no different from comparing strings in a case-sensitive
vs. case-insensitive manner.
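To make the distinction concrete, here is a minimal sketch in present-day Python. It assumes the standard `unicodedata` module and uses a hypothetical helper, `normalized_equal`, which is not part of any proposal in this thread. Note also that `==` on Python strings compares code points rather than raw bytes, so this only approximates the byte-level default described above.

```python
import unicodedata


def normalized_equal(a, b, form="NFC"):
    """Hypothetical helper: compare two strings after Unicode normalization."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# Two encodings of the same visible character, e with an acute accent.
precomposed = "\u00e9"   # single code point
decomposed = "e\u0301"   # 'e' followed by a combining acute accent

# Default equality is a plain element-by-element comparison.
print(precomposed == decomposed)                  # False

# Normalized comparison is an explicit, opt-in call.
print(normalized_equal(precomposed, decomposed))  # True

# The case-sensitivity analogy: the default is exact, folding is explicit.
print("Python" == "PYTHON")                        # False
print("Python".casefold() == "PYTHON".casefold())  # True
```

The default operator stays cheap and predictable, and anyone who needs normalization (or case folding) asks for it by name.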
-tree
--
Tom Emerson Basis Technology Corp.
Language Hacker http://www.basistech.com
"Beware the lollipop of mediocrity: lick it once and you suck forever"