[Python-Dev] Re: [I18n-sig] Unicode strings: an alternative
Tom Emerson
tree@basistech.com
Fri, 5 May 2000 08:34:35 -0400 (EDT)
Just van Rossum writes:
> Good point. All this taken together still means to me that comparisons
> between wide and narrow strings should take place at the character level,
> which implies that coercion from narrow to wide is done at the character
> level, without looking at the encoding. (Which in my book in turn still
> implies that as long as we're talking about Unicode, narrow strings are
> effectively Latin-1.)
Only true if "wide" strings are encoded in UCS-2 or UCS-4. If "wide
characters" are Unicode, but stored in UTF-8 encoding, then you lose.
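
A minimal sketch of that distinction, assuming modern Python 3 bytes objects as stand-ins for the raw storage (the names narrow, wide_ucs2, wide_utf8 and widened are illustrative only, not anything from the proposal):

```python
# The same four-character text in three storage forms.
text = "caf\u00e9"

narrow = text.encode("latin-1")       # one byte per character
wide_ucs2 = text.encode("utf-16-be")  # two bytes per character (BMP only, so same as UCS-2)
wide_utf8 = text.encode("utf-8")      # variable-length encoding

# Character-level coercion: zero-extend each narrow byte to a 16-bit unit.
widened = b"".join(bytes([0, b]) for b in narrow)

# Zero-extension reproduces the UCS-2 storage exactly ...
assert widened == wide_ucs2

# ... but the UTF-8 storage differs from the narrow bytes, because
# U+00E9 is stored as the two bytes 0xC3 0xA9.
assert narrow != wide_utf8
```

Zero-extension works only because the first 256 Unicode code points coincide with Latin-1; that shortcut is not available when the wide storage is UTF-8.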
Hmmmm... how often do you expect to compare narrow vs. wide strings,
using default comparison (i.e. == or !=)? What if I'm using Latin-3 and
use the byte comparison? I may very well have two strings (one narrow,
one wide) that compare equal, even though they're not. Not exactly
what I would expect.
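
A small sketch of the Latin-3 case, assuming Python 3 and its standard "latin-1" and "iso8859_3" codecs (the variable names are made up for illustration). Byte 0xA1 means U+0126 (H with stroke) in Latin-3, but widening it as if it were Latin-1 gives U+00A1 (inverted exclamation mark):

```python
# A narrow string the author wrote in Latin-3: the single byte 0xA1.
narrow = b"\xa1"

# A wide string holding U+00A1, INVERTED EXCLAMATION MARK.
wide = "\u00a1"

# Widening the narrow string as if it were Latin-1 maps 0xA1 to U+00A1,
# so the two strings compare equal.
assumed_latin1 = narrow.decode("latin-1")
assert assumed_latin1 == wide

# Decoding with the encoding the author actually meant gives U+0126,
# LATIN CAPITAL LETTER H WITH STROKE, which is a different character.
intended = narrow.decode("iso8859_3")
assert intended == "\u0126"
assert intended != wide
```

The first comparison succeeds even though the two strings were never meant to hold the same character, which is the surprise described above.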
-tree
[I'm flying from Seattle to Boston today, so eventually I will
disappear for a while]
--
Tom Emerson Basis Technology Corp.
Language Hacker http://www.basistech.com
"Beware the lollipop of mediocrity: lick it once and you suck forever"