[XML-SIG] Re: Issues with Unicode type
Tue, 24 Sep 2002 16:34:16 +0200
> I've just added a note to the docs for Python 2.2.2 and 2.3 that len()
> returns the number of storage units, not abstract characters.
imo (as the original author of the unicode type), that's an
artifact, not a feature.
> I don't expect that to change given that it's been doing it that way
> since the Unicode type was introduced.
the original Unicode type used UCS-2 for internal storage, and all
operations worked on code points.
adding UTF-16 support in a couple of places doesn't really change that;
a UTF-16-encoded unicode string should be treated just like an encoded
8-bit string -- standard string operations are not guaranteed to work on
(if we document all bugs and half-baked solutions as supported features,
we will never be able to fix anything...)
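
The difference between storage units and code points discussed above can be sketched roughly as follows. This assumes a current Python 3 interpreter, where len() on a str counts code points, so the UTF-16 storage-unit count is reproduced by encoding explicitly; it is not the Python 2.2 behaviour the thread is about. The helper names are hypothetical and only for illustration.

```python
# Illustrative sketch only: assumes Python 3.3 or later, where str is
# indexed by code point. The helper names are hypothetical.

def count_code_points(text):
    """Number of Unicode code points in text."""
    return len(text)


def count_utf16_code_units(text):
    """Number of 16-bit storage units needed to hold text as UTF-16."""
    # "utf-16-le" emits no byte order mark, and every code unit is 2 bytes.
    return len(text.encode("utf-16-le")) // 2


# 'a' followed by U+10400, a character outside the Basic Multilingual Plane.
# UTF-16 stores U+10400 as a surrogate pair, i.e. two code units.
sample = "a\U00010400"

print(count_code_points(sample))       # 2
print(count_utf16_code_units(sample))  # 3
```

For a string made only of BMP characters the two counts agree, which is why the distinction went unnoticed while storage was plain UCS-2.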