[Python-3000] String comparison
Rauli Ruohonen
rauli.ruohonen at gmail.com
Thu Jun 14 15:51:09 CEST 2007
On 6/13/07, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> except that people will sneak in some UTF-16 behavior where it seems useful.
How about sneaking these into py3k-struni:
- chr(i) returns a len-1 or len-2 string for all i in range(0, 0x110000), and
  ord(chr(i)) == i for all i in range(0, 0x110000)
- unicodedata.name(chr(i)) returns the same result for all i on both UCS-2
  and UCS-4 builds (and the same for bidirectional(), category(), combining(),
  decimal(), decomposition(), digit(), east_asian_width(), mirrored() and
  numeric() in unicodedata)
- unicodedata.lookup() returns len-1 or len-2 strings, instead of always
  len-1 strings (e.g. unicodedata.lookup('AEGEAN WORD SEPARATOR LINE')
  returns '\u0100' on UCS-2 builds, but '\U00010100' on UCS-4 builds)
- unicodedata.normalize(form, s) interprets s as UTF-16 on UCS-2 builds
- all of the above raise ValueError instead of TypeError when passed an
  inappropriate string, e.g. ord('aa')
Any chances?
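
For concreteness, here is a rough sketch of the chr()/ord() behaviour from the first and last items. It models a UCS-2 string as a tuple of 16-bit code units so it runs on any build. The names narrow_chr and narrow_ord are made up for illustration and are not anything in py3k-struni; the surrogate arithmetic is the standard UTF-16 one.

```python
def narrow_chr(i):
    """Return code point i as a tuple of one or two UTF-16 code units."""
    if not 0 <= i < 0x110000:
        raise ValueError("narrow_chr() arg not in range(0x110000)")
    if i < 0x10000:
        # BMP code point: a single code unit (a "len-1 string").
        return (i,)
    # Supplementary code point: split into a high and a low surrogate.
    offset = i - 0x10000
    return (0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF))


def narrow_ord(units):
    """Inverse of narrow_chr(); ValueError for anything else."""
    if len(units) == 1:
        return units[0]
    if (len(units) == 2
            and 0xD800 <= units[0] < 0xDC00
            and 0xDC00 <= units[1] < 0xE000):
        # A well-formed surrogate pair: recombine into one code point.
        return 0x10000 + ((units[0] - 0xD800) << 10) + (units[1] - 0xDC00)
    # Hypothetical choice from the proposal: ValueError, not TypeError.
    raise ValueError("expected one code unit or one surrogate pair")


# The round trip holds for the whole range, including lone surrogates.
assert all(narrow_ord(narrow_chr(i)) == i for i in range(0x110000))
```

With this, narrow_chr(0x10100) gives the two-unit pair for AEGEAN WORD SEPARATOR LINE, and narrow_ord((0x61, 0x61)), the analogue of ord('aa'), raises ValueError.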