[Python-Dev] UCS2/UCS4 default

Armin Ronacher armin.ronacher at active-4.com
Thu Jul 3 18:30:09 CEST 2008


Guido van Rossum <guido <at> python.org> writes:

> The one thing that may be missing from Python is things like
> interpretation of surrogates by functions like isalpha() and I'm okay
> with adding that (since those have to loop over the entire string
> anyway).

That, plus methods to safely iterate over and slice strings by code point.
Java supports this via String.codePointCount / String.codePointAt /
String.codePointBefore / String.offsetByCodePoints.  These need not live on
the unicode/str object itself; as part of unicodedata they would make sense
for applications that have to deal with Unicode at that level.
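
A rough sketch of what such helpers could look like, assuming the string is
available as a sequence of UTF-16 code units (as on a UCS2 build).  The names
to_utf16_units, iter_codepoints, codepoint_count and offset_by_codepoints are
hypothetical, loosely modelled on the Java methods; none of them exist in
unicodedata.

```python
import struct


def to_utf16_units(text):
    """Return the UTF-16 code units of text as a list of ints.

    Hypothetical helper: emulates how a UCS2 build stores a string.
    """
    data = text.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))


def _is_pair(units, i):
    """True if units[i] and units[i + 1] form a valid surrogate pair."""
    return (
        i + 1 < len(units)
        and 0xD800 <= units[i] <= 0xDBFF
        and 0xDC00 <= units[i + 1] <= 0xDFFF
    )


def iter_codepoints(units):
    """Yield code points, joining surrogate pairs into one value.

    Unpaired surrogates are yielded unchanged.
    """
    i = 0
    while i < len(units):
        if _is_pair(units, i):
            high, low = units[i], units[i + 1]
            yield 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
            i += 2
        else:
            yield units[i]
            i += 1


def codepoint_count(units):
    """Number of code points in units (cf. Java's codePointCount)."""
    return sum(1 for _ in iter_codepoints(units))


def offset_by_codepoints(units, index, count):
    """Code-unit index reached by moving count (>= 0) code points forward."""
    for _ in range(count):
        if index >= len(units):
            raise IndexError("offset past end of string")
        index += 2 if _is_pair(units, index) else 1
    return index


units = to_utf16_units("a\U0001D11Eb")
print(len(units), codepoint_count(units))
```

Slicing by code point then amounts to converting both bounds with
offset_by_codepoints and slicing the code units, so a surrogate pair is never
split in half.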

Regards,
Armin


