[Python-Dev] UCS2/UCS4 default
Armin Ronacher
armin.ronacher at active-4.com
Thu Jul 3 18:30:09 CEST 2008
Guido van Rossum <guido <at> python.org> writes:
> The one thing that may be missing from Python is things like
> interpretation of surrogates by functions like isalpha() and I'm okay
> with adding that (since those have to loop over the entire string
> anyway).
That, and methods to safely iterate over and slice strings by code point.
Java supports that via String.codePointCount / String.codePointAt /
String.codePointBefore / String.offsetByCodePoints. Maybe not on the
unicode/str object itself, but as part of unicodedata it would make sense
for applications that have to deal with Unicode at that level.
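
To make the idea concrete, here is a rough sketch of what such helpers
could look like. The names (utf16_units, iter_code_points,
code_point_count, code_point_at) are hypothetical and are not part of
unicodedata or any existing module. The sketch works on an explicit list
of UTF-16 code units, which is what a UCS2 build stores, so that it
behaves the same regardless of how the interpreter represents strings
internally.

```python
import struct


def utf16_units(text):
    """Return text as a list of UTF-16 code units (hypothetical helper)."""
    data = text.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))


def _is_high(unit):
    return 0xD800 <= unit <= 0xDBFF


def _is_low(unit):
    return 0xDC00 <= unit <= 0xDFFF


def code_point_at(units, index):
    """Code point starting at unit index, loosely modelled on Java's codePointAt.

    A high surrogate followed by a low surrogate is combined into one
    code point; anything else (including a lone surrogate) is returned as is.
    """
    unit = units[index]
    if _is_high(unit) and index + 1 < len(units) and _is_low(units[index + 1]):
        return 0x10000 + ((unit - 0xD800) << 10) + (units[index + 1] - 0xDC00)
    return unit


def iter_code_points(units):
    """Yield code points, stepping over surrogate pairs as single items."""
    i = 0
    while i < len(units):
        cp = code_point_at(units, i)
        yield cp
        # A combined pair occupies two code units, everything else one.
        i += 2 if cp >= 0x10000 else 1


def code_point_count(units):
    """Number of code points, loosely modelled on Java's codePointCount."""
    return sum(1 for _ in iter_code_points(units))


units = utf16_units(u"a\U0001D11Eb")   # 'a', MUSICAL SYMBOL G CLEF, 'b'
print(len(units))                      # number of code units
print(code_point_count(units))         # number of code points
print([hex(cp) for cp in iter_code_points(units)])
```

Slicing by code point could be built on the same loop by translating
code point offsets into code unit offsets first, which is roughly what
offsetByCodePoints does in Java.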
Regards,
Armin