[issue10542] Py_UNICODE_NEXT and other macros for surrogates

Alexander Belopolsky report at bugs.python.org
Sun Nov 28 02:39:33 CET 2010


Alexander Belopolsky <belopolsky at users.sourceforge.net> added the comment:

I am attaching a patch that defines a Py_UNICODE_PUT_NEXT() macro (tentative name) and uses it to fix the str.upper method.  The implementation of a surrogate-aware str.upper shows that the NEXT/PUT_NEXT abstractions may lead to somewhat inefficient code for "by code point" processing.  The issue is that reading a code point already determines whether it is BMP or non-BMP, so testing the result again in order to write it is somewhat wasteful.  I don't think this would matter in practice, but I would like to hear other opinions before moving further.  (Please don't argue over names - let's figure out the proper semantics first.)
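
The double test described above can be illustrated with a minimal sketch. This is not the code from the attached patch: the function names next_code_point(), put_next_code_point(), map_code_points() and toy_upper() are hypothetical stand-ins, written as plain C functions over a UTF-16 buffer rather than as Py_UNICODE macros. The toy mapping only covers ASCII and Deseret letters, and the destination buffer is assumed to have room for as many code units as the source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical NEXT: read one code point at *pp (requires *pp < end)
   and advance *pp past it. */
static uint32_t next_code_point(const uint16_t **pp, const uint16_t *end)
{
    const uint16_t *p = *pp;
    uint32_t ch = *p++;
    /* First test: a high surrogate followed by a low surrogate
       means the code point is non-BMP. */
    if (ch >= 0xD800 && ch <= 0xDBFF &&
        p < end && *p >= 0xDC00 && *p <= 0xDFFF) {
        ch = 0x10000 + ((ch - 0xD800) << 10) + (uint32_t)(*p++ - 0xDC00);
    }
    *pp = p;
    return ch;
}

/* Hypothetical PUT_NEXT: write one code point at *pp and advance *pp. */
static void put_next_code_point(uint16_t **pp, uint32_t ch)
{
    uint16_t *p = *pp;
    assert(ch <= 0x10FFFF);
    /* Second test: the same BMP / non-BMP question, asked again. */
    if (ch >= 0x10000) {
        ch -= 0x10000;
        *p++ = (uint16_t)(0xD800 + (ch >> 10));
        *p++ = (uint16_t)(0xDC00 + (ch & 0x3FF));
    } else {
        *p++ = (uint16_t)ch;
    }
    *pp = p;
}

/* Toy case mapping (an assumption for illustration only):
   ASCII a-z and Deseret small letters U+10428..U+1044F. */
static uint32_t toy_upper(uint32_t ch)
{
    if (ch >= 'a' && ch <= 'z')
        return ch - ('a' - 'A');
    if (ch >= 0x10428 && ch <= 0x1044F)
        return ch - 0x28;
    return ch;
}

/* "By code point" loop: dst must hold at least len code units, and map
   must not turn a BMP code point into a non-BMP one.
   Returns the number of code units written. */
static size_t map_code_points(const uint16_t *src, size_t len,
                              uint16_t *dst, uint32_t (*map)(uint32_t))
{
    const uint16_t *p = src, *end = src + len;
    uint16_t *q = dst;
    while (p < end)
        put_next_code_point(&q, map(next_code_point(&p, end)));
    return (size_t)(q - dst);
}
```

Each non-BMP code point is classified once when it is read and once more when it is written. A hand-written loop could branch once and handle the surrogate pair in place, at the cost of duplicating the loop body.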

----------
Added file: http://bugs.python.org/file19845/issue10542-put-next.diff

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue10542>
_______________________________________

