[Python-Dev] PEP 393 Summer of Code Project
Victor Stinner
victor.stinner at haypocalc.com
Wed Aug 24 10:17:58 CEST 2011
On 24/08/2011 04:41, Torsten Becker wrote:
> On Tue, Aug 23, 2011 at 18:27, Victor Stinner
> <victor.stinner at haypocalc.com> wrote:
>> I posted a patch to re-add it:
>> http://bugs.python.org/issue12819#msg142867
>
> Thank you for the patch! Note that this patch adds the fast path only
> to the helper function which determines the length of the string and
> the maximum character. The decoding part is still without a fast path
> for ASCII runs.
Ah? If utf8_max_char_size_and_has_errors() returns no error and
maxchar=127: memcpy() is used. You mean that memcpy() is too slow? :-)
maxchar = utf8_max_char_size_and_has_errors(s, size, &unicode_size,
                                            &has_errors);
if (has_errors) {
    ...
}
else {
    unicode = (PyUnicodeObject *)PyUnicode_New(unicode_size, maxchar);
    if (!unicode) return NULL;
    /* When the string is ASCII only, just use memcpy and return. */
    if (maxchar < 128) {
        assert(unicode_size == size);
        Py_MEMCPY(PyUnicode_1BYTE_DATA(unicode), s, unicode_size);
        return (PyObject *)unicode;
    }
    ...
}
But yes, my patch only optimizes ASCII-only strings, not "mostly-ASCII"
strings (e.g. 100 ASCII characters + 1 latin1 character). That case can
be optimized later. I have not benchmarked my patch.
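For the "mostly-ASCII" case, a fast path could copy each run of ASCII
bytes in one go and fall back to the regular decoder only at the first
non-ASCII byte. The following is a minimal sketch of that idea, not code
from the patch: ascii_run_length() and copy_ascii_run() are hypothetical
helper names, and the sketch works on plain byte buffers rather than on
PyUnicode objects.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of CPython or the patch): return the
   length of the leading run of ASCII bytes (values below 128) in
   s[0..size). */
static size_t
ascii_run_length(const unsigned char *s, size_t size)
{
    size_t i = 0;
    while (i < size && s[i] < 128)
        i++;
    return i;
}

/* Hypothetical helper: copy the leading ASCII run of src into dst with
   a single memcpy() and return the number of bytes copied. A real
   decoder would then decode the multi-byte sequence starting at src[n]
   and call this again on the remaining input. */
static size_t
copy_ascii_run(unsigned char *dst, const unsigned char *src, size_t size)
{
    size_t n = ascii_run_length(src, size);
    assert(dst != NULL);
    if (n > 0)
        memcpy(dst, src, n);
    return n;
}
```

For an input such as "abc" followed by the UTF-8 bytes of "é", the first
call copies 3 bytes and stops at the lead byte 0xC3. Whether this is
faster than the existing per-character loop would have to be benchmarked.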
Victor