[Cython] About IndexNode and unicode[index]

Zaur Shibzukhov szport at gmail.com
Fri Mar 1 10:46:39 CET 2013


2013/3/1 Stefan Behnel <stefan_ml at behnel.de>:
> ZS, 28.02.2013 21:07:
>> 2013/2/28 Stefan Behnel:
>>>> This allows writing unicode text parsing code at almost C speed,
>>>> mostly in Python (+ .pxd definitions).
>>>
>>> I suggest simply adding a constant flag argument to the existing function
>>> that states if checking should be done or not. Inlining will let the C
>>> compiler drop the corresponding code, which may or may not make it a little
>>> faster.
>>
>> static inline Py_UCS4 unicode_char2(PyObject* ustring, Py_ssize_t i, int flag) {
>>     Py_ssize_t length;
>> #if CYTHON_PEP393_ENABLED
>>     /* PEP 393 strings must be in canonical form before they are read. */
>>     if (PyUnicode_READY(ustring) < 0) return (Py_UCS4)-1;
>> #endif
>>     if (flag) {
>>         /* Checked path: accept i in [-length, length). */
>>         length = __Pyx_PyUnicode_GET_LENGTH(ustring);
>>         if ((0 <= i) & (i < length)) {
>>             return __Pyx_PyUnicode_READ_CHAR(ustring, i);
>>         } else if ((-length <= i) & (i < 0)) {
>>             /* Negative index counts from the end of the string. */
>>             return __Pyx_PyUnicode_READ_CHAR(ustring, i + length);
>>         } else {
>>             PyErr_SetString(PyExc_IndexError, "string index out of range");
>>             return (Py_UCS4)-1;
>>         }
>>     } else {
>>         /* Unchecked path: the caller guarantees 0 <= i < length. */
>>         return __Pyx_PyUnicode_READ_CHAR(ustring, i);
>>     }
>> }
>
> I think you could even pass in two flags, one for wraparound and one for
> boundscheck, and then just evaluate them appropriately in the existing "if"
> tests above. That should allow both features to be supported independently
> in a fast way.
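
A minimal sketch of that two-flag idea, written against a plain array of
code points so that it compiles on its own. The names ustr_t, ustr_char
and UCHAR_ERROR are made up for illustration only; the real helper would
keep using __Pyx_PyUnicode_GET_LENGTH / __Pyx_PyUnicode_READ_CHAR and set
an IndexError instead of just returning a sentinel.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a unicode object: an array of code points. */
typedef struct {
    const uint32_t *data;
    ptrdiff_t length;
} ustr_t;

/* Sentinel returned on a failed bounds check (mirrors (Py_UCS4)-1). */
#define UCHAR_ERROR ((uint32_t)-1)

/*
 * Read the code point at index i.
 *   wraparound  != 0: a negative i is taken relative to the end.
 *   boundscheck != 0: an index outside [0, length) yields UCHAR_ERROR.
 * When both flags are compile-time constants at the call site, an
 * inlining compiler can drop the tests that are switched off.
 */
static inline uint32_t ustr_char(const ustr_t *s, ptrdiff_t i,
                                 int wraparound, int boundscheck) {
    if (wraparound && i < 0) {
        i += s->length;
    }
    if (boundscheck && ((i < 0) | (i >= s->length))) {
        return UCHAR_ERROR;
    }
    return s->data[i];
}
```

With both flags set to 1 this accepts the same index range as the checked
branch of unicode_char2 above; with both set to 0 it reduces to the bare
read, and the caller has to guarantee a valid index.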
>
>
>> Here are timings:
>>
>> (py33) zbook:mytests $ python3.3 -m timeit -n 50 -r 5 -s "from mytests.unicode_index import test_1" "test_1()"
>> 50 loops, best of 5: 152 msec per loop
>> (py33) zbook:mytests $ python3.3 -m timeit -n 50 -r 5 -s "from mytests.unicode_index import test_2" "test_2()"
>> 50 loops, best of 5: 86.5 msec per loop
>> (py33) zbook:mytests $ python3.3 -m timeit -n 50 -r 5 -s "from mytests.unicode_index import test_3" "test_3()"
>> 50 loops, best of 5: 86.5 msec per loop
>>
>> So your suggestion would be preferable.
>
> Nice. Yes, looks like it's worth it.
>
Could I help to get this included in 0.19?

Zaur Shibzukhov

