[Cython] About IndexNode and unicode[index]

ZS szport at gmail.com
Fri Mar 1 07:43:34 CET 2013


2013/3/1 Stefan Behnel <stefan_ml at behnel.de>:
> ZS, 28.02.2013 21:07:
>> 2013/2/28 Stefan Behnel:
>>>> This allows writing unicode text parsing code at almost C speed,
>>>> mostly in Python (+ .pxd definitions).
>>>
>>> I suggest simply adding a constant flag argument to the existing function
>>> that states if checking should be done or not. Inlining will let the C
>>> compiler drop the corresponding code, which may or may not make it a little
>>> faster.
>>
>> static inline Py_UCS4 unicode_char2(PyObject* ustring, Py_ssize_t i, int flag) {
>>     Py_ssize_t length;
>> #if CYTHON_PEP393_ENABLED
>>     if (PyUnicode_READY(ustring) < 0) return (Py_UCS4)-1;
>> #endif
>>     if (flag) {
>>         length = __Pyx_PyUnicode_GET_LENGTH(ustring);
>>         if ((0 <= i) & (i < length)) {
>>             return __Pyx_PyUnicode_READ_CHAR(ustring, i);
>>         } else if ((-length <= i) & (i < 0)) {
>>             return __Pyx_PyUnicode_READ_CHAR(ustring, i + length);
>>         } else {
>>             PyErr_SetString(PyExc_IndexError, "string index out of range");
>>             return (Py_UCS4)-1;
>>         }
>>     } else {
>>         return __Pyx_PyUnicode_READ_CHAR(ustring, i);
>>     }
>> }
>
> I think you could even pass in two flags, one for wraparound and one for
> boundscheck, and then just evaluate them appropriately in the existing "if"
> tests above. That should allow both features to be supported independently
> in a fast way.
>
Interesting. Can C compilers, in optimization mode, eliminate the unused
evaluation paths in nested if statements when the conditions are constant
expressions?
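
For illustration, here is a minimal sketch of the two-flag idea, assuming a
plain array in place of the PyObject so that it compiles without the Python
headers. The names read_char, read_checked, read_nowrap and read_unchecked
are hypothetical and not part of Cython. Each wrapper passes literal flag
values, so once read_char is inlined the conditions become compile-time
constants; optimizing compilers such as GCC and Clang typically fold them
and drop the dead branches, which can be confirmed by inspecting the
assembly produced with -O2 -S.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the quoted helper: it reads from a plain
   array instead of a unicode object. Returns -1 as an error marker
   where the real helper would also set IndexError. */
static inline int32_t read_char(const uint32_t *buf, ptrdiff_t length,
                                ptrdiff_t i, int wraparound, int boundscheck)
{
    if (wraparound && i < 0)
        i += length;                 /* map negative indices from the end */
    if (boundscheck && (i < 0 || i >= length))
        return -1;                   /* index out of range */
    return (int32_t)buf[i];
}

/* Both features enabled: neither test can be removed. */
int32_t read_checked(const uint32_t *buf, ptrdiff_t length, ptrdiff_t i)
{
    return read_char(buf, length, i, 1, 1);
}

/* Bounds check only: the wraparound test has a constant false condition. */
int32_t read_nowrap(const uint32_t *buf, ptrdiff_t length, ptrdiff_t i)
{
    return read_char(buf, length, i, 0, 1);
}

/* Both flags are 0: after inlining only the array read should remain.
   The caller must guarantee 0 <= i < length. */
int32_t read_unchecked(const uint32_t *buf, ptrdiff_t length, ptrdiff_t i)
{
    return read_char(buf, length, i, 0, 0);
}
```

Whether the branches really disappear depends on the compiler and on the
function actually being inlined, so it is worth checking the generated code
rather than relying on it.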

