[Cython] question on Get/SetItemInt*() utility functions

Stefan Behnel stefan_ml at behnel.de
Sat Nov 9 07:17:38 CET 2013

Having thought about this come more:

Stefan Behnel, 08.11.2013 15:16:
> the Get/SetItemInt helper macros usually look something like this:
> """
> #define __Pyx_GetItemInt(o, i, size, to_py_func, is_list, wraparound,
> boundscheck) \
>     (((size) <= sizeof(Py_ssize_t)) ? \

I noticed that there is code in ExprNodes.py that increases the "size"
value by one if the type is not signed, so this is currently a safe check.
However, we can do better than that.

>     __Pyx_GetItemInt_Fast(o, i, is_list, wraparound, boundscheck) : \
>     __Pyx_GetItemInt_Generic(o, to_py_func(i)))
> """
> e.g. here:
> https://github.com/cython/cython/blob/59c3408c054bbbb7334ccca7d76bed93f26057e9/Cython/Utility/ObjectHandling.c#L248
> https://github.com/cython/cython/blob/59c3408c054bbbb7334ccca7d76bed93f26057e9/Cython/Utility/StringTools.c#L318
> The "generic" part will almost never be called, except when the index
> *type* is really huge. However, if that happens, there's a rather large
> impact due to object creation, even if the value in question is tiny.
> Is there a way to safely extend this compile time size check with checks
> that compare the actual runtime value with PY_SSIZE_T_MIN and PY_SSIZE_T_MAX ?

Ok, so what are the cases we have here?

1) sizeof(type_of_i) < sizeof(Py_ssize_t)
- no problem (note that this includes char, the only integer type of
undefined signedness)

2) sizeof(type_of_i) > sizeof(Py_ssize_t)
- A comparison "i <= PY_SSIZE_T_MAX" will promote the rhs to the type of i,
which makes it safe.
- A comparison "i >= PY_SSIZE_T_MIN" is only safe if i is signed,
otherwise, it is not needed.
-> Cython knows at compile time if the index type is unsigned and thus can
pass a flag into the C macro, i.e. this should be safe:

 (i <= PY_SSIZE_T_MAX)  &&  (!signed || i >= PY_SSIZE_T_MIN)

3) sizeof(type_of_i) == sizeof(Py_ssize_t)
- If i is signed, both types must be identical (both are integers), so this
is safe.
- If i is unsigned, comparing i <= PY_SSIZE_T_MAX will promote the rhs to
the type of i and is thus safe.
-> this should be safe:

 signed || (i <= PY_SSIZE_T_MAX)

Putting it all together, we can use the fast path in these cases:

    (sizeof(type_of_i) < sizeof(Py_ssize_t))  ||

    (sizeof(type_of_i) > sizeof(Py_ssize_t) &&
          (i <= (type_of_i)PY_SSIZE_T_MAX)  &&
          (!signed || i >= (type_of_i)PY_SSIZE_T_MIN))  ||

    (sizeof(type_of_i) == sizeof(Py_ssize_t) &&
          (signed || (i <= (type_of_i)PY_SSIZE_T_MAX)))

The casts I added should keep the C compiler from complaining about unsafe

Then, for builtin container types, the fallback path already knows that the
index is out of bounds and can always raise an IndexError. For unknown
types, the existing "Generic" fallback function can be used.

Does the above appear correct?


More information about the cython-devel mailing list