[Cython] `cdef inline` and typed memory views

Dimitri Tcaciuc dtcaciuc at gmail.com
Sat Apr 21 21:17:52 CEST 2012


Hey everyone,

Congratulations on shipping 0.16! I think I found a problem which
seems pretty straightforward. Say I want to factor out the inner part of
some N^2 loops over a float array, so I write something like

  from libc.math cimport sqrtf

  cdef inline float _inner(size_t i, size_t j, float[:] x):
      cdef float d = x[i] - x[j]
      return sqrtf(d * d)

In 0.16, this actually compiles (as opposed to 0.15 with ndarray) and
the function is declared inline, which is great. However, the
memoryview structure is passed by value:

  static CYTHON_INLINE float __pyx_f_3foo__inner(size_t __pyx_v_i,
size_t __pyx_v_j, __Pyx_memviewslice __pyx_v_x) {
     ...

This seems to hinder the compiler's (in my case, GCC 4.3.4) ability to
perform efficient inlining (although the function does in fact get
inlined). If I manually inline the distance calculation, I get more
than a 3x speedup (in my case 0.324020147324 vs 1.43209195137 seconds
for 10k elements, about 4.4x). When I manually modified the generated
.c file to pass the memory view slice by pointer, the slowdown was
eliminated completely.
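
To make the two calling conventions concrete, here is a minimal C sketch
of the by-value and by-pointer variants. It is not the code Cython
generates: `slice_t` is a hypothetical stand-in for `__Pyx_memviewslice`
(its field layout is an assumption on my part), and the function names
are made up for illustration. The distance is written as an absolute
difference so the sketch does not need libm.

```c
#include <stddef.h>

#define MAX_DIMS 8  /* assumption: the slice struct carries fixed-size per-dimension arrays */

/* Hypothetical stand-in for __Pyx_memviewslice; NOT the real definition. */
typedef struct {
    void *memview;                  /* owning memoryview object (unused here) */
    char *data;                     /* pointer to the first element */
    ptrdiff_t shape[MAX_DIMS];
    ptrdiff_t strides[MAX_DIMS];    /* strides in bytes */
    ptrdiff_t suboffsets[MAX_DIMS];
} slice_t;

/* Same value as sqrtf(d * d) for ordinary inputs, without calling libm. */
static inline float abs_diff(float a, float b) {
    float d = a - b;
    return d < 0.0f ? -d : d;
}

/* By value: the whole struct is copied at each call unless the optimizer
   manages to elide the copy after inlining. */
static inline float inner_by_value(size_t i, size_t j, slice_t x) {
    float a = *(float *)(x.data + (ptrdiff_t)i * x.strides[0]);
    float b = *(float *)(x.data + (ptrdiff_t)j * x.strides[0]);
    return abs_diff(a, b);
}

/* By pointer: only an address is passed, so there is nothing to copy. */
static inline float inner_by_ptr(size_t i, size_t j, const slice_t *x) {
    float a = *(float *)(x->data + (ptrdiff_t)i * x->strides[0]);
    float b = *(float *)(x->data + (ptrdiff_t)j * x->strides[0]);
    return abs_diff(a, b);
}

/* N^2 loop using the by-value helper. */
static float sum_by_value(slice_t x) {
    float total = 0.0f;
    size_t n = (size_t)x.shape[0];
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            total += inner_by_value(i, j, x);
    return total;
}

/* Same loop using the by-pointer helper. */
static float sum_by_ptr(const slice_t *x) {
    float total = 0.0f;
    size_t n = (size_t)x->shape[0];
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            total += inner_by_ptr(i, j, x);
    return total;
}
```

Both loops compute the same result. Whether the by-value copy actually
costs anything depends on the compiler version and flags, so the timing
difference should be measured rather than assumed.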

On a somewhat related note, have you considered enabling the Issues page on GitHub?


Thanks!


Dimitri.

