
The Cython function below returns a long int:
import numpy as np
cimport numpy as np
cimport cython

@cython.boundscheck(False)
def mysum(np.ndarray[np.int64_t, ndim=1] a):
    "sum of 1d numpy array with dtype=np.int64."
    cdef Py_ssize_t i
    cdef int asize = a.shape[0]
    cdef np.int64_t asum = 0
    for i in range(asize):
        asum += a[i]
    return asum
What's the best way to make it return a NumPy long int, or whatever it is called, that has attributes such as dtype, ndim, and size? The only thing I could come up with is changing the last line to
return np.array(asum)[()]
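To show what that expression produces, here is a minimal plain-Python sketch. The value 45 is only a stand-in for the integer that mysum hands back, and the explicit dtype=np.int64 is my addition so the result does not depend on the platform's default integer type. The direct np.int64(...) call is included purely for comparison; I have not timed it inside Cython.

```python
import numpy as np

asum = 45  # stand-in for the integer returned by mysum (hypothetical value)

# Indexing a 0-d array with an empty tuple extracts a NumPy scalar.
boxed = np.array(asum, dtype=np.int64)[()]

# Calling the scalar type directly yields the same kind of object.
direct = np.int64(asum)

# Both objects expose the array-like attributes mentioned above.
print(type(boxed), boxed.dtype, boxed.ndim, boxed.size)
print(type(direct) is type(boxed), direct == boxed)
```

Either object carries dtype, ndim, and size, so the two forms differ only in how the scalar is constructed.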
It works, but it adds some overhead (mysum2 is the version with the modified return line):
a = np.arange(10)
timeit mysum(a)
10000000 loops, best of 3: 167 ns per loop
timeit mysum2(a)
1000000 loops, best of 3: 984 ns per loop
And for scale:
timeit np.sum(a)
100000 loops, best of 3: 3.3 us per loop
I'm new to Cython. Did I miss any optimizations in the mysum function above?