[Numpy-discussion] Vectorize or rewrite function to work with array inputs?

Sturla Molden sturla at molden.no
Tue Feb 1 16:10:20 EST 2011


On 01.02.2011 20:50, John Salvatier wrote:
> I am curious: why do you recommend against this? Using the C-API through 
> cython seems more attractive than using the Cython-specific numpy 
> features since they need a specific number of dimensions (and I don't 
> think they broadcast) and don't seem that much easier. Are there 
> hidden disadvantages to using the C-API? Have I misunderstood something?

There is one more thing which should be mentioned:

If the algorithm needs to pass an ndarray to a function, this might 
cause excessive reference counting and boxing/unboxing with Cython. It 
gets even worse if we want to pass a subarray view of an ndarray, for 
which Cython will create a new array object. Cython will only play 
nicely with NumPy arrays if there are no function calls. If there are 
function calls, we must give up ndarrays and use C pointers directly. 
Users who don't know the internals of Cython might think that using an 
ndarray as a function argument is a good idea, or even use slicing to 
create a view of a subarray. Cython does not forbid this, but it hurts 
performance immensely. In the NumPy C API we can pass a pointer to a 
PyArrayObject, adding no overhead beyond the function call. But 
subarray handling there is just as primitive as in Cython. This makes 
anything but trivial "computational kernels" painful to write in either 
Cython or the NumPy C API.
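
To make "use C pointers directly" concrete, here is a minimal sketch in 
plain C. It is an illustration only, not code from Cython or the NumPy 
C API: the names sum_strided and sum_block are hypothetical, and the 
array is assumed to be a contiguous, row-major block of doubles. A 
"subarray" is just a base pointer plus an offset and a stride, so no 
array object is created per call, but the caller has to do all the index 
arithmetic by hand.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sum n doubles starting at data, stepping
   `stride` elements between consecutive items. The "view" is only
   a pointer and two integers, so no object is allocated. */
double sum_strided(const double *data, size_t n, ptrdiff_t stride)
{
    assert(data != NULL || n == 0);
    double total = 0.0;
    for (size_t i = 0; i < n; ++i)
        total += data[(ptrdiff_t)i * stride];
    return total;
}

/* Hypothetical helper: sum the block rows [r0, r0+nr) x cols [c0, c0+nc)
   of a row-major matrix with ncols columns. Each row of the block is
   handed to sum_strided as a pointer into the parent buffer; the offset
   arithmetic below is exactly what an array-aware language would do
   for us. */
double sum_block(const double *a, size_t ncols,
                 size_t r0, size_t nr, size_t c0, size_t nc)
{
    assert(a != NULL || nr == 0 || nc == 0);
    double total = 0.0;
    for (size_t r = 0; r < nr; ++r)
        total += sum_strided(a + (r0 + r) * ncols + c0, nc, 1);
    return total;
}
```

For a 3x4 row-major array a, row 1 would be sum_strided(a + 4, 4, 1) 
and column 2 would be sum_strided(a + 2, 3, 4). The function calls cost 
nothing beyond the call itself, but every offset and stride is the 
programmer's responsibility, which is the "primitive" part.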

Personally I avoid both Cython and the NumPy C API for this reason. 
There is a very nice language called Fortran 95, for which the compiler 
knows how to work with arrays. For those who don't use Fortran 95, 
there are libraries for C++ such as Blitz++.


Sturla





