[Numpy-discussion] extension questions: f2py and cython

Thu Oct 15 06:37:13 EDT 2009

Hi,

Sent this last week but checking the archives it appears not to have
got through. Hopefully this will work...

I am looking at moving some of my code to fortran to use with f2py. To
get started I used this simple example:

SUBROUTINE bincount (x,c,n,m)
     IMPLICIT NONE
     INTEGER, INTENT(IN) :: n,m
     INTEGER, DIMENSION(n), INTENT(IN) :: x
     INTEGER, DIMENSION(0:m-1), INTENT(OUT) :: c
     INTEGER :: i

     DO i = 1, n
       c(x(i)) = c(x(i)) + 1
     END DO
END

It performs well:
In [1]: x = np.random.random_integers(0,1023,1000000).astype(int)
In [4]: timeit test.bincount(x,1024)
1000 loops, best of 3: 1.16 ms per loop
In [5]: timeit np.bincount(x)
100 loops, best of 3: 4 ms per loop

I'm guessing most of the benefit comes from less checking + not having
to find the maximum value (which I pass as parameter m).

But I have some questions. It seems to work as is, but I don't set c
to zeros anywhere. Can I assume arrays created by f2py are zero? Is
this the recommended way to use f2py with arrays? (I initially tried
using assumed arrays with DIMENSION(:) but it couldn't get it to
work).
Also I'm quite new to fortran - what would be the advantages,
if any, of using a FORALL instead of DO in a situation like this?
I guess with 1D arrays it doesn't make a copy since ordering is not a
problem, but if it was 2D arrays am I right in thinking that if I
passed in a C order array it would automatically make a copy to F
order. What about the return - will I get a number array in F order,
or will it automatically be copied to C order? (I guess I will see but
I haven't got onto trying that yet).
What if I wanted to keep all the array creation in numpy - ie call it
as the fortran subroutine bincount(x,c) and have c modified in place?
Should I be using !f2py comments? I wasn't clear if these are needed -
it seems to work as is but could they give any improvement?

For comparison I tried the same thing in cython - after a couple of
iterations with not typing things properly I ended up with:

import numpy as np
cimport numpy as np
cimport cython

@cython.boundscheck(False)
def bincount(np.ndarray[np.int_t, ndim=1] x not None,int m):
    cdef int n = x.shape[0]
    cdef unsigned int i
    cdef np.ndarray[np.int_t, ndim=1] c = np.zeros(m,dtype=np.int)

    for i from 0 <= i < n:
        c[<unsigned int>x[i]] += 1

    return c

which now performs a bit better than np.bincount, but still
significantly slower than the fortran. Is this to be expected or am I
missing something in the cython?

In [14]: timeit ctest.bincount(x,1024)
100 loops, best of 3: 3.31 ms per loop

Cheers

Robin