I have been looking at moving some of my bottleneck functions to Fortran with f2py. To get started I tried some simple things, and was surprised that they performed so much better than the numpy builtins - which I assumed would be C and would be quite fast.
On my MacBook Pro laptop (Intel Core 2 Duo) I got the following results. Numpy is built with Xcode gcc 4.0.1 and gfortran is 4.2.3 (the Fortran code for shuffle and bincount is below):

In : x = np.random.random_integers(0,1023,1000000).astype(int)
In : import ftest
In : timeit np.bincount(x)
100 loops, best of 3: 3.97 ms per loop
In : timeit ftest.bincount(x,1024)
1000 loops, best of 3: 1.15 ms per loop
In : timeit np.random.shuffle(x)
1 loops, best of 3: 605 ms per loop
In : timeit ftest.shuffle(x)
10 loops, best of 3: 139 ms per loop
So Fortran was about 4 times faster for these loops, and similarly faster than Cython as well. I was really happy, as these are two of my biggest bottlenecks, but when I moved to a Linux workstation I got different results. Here with gcc/gfortran 4.3.3:

In : x = np.random.random_integers(0,1023,1000000).astype(int)
In : timeit np.bincount(x)
100 loops, best of 3: 8.18 ms per loop
In : timeit ftest.bincount(x,1024)
100 loops, best of 3: 8.25 ms per loop
In : timeit np.random.shuffle(x)
1 loops, best of 3: 379 ms per loop
In : timeit ftest.shuffle(x)
10 loops, best of 3: 172 ms per loop
So the Fortran shuffle is still faster (about 2x), but np.bincount now runs at the same speed as the Fortran version. The only explanation I can think of is that the more recent C compiler produces much faster code. I think this would also explain why the f2py extension was performing so much better than Cython on the Mac.
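For anyone who wants to repeat the comparison outside IPython, here is a minimal sketch using the standard timeit module. The helper name best_time_ms is made up for illustration, and ftest is the f2py-built module from this post, so it is only timed if it can be imported. np.random.randint is used in place of random_integers, which is deprecated in later numpy releases.

```python
import timeit

import numpy as np


def best_time_ms(func, *args, number=10, repeat=3):
    """Return the best per-call time in milliseconds over `repeat` runs.

    Hypothetical helper: it mimics IPython's "best of 3" reporting.
    """
    timer = timeit.Timer(lambda: func(*args))
    return min(timer.repeat(repeat=repeat, number=number)) / number * 1e3


# Same kind of input as in the sessions above: 1e6 integers in [0, 1023].
x = np.random.randint(0, 1024, size=1000000)

print("np.bincount:    %.2f ms per call" % best_time_ms(np.bincount, x))

try:
    import ftest  # f2py module built from test.f95; assumed, may be absent
except ImportError:
    ftest = None

if ftest is not None:
    print("ftest.bincount: %.2f ms per call"
          % best_time_ms(ftest.bincount, x, 1024))
```

Taking the minimum over several repeats keeps the numbers comparable to the "best of 3" figures quoted above.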
So my question is - is there a way to build numpy with a more recent compiler on Leopard? (I guess I could upgrade to Snow Leopard now.) Could I make the numpy install use gcc-4.2 from Xcode, or would it break stuff? Could I use gcc 4.3.3 from MacPorts? It would be great to get a 4x speed up on all numpy C loops! (Even for just these two functions, which I use a lot, it would make a big difference.)
Forgot to include the fortran code used:
jm-g26b101:fortran robince$ cat test.f95
subroutine bincount (x,c,n,m)
    implicit none
    integer, intent(in) :: n,m
    integer, dimension(0:n-1), intent(in) :: x
    integer, dimension(0:m-1), intent(out) :: c
    integer :: i

    c = 0
    do i = 0, n-1
        c(x(i)) = c(x(i)) + 1
    end do
end

subroutine shuffle (x,s,n)
    implicit none
    integer, intent(in) :: n
    integer, dimension(n), intent(in) :: x
    integer, dimension(n), intent(out) :: s
    integer :: i,randpos,temp
    real :: r

    ! copy input
    s = x
    call init_random_seed()
    ! knuth shuffle from http://rosettacode.org/wiki/Knuth_shuffle#Fortran
    do i = n, 2, -1
        call random_number(r)
        randpos = int(r * i) + 1
        temp = s(randpos)
        s(randpos) = s(i)
        s(i) = temp
    end do
end
subroutine init_random_seed()
    ! init_random_seed from gfortran documentation
    integer :: i, n, clock
    integer, dimension(:), allocatable :: seed

    call random_seed(size = n)
    allocate(seed(n))

    ! read the system clock so that clock is defined before it is used
    call system_clock(count = clock)

    seed = clock + 37 * (/ (i - 1, i = 1, n) /)
    call random_seed(put = seed)

    deallocate(seed)
end subroutine
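To sanity-check the Fortran routines above, a slow pure-Python reference can help. This is only a sketch of the same two loops. The names bincount_ref and knuth_shuffle_ref are made up for illustration and are not part of numpy or ftest, and the random numbers come from Python's random module rather than Fortran's random_number.

```python
import random


def bincount_ref(x, m):
    """Count occurrences of each value 0..m-1 in x (mirrors the Fortran loop)."""
    c = [0] * m
    for v in x:
        c[v] += 1
    return c


def knuth_shuffle_ref(x, rng=random):
    """Return a shuffled copy of x using the same Knuth shuffle scheme."""
    s = list(x)  # copy input, like `s = x` in the Fortran
    # The Fortran loop runs i = n..2 with 1-based indices; here i is the
    # 0-based index of the last unshuffled slot, so it runs n-1..1.
    for i in range(len(s) - 1, 0, -1):
        j = int(rng.random() * (i + 1))  # uniform integer in 0..i
        s[i], s[j] = s[j], s[i]
    return s


counts = bincount_ref([0, 1, 1, 3], 4)  # [1, 2, 0, 1]
shuffled = knuth_shuffle_ref(range(10))
```

The counts can be compared element by element against np.bincount or ftest.bincount on the same input. For the shuffle, only the sorted output can be compared, since the random streams differ.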