[SciPy-User] fast small matrix multiplication with cython?
josef.pktd at gmail.com
josef.pktd at gmail.com
Mon Dec 6 19:33:13 EST 2010
On Mon, Dec 6, 2010 at 5:34 PM, Skipper Seabold <jsseabold at gmail.com> wrote:
> I'm wondering if anyone might have a look at my cython code that does
> matrix multiplication and see where I can speed it up or offer some
> pointers/reading. I'm new to Cython and my knowledge of C is pretty
> basic based on trial and (mostly) error, so I am sure the code is
> still very naive.
>
> import numpy as np
> from matmult import dotAB, multAB
>
> A = np.array([[ 1., 3., 4.],
> [ 5., 6., 3.]])
> B = A.T.copy()
>
> timeit dotAB(A,B)
> # 1 loops, best of 3: 826 ms per loop
>
> timeit multAB(A,B)
> # 1 loops, best of 3: 1.16 s per loop
>
> As you can see my multAB results in a negative speedup of about .75.
>
> I compile the cython code with
>
> cython -a matmult.pyx
> gcc -shared -pthread -fPIC -fwrapv -O2 -Wall -fno-strict-aliasing
> -I/usr/include/python2.6
> -I/usr/local/lib/python2.6/dist-packages/numpy/core/include/ -o
> matmult.so matmult.c
>
> Cython code is attached and inlined below.
>
> Profile is here (some of which I don't understand why there are
> bottlenecks) http://eagle1.american.edu/~js2796a/matmult/matmult.html
> -----------------------------------------------------------
>
> from numpy cimport float64_t, ndarray, NPY_DOUBLE, npy_intp
> cimport cython
> from numpy import dot
>
> ctypedef float64_t DOUBLE
>
> cdef extern from "numpy/arrayobject.h":
> cdef void import_array()
> cdef object PyArray_SimpleNew(int nd, npy_intp *dims, int typenum)
>
> import_array()
>
> @cython.boundscheck(False)
> @cython.wraparound(False)
> cdef inline object matmult(ndarray[DOUBLE, ndim=2, mode='c'] A,
> ndarray[DOUBLE, ndim=2, mode='c'] B):
> cdef int lda = A.shape[0]
> cdef int n = B.shape[1]
> cdef npy_intp *dims = [lda, n]
> cdef ndarray[DOUBLE, ndim=2] out = PyArray_SimpleNew(2, dims, NPY_DOUBLE)
> cdef int i,j,k
> cdef double s
> for i in xrange(lda):
> for j in xrange(n):
> s = 0
> for k in xrange(A.shape[1]):
> s += A[i,k] * B[k,j]
> out[i,j] = s
> return out
>
> def multAB(ndarray[DOUBLE, ndim=2] A, ndarray[DOUBLE, ndim=2] B):
> for i in xrange(1000000):
> C = matmult(A,B)
> return C
Does this generate c code, since it's not a cdef ? (I haven't updated
cython in a while.)
I guess you would want to have the entire loop in c.
Josef
>
> def dotAB(ndarray[DOUBLE, ndim=2] A, ndarray[DOUBLE, ndim=2] B):
> for i in xrange(1000000):
> C = dot(A,B)
> return C
>
> Skipper
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>
>
More information about the SciPy-User
mailing list