SciPy does not call the MKL convolution function, so it isn't surprising that the thread settings made no difference.

I've had good success with writing my own Cython wrapper around the Intel IPP convolution functions.  
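
This is not the IPP route, but as a cheaper thing to try first: SciPy ships an FFT-based routine, scipy.signal.fftconvolve, which accepts the same mode='valid' argument and can be faster than direct convolution for larger kernels. A minimal sketch follows; the array sizes are made up for illustration, and whether it actually helps depends on your kernel and image sizes.

```python
import numpy as np
from scipy import signal

# Hypothetical sizes, chosen only for illustration.
rng = np.random.RandomState(0)
image = rng.rand(64, 64)
kernel = rng.rand(9, 9)

# Reference result from the routine used in the original loop.
reference = signal.convolve(image, kernel, mode='valid')

# FFT-based convolution; same interface, N-dimensional inputs are supported.
via_fft = signal.fftconvolve(image, kernel, mode='valid')

# The two agree up to floating-point rounding.
print(via_fft.shape, np.allclose(reference, via_fft))
```

Timing both calls on your real shapes is the only reliable way to tell which one wins.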

On Sunday, July 13, 2014, Sai Rajeshwar <rajsai24@gmail.com> wrote:
Hi, thanks for the suggestions.

  Actually, I am running my code on Stampede at TACC, where NumPy and SciPy are built against the MKL libraries for optimal performance. My observations are as follows:
--------------------------------------------------

1) Setting OMP_NUM_THREADS to different values did not change the runtimes.
2) The code took the same time as it did on a Mac Pro with the Accelerate framework for BLAS and LAPACK.

So is MKL not helping, or is it not configured to use multiple threads?

--------------------------
The statements taking a lot of time are as follows:
--------------------

1)  for i in xrange(conv_out_shape[1]):
            conv_out[0][i]=scipy.signal.convolve(self.input[0][i%self.image_shape[1]],numpy.rot90(self.W[0][i//self.image_shape[1]],2),mode='valid')



2)  for i in xrange(pooled_shape[1]):
            for j in xrange(pooled_shape[2]):
                for k in xrange(pooled_shape[3]):
                    for l in xrange(pooled_shape[4]):
                        pooled[0][i][j][k][l]=math.tanh((numpy.sum(conv_out[0][i][j][k*3][l*3:(l+1)*3])+numpy.sum(conv_out[0][i][j][k*3+1][l*3:(l+1)*3])+numpy.sum(conv_out[0][i][j][k*3+2][l*3:(l+1)*3]))/9.0+b[i][j])
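
The loop in (2) averages each non-overlapping 3x3 block over the last two axes, adds a bias and applies tanh, so it can probably be replaced by a single NumPy expression. A minimal sketch, assuming conv_out is a 5-D array of shape (n, I, J, H, W), b has shape (I, J), and the pooled size is (H // 3, W // 3); pool3x3_tanh is a hypothetical helper name, not something from the original code.

```python
import numpy as np

def pool3x3_tanh(conv_out, b):
    """3x3 average pooling over the last two axes, plus bias, then tanh.

    Assumed shapes: conv_out is (n, I, J, H, W) and b is (I, J).
    """
    n, ni, nj, h, w = conv_out.shape
    ph, pw = h // 3, w // 3
    # Drop trailing rows/columns that do not fill a complete 3x3 block.
    trimmed = conv_out[:, :, :, :ph * 3, :pw * 3]
    # Split each spatial axis into (block index, position inside block).
    blocks = trimmed.reshape(n, ni, nj, ph, 3, pw, 3)
    # Averaging over the two "inside block" axes equals sum / 9.0.
    means = blocks.mean(axis=(4, 6))
    # Broadcast the per-(i, j) bias over the batch and spatial axes.
    return np.tanh(means + b[None, :, :, None, None])

# Usage with made-up sizes:
rng = np.random.RandomState(0)
conv_out = rng.rand(1, 2, 3, 7, 10)
b = rng.rand(2, 3)
pooled = pool3x3_tanh(conv_out, b)
print(pooled.shape)
```

This removes the four nested Python loops and the per-element math.tanh call, which is usually where the time goes in code like this.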


thanks

with regards..

M. Sai Rajeswar
M-tech  Computer Technology
IIT Delhi
----------------------------------Cogito Ergo Sum---------


On Fri, Jul 11, 2014 at 6:02 PM, Derek Homeier <derek@astro.physik.uni-goettingen.de> wrote:
On 10 Jul 2014, at 05:19 pm, Ashwin Srinath <ashwinsrnth@gmail.com> wrote:

> I'm no expert, so I'll just share a few links to start this discussion. You definitely want to look at Cython if you're computing with NumPy arrays. If you're familiar with the MPI programming model, you want to check out mpi4py. If you have NVIDIA GPUs that you'd like to take advantage of, check out PyCUDA.
>
> Thanks,
> Ashwin
>
>
> On Thu, Jul 10, 2014 at 6:08 AM, Sai Rajeshwar <rajsai24@gmail.com> wrote:
> hi all,
>
>    I'm trying to optimise Python code that spends a huge amount of time in SciPy functions such as scipy.signal.convolve. Following are some of my queries regarding the same. It would be great to hear from you. Thanks.
> ----------------------------------------------------
> 1) Can SciPy take advantage of multiple cores? If so, how?
> 2) What are the ways we can improve the performance of SciPy/NumPy functions, e.g. using OpenMP, MPI, etc.?
> 3) If SciPy internally uses BLAS/MKL libraries, can we enable parallelism through these?
>
If your operations use the BLAS functions a lot, you get SMP parallelisation very cheaply by
linking to the multithreaded MKL or ACML versions and setting OMP_NUM_THREADS/MKL_NUM_THREADS
to the number of available cores.
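
As a quick check that the thread setting is picked up at all, something like the following can be timed with different values. This is only a sketch: the thread count and matrix size are arbitrary, and it exercises a BLAS-backed matrix product, not scipy.signal.convolve. The variables generally need to be set before NumPy is first imported (or exported in the shell beforehand).

```python
import os

# Hypothetical thread count; set before NumPy loads its BLAS library.
n_threads = "4"
os.environ["OMP_NUM_THREADS"] = n_threads
os.environ["MKL_NUM_THREADS"] = n_threads

import time
import numpy as np

# Arbitrary matrix size, large enough for the product to take measurable time.
a = np.random.rand(500, 500)
b = np.random.rand(500, 500)

start = time.time()
c = a.dot(b)  # matrix product, handled by the BLAS NumPy is linked against
elapsed = time.time() - start
print("threads=%s  dot took %.4f s" % (n_threads, elapsed))
```

If the timing does not change with the thread count here either, the BLAS build is probably not multithreaded; numpy.show_config() reports which libraries NumPy was linked against.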

Cheers,
                                                        Derek

_______________________________________________
SciPy-Dev mailing list
SciPy-Dev@scipy.org
http://mail.scipy.org/mailman/listinfo/scipy-dev