OK Luke, thanks. Could you throw some light on writing a Cython wrapper for the IPP convolution functions? How should I go about it, to start with? A bit of detail would be helpful. Thanks.
*with regards..*
*M. Sai Rajeswar*
*M-tech Computer Technology*
*IIT Delhi----------------------------------Cogito Ergo Sum---------*
On Sun, Jul 13, 2014 at 6:00 PM, Luke Pfister <luke.pfister@gmail.com> wrote:
Scipy does not call the MKL convolution function, so that isn't surprising.
I've had good success with writing my own Cython wrapper around the Intel IPP convolution functions.
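The general shape of such a wrapper is roughly the sketch below. Caveat: this is untested, and the entry point ippsConv_32f and its signature are an assumption based on the IPP 7/8-era signal-processing API (newer IPP releases renamed and rebuffered these routines), so check the names against your ipps.h before building.

    # convolve_ipp.pyx -- sketch only; verify names/signatures against ipps.h
    cimport numpy as cnp
    import numpy as np

    cdef extern from "ipps.h":
        ctypedef float Ipp32f
        int ippsConv_32f(const Ipp32f* pSrc1, int src1Len,
                         const Ipp32f* pSrc2, int src2Len, Ipp32f* pDst)

    def conv_ipp(cnp.ndarray[cnp.float32_t, ndim=1] x,
                 cnp.ndarray[cnp.float32_t, ndim=1] h):
        # IPP computes the full linear convolution, length len(x)+len(h)-1.
        cdef cnp.ndarray[cnp.float32_t, ndim=1] out = \
            np.empty(x.shape[0] + h.shape[0] - 1, dtype=np.float32)
        status = ippsConv_32f(&x[0], x.shape[0], &h[0], h.shape[0], &out[0])
        if status != 0:
            raise RuntimeError("IPP returned status %d" % status)
        return out

You would compile it with a setup.py that links against the IPP libraries (e.g. -lipps -lippcore), and trim the result yourself if you want 'valid'-mode output, since IPP only returns the full convolution.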
On Sunday, July 13, 2014, Sai Rajeshwar <rajsai24@gmail.com> wrote:
hi, thanks for the suggestions.
Actually I am running my code on Stampede (TACC), where NumPy and SciPy are built against the MKL libraries for optimal performance. My observations are as follows:
1) Setting OMP_NUM_THREADS to different values did not change the runtimes.
2) The code took the same time as it did on a Mac Pro using the Accelerate framework for BLAS and LAPACK.
So is MKL not being helpful, or is it not getting configured to use multiple threads?
The statements taking a lot of time are the following:
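(One quick way to tell whether the NumPy build is actually linked against MKL at all — if MKL does not appear in this output, MKL_NUM_THREADS cannot have any effect:)

```python
import numpy as np

# Print the BLAS/LAPACK build configuration; look for "mkl" in the output.
np.show_config()
```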
1) for i in xrange(conv_out_shape[1]):
       conv_out[0][i] = scipy.signal.convolve(
           self.input[0][i % self.image_shape[1]],
           numpy.rot90(self.W[0][i / self.image_shape[1]], 2),
           mode='valid')
2) for i in xrange(pooled_shape[1]):
       for j in xrange(pooled_shape[2]):
           for k in xrange(pooled_shape[3]):
               for l in xrange(pooled_shape[4]):
                   pooled[0][i][j][k][l] = math.tanh(
                       (numpy.sum(conv_out[0][i][j][k*3][l*3:(l+1)*3])
                        + numpy.sum(conv_out[0][i][j][k*3+1][l*3:(l+1)*3])
                        + numpy.sum(conv_out[0][i][j][k*3+2][l*3:(l+1)*3])) / 9.0
                       + b[i][j])
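Incidentally, convolving with a kernel rotated by 180 degrees (statement 1) is exactly cross-correlation with the unrotated kernel, and for images of this kind an FFT-based convolution is often much faster than the direct method. A small self-contained check (the shapes here are made up; the real arrays come from self.input and self.W):

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(0)
img = rng.standard_normal((64, 64))
w = rng.standard_normal((5, 5))

# Convolution with a 180-degree-rotated kernel...
a = signal.convolve(img, np.rot90(w, 2), mode='valid')
# ...is cross-correlation with the unrotated kernel:
b = signal.correlate(img, w, mode='valid')
# FFT-based convolution computes the same thing, usually faster:
c = signal.fftconvolve(img, np.rot90(w, 2), mode='valid')
```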
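The quadruple loop in statement 2 is a 3x3 non-overlapping mean-pool followed by tanh, which NumPy can do in one vectorized shot via a reshape. A sketch (pool_tanh is a hypothetical helper, assuming the pooled spatial dimensions are exactly one third of conv_out's, as the k*3/l*3 indexing implies):

```python
import numpy as np

def pool_tanh(conv_out, b):
    """3x3 mean-pool each spatial map, add a per-map bias, apply tanh.

    conv_out: shape (1, I, J, H, W) with H and W divisible by 3; b: (I, J).
    """
    n, I, J, H, W = conv_out.shape
    # Split each (H, W) map into a grid of 3x3 blocks...
    blocks = conv_out.reshape(n, I, J, H // 3, 3, W // 3, 3)
    # ...and average over each block (same as the three row-sums / 9.0).
    means = blocks.mean(axis=(4, 6))
    return np.tanh(means + b[None, :, :, None, None])
```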
thanks
On Fri, Jul 11, 2014 at 6:02 PM, Derek Homeier < derek@astro.physik.uni-goettingen.de> wrote:
On 10 Jul 2014, at 05:19 pm, Ashwin Srinath <ashwinsrnth@gmail.com> wrote:
I'm no expert, so I'll just share a few links to start this discussion. You definitely want to look at Cython if you're computing with NumPy arrays. If you're familiar with the MPI programming model, you want to check out mpi4py. If you have NVIDIA GPUs that you'd like to take advantage of, check out PyCUDA.
Thanks, Ashwin
On Thu, Jul 10, 2014 at 6:08 AM, Sai Rajeshwar <rajsai24@gmail.com> wrote:
hi all,
I'm trying to optimise a Python code that spends a huge amount of time in SciPy functions such as scipy.signal.convolve. Here are some of my queries regarding this; it would be great to hear from you. Thanks.
1) Can SciPy take advantage of multiple cores? If so, how?
2) What are the ways to improve the performance of SciPy/NumPy functions, e.g. using OpenMP, MPI, etc.?
3) If SciPy internally uses BLAS/MKL libraries, can we enable parallelism through these?
If your operations use the BLAS functions a lot, you get SMP parallelisation very cheaply by linking against the multithreaded MKL or ACML versions and setting OMP_NUM_THREADS/MKL_NUM_THREADS to the number of available cores.
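In code that could look like the following minimal sketch; the caps must be set before NumPy's first import, which is when the BLAS runtime gets loaded, and "8" here is just an example core count:

```python
import os

# Set thread caps BEFORE importing NumPy, which loads MKL/OpenMP.
os.environ["MKL_NUM_THREADS"] = "8"
os.environ["OMP_NUM_THREADS"] = "8"

import numpy as np

a = np.random.rand(1000, 1000)
c = a.dot(a)  # a large GEMM; runs multithreaded if NumPy links a threaded BLAS
```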
Cheers, Derek
_______________________________________________ SciPy-Dev mailing list SciPy-Dev@scipy.org http://mail.scipy.org/mailman/listinfo/scipy-dev