[SciPy-user] Re: Help on performance of signal.convolve

Travis Oliphant oliphant at ee.byu.edu
Tue Jan 18 17:56:09 EST 2005


Yichun Wei wrote:

> Hi Travis,
>
> Travis Oliphant wrote:
>
>> So, something like the following untested code should get you started:
>>
>> def convolvefft(arr1, arr2):
>>     s1 = array(arr1.shape)
>>     s2 = array(arr2.shape)
>>     fftsize = s1 + s2 - 1
>>     # find the closest power of 2 in each dimension
>>     # (you may comment this out and compare speeds)
>>     fftsize = pow(2, ceil(log2(fftsize)))
>
>
> I tried it for larger kernels. This does matter!
>
>>
>>     ARR1 = fftn(arr1, fftsize)
>>     ARR2 = fftn(arr2, fftsize)
>>     RES = ifftn(ARR1 * ARR2)
>>     # RES = RES[validpart]  # I'm not sure how to get the correct part ---
>>     # a first try would be to just truncate to the shape you wanted
>>     return RES
>
>
> Using this code a convolution of (16,16,40) kernel with (16,16,1800) 
> signal takes 5s to solve on my 1.8G P-IV CPU.
>
> Since I am only interested in the "valid" part of the convolution, what 
> could be done to speed this up a bit more? I really need speed, for I 
> have to do this convolution many times.

You can also get speed in fftpack by choosing FFT sizes that factor into 
small primes (2, 3, and 5).  You could also try djbfft (a faster FFT 
library).  If you install it in the usual places, the scipy installation 
should pick it up and use it for the FFTs.  You may also get some speedup 
by reusing memory:

RES = fftn(arr1, fftsize)
RES *= fftn(arr2, fftsize)
RES = ifftn(RES)
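
As an illustration of the small-prime advice above, here is a sketch of a helper that rounds a length up to the nearest integer whose only prime factors are 2, 3, and 5.  The name `next_fast_size` is my own, not a scipy function:

```python
# Sketch: round an FFT length up to the nearest "5-smooth" number
# (only prime factors 2, 3, and 5), since fftpack handles such sizes
# much faster than sizes containing large prime factors.
# The helper name next_fast_size is hypothetical, not part of scipy.

def next_fast_size(n):
    """Return the smallest integer >= n with no prime factor > 5."""
    while True:
        m = n
        for p in (2, 3, 5):
            while m % p == 0:
                m //= p
        if m == 1:          # all factors were 2, 3, or 5
            return n
        n += 1

if __name__ == "__main__":
    print(next_fast_size(17))   # -> 18 = 2 * 3^2
    print(next_fast_size(97))   # -> 100 = 2^2 * 5^2
```

Rounding each dimension with such a helper instead of the next power of 2 usually means less padding and therefore less work per transform.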

The valid part is going to be the center abs(s2 - s1) + 1 of RES.  
Unfortunately, I don't know of a fast way to evaluate just the middle 
portion of the fft.  So, any speedups will be in reducing memory 
creation and element-by-element multiplication (you may get up to a 2x 
speedup by using weave to do the in-place multiplication). 
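
Putting the pieces together, here is a hedged sketch of the whole valid-mode FFT convolution using numpy's fft module; the slicing indices are my own reading of "the center abs(s2 - s1) + 1", not code from this thread:

```python
import numpy as np

def convolve_fft_valid(arr1, arr2):
    """FFT-based N-d convolution, returning only the 'valid' part.

    A sketch following the recipe above: pad to the full linear
    convolution size, multiply the spectra in place, then slice out
    the center abs(s2 - s1) + 1 samples along each axis.
    """
    s1 = np.array(arr1.shape)
    s2 = np.array(arr2.shape)
    fftsize = s1 + s2 - 1

    # reuse memory: only one full-size complex array is kept around
    RES = np.fft.fftn(arr1, fftsize)
    RES *= np.fft.fftn(arr2, fftsize)
    RES = np.fft.ifftn(RES)

    # valid region: centered, abs(s2 - s1) + 1 long in each axis
    valid = abs(s2 - s1) + 1
    start = (fftsize - valid) // 2   # works out to min(s1, s2) - 1
    slices = tuple(slice(st, st + v) for st, v in zip(start, valid))
    return RES[slices].real  # real inputs: discard roundoff imaginary part
```

For real inputs this matches direct valid-mode convolution up to floating-point roundoff; padding `fftsize` up to fast sizes (as discussed above) is an optional extra step left out here for clarity.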

-Travis


More information about the SciPy-User mailing list