[SciPy-User] fftconvolve performance depends on padding to a power of two
Andrew York
Andrew.G.York+scipy at gmail.com
Mon May 27 19:11:36 EDT 2013
I'm using scipy.signal.fftconvolve in a performance-sensitive context, and
I've found that its default behavior (padding array size to the next power
of two) does not always give the best speed, and can consume a lot more
memory. I made a modified version that lets the user choose whether or not
to pad to the nearest power of two, and in some contexts, it goes faster by
a reasonable amount:
Shape of input data: (10L, 100L, 100L)
Time elapsed with 2**n padding: 0.144182774132
Time elapsed without 2**n padding: 0.112595033085
Shape of input data: (10L, 200L, 200L)
Time elapsed with 2**n padding: 0.636056829348
Time elapsed without 2**n padding: 0.713067174562
Shape of input data: (10L, 300L, 300L)
Time elapsed with 2**n padding: 3.10080130933
Time elapsed without 2**n padding: 1.0824417215
Assuming I'm not doing something really dumb, this might be a useful option
to expose to the user. Here's the modified code, very similar to the
existing fftconvolve code:
import numpy as np
from numpy import array, product
from scipy.fftpack import fftn, ifftn
from scipy.signal.signaltools import _centered
def fftconvolve(in1, in2, mode="full", pad_to_power_of_two=True):
"""Convolve two N-dimensional arrays using FFT. See convolve.
"""
s1 = array(in1.shape)
s2 = array(in2.shape)
complex_result = (np.issubdtype(in1.dtype, np.complex) or
np.issubdtype(in2.dtype, np.complex))
size = s1 + s2 - 1
if pad_to_power_of_two:
# Use 2**n-sized FFT; it might improve performance
fsize = 2 ** np.ceil(np.log2(size))
else:
# Padding to a power of two might degrade performance, too
fsize = size
IN1 = fftn(in1, fsize)
IN1 *= fftn(in2, fsize)
fslice = tuple([slice(0, int(sz)) for sz in size])
ret = ifftn(IN1)[fslice].copy()
del IN1
if not complex_result:
ret = ret.real
if mode == "full":
return ret
elif mode == "same":
if product(s1, axis=0) > product(s2, axis=0):
osize = s1
else:
osize = s2
return _centered(ret, osize)
elif mode == "valid":
return _centered(ret, abs(s2 - s1) + 1)
if __name__ == '__main__':
import time
for sz in (100, 200, 300):
a = np.zeros((10, sz, sz), dtype=np.float64)
b = np.zeros((10, 20, 20), dtype=np.float64)
print "Shape of input data:", a.shape
start = time.clock()
fftconvolve(a, b, mode='same', pad_to_power_of_two=True)
end = time.clock()
print "Time elapsed with 2**n padding:", end - start
start = time.clock()
fftconvolve(a, b, mode='same', pad_to_power_of_two=False)
end = time.clock()
print "Time elapsed without 2**n padding:", end - start
print
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.scipy.org/pipermail/scipy-user/attachments/20130527/920215cc/attachment.html>
More information about the SciPy-User
mailing list