On Mon, Jan 3, 2011 at 2:46 PM, David Cournapeau <cournape@gmail.com> wrote:
Hi,
I finally took the time to clean up my code to speed up prime-size FFT (which use a O(N^2) algo in both numpy and scipy). The code is there: https://github.com/cournape/numpy/tree/bluestein (most of the code is tests, because numpy.fft had almost none).
Bottom line: it is used only for prime numbers, and is faster than the
current code for complex transforms > 500. Because of python + inherent bluestein overhead, this is mostly useful for "long" fft (where the speed up is significant - already 100x speed up for prime size ~ 50000).
Very nice, works like a charm for me!
Several comments: - the overhead is pretty significant (on my machine, bluestein transfrom is slower for prime size < 500) - it could be used as such for real transforms, but the overhead would be even more significant (there is no bluestein transform for real transforms, so one needs to re-rexpress real transforms in term of complex ones, multiplying the overhead by 2x). There are several alternatives to make things faster (Rader-like transform, as used by fftw), but I think this would be quite hard to do in python without significant slowdown, because the code cannot be vectorized. - one could also decide to provide a chirp-z transform, of which Bluestein transform is a special case. Maybe this is more adapted to scipy ?
This is just terminology, but according to Wikipedia the Bluestein transform is the chirp-z transform, which is a special case of the z-transform. Is that what you meant? A z-transform may also be useful for digital filter design and other applications, scipy seems like the right place for it.
- more generic code will require a few simple (but not trivial) arithmetic-like functions (find prime factors, find generator of Z/nZ groups with n prime, etc...). Where should I put those ?
I'm guessing you are talking about code that allows you to use the Bluestein algorithm also for non-prime sizes where it makes sense, for example to speed up the second case of this:
In [24]: x = np.random.random(5879) # a large prime In [25]: %timeit np.fft.fft(x) 100 loops, best of 3: 8.65 ms per loop In [26]: x = np.random.random(5879*2) # Bluestein not used In [27]: %timeit np.fft.fft(x) 1 loops, best of 3: 241 ms per loop Probably just keep it in fft/helper.py is it's not too much code? Cheers, Ralf