Prime size FFT: bluestein transform vs general chirp/z transform ?
Hi, I finally took the time to clean up my code to speed up prime-size FFT (which use a O(N^2) algo in both numpy and scipy). The code is there: https://github.com/cournape/numpy/tree/bluestein (most of the code is tests, because numpy.fft had almost none). Bottom line: it is used only for prime numbers, and is faster than the current code for complex transforms > 500. Because of python + inherent bluestein overhead, this is mostly useful for "long" fft (where the speed up is significant - already 100x speed up for prime size ~ 50000). Several comments: - the overhead is pretty significant (on my machine, bluestein transfrom is slower for prime size < 500) - it could be used as such for real transforms, but the overhead would be even more significant (there is no bluestein transform for real transforms, so one needs to re-rexpress real transforms in term of complex ones, multiplying the overhead by 2x). There are several alternatives to make things faster (Rader-like transform, as used by fftw), but I think this would be quite hard to do in python without significant slowdown, because the code cannot be vectorized. - one could also decide to provide a chirp-z transform, of which Bluestein transform is a special case. Maybe this is more adapted to scipy ? - more generic code will require a few simple (but not trivial) arithmetic-like functions (find prime factors, find generator of Z/nZ groups with n prime, etc...). Where should I put those ? cheers, David
On Mon, Jan 3, 2011 at 2:46 PM, David Cournapeau <cournape@gmail.com> wrote:
Hi,
I finally took the time to clean up my code to speed up prime-size FFT (which use a O(N^2) algo in both numpy and scipy). The code is there: https://github.com/cournape/numpy/tree/bluestein (most of the code is tests, because numpy.fft had almost none).
Bottom line: it is used only for prime numbers, and is faster than the
current code for complex transforms > 500. Because of python + inherent bluestein overhead, this is mostly useful for "long" fft (where the speed up is significant - already 100x speed up for prime size ~ 50000).
Very nice, works like a charm for me!
Several comments: - the overhead is pretty significant (on my machine, bluestein transfrom is slower for prime size < 500) - it could be used as such for real transforms, but the overhead would be even more significant (there is no bluestein transform for real transforms, so one needs to re-rexpress real transforms in term of complex ones, multiplying the overhead by 2x). There are several alternatives to make things faster (Rader-like transform, as used by fftw), but I think this would be quite hard to do in python without significant slowdown, because the code cannot be vectorized. - one could also decide to provide a chirp-z transform, of which Bluestein transform is a special case. Maybe this is more adapted to scipy ?
This is just terminology, but according to Wikipedia the Bluestein transform is the chirp-z transform, which is a special case of the z-transform. Is that what you meant? A z-transform may also be useful for digital filter design and other applications, scipy seems like the right place for it.
- more generic code will require a few simple (but not trivial) arithmetic-like functions (find prime factors, find generator of Z/nZ groups with n prime, etc...). Where should I put those ?
I'm guessing you are talking about code that allows you to use the Bluestein algorithm also for non-prime sizes where it makes sense, for example to speed up the second case of this:
In [24]: x = np.random.random(5879) # a large prime In [25]: %timeit np.fft.fft(x) 100 loops, best of 3: 8.65 ms per loop In [26]: x = np.random.random(5879*2) # Bluestein not used In [27]: %timeit np.fft.fft(x) 1 loops, best of 3: 241 ms per loop Probably just keep it in fft/helper.py is it's not too much code? Cheers, Ralf
I just took a look at http://www.katjaas.nl/chirpZ/chirpZ2.html I'm VERY interested in the zoom. Does the code https://github.com/cournape/numpy/tree/bluestein implement the zoom feature?
On Tue, Jan 18, 2011 at 3:12 AM, Neal Becker <ndbecker2@gmail.com> wrote:
I just took a look at
http://www.katjaas.nl/chirpZ/chirpZ2.html
I'm VERY interested in the zoom. Does the code
https://github.com/cournape/numpy/tree/bluestein
implement the zoom feature?
No - the current code is only an implementation optimization to deal with fft size which cannot be factorized in small factors. cheers, David
On Sun, Jan 16, 2011 at 2:22 PM, Ralf Gommers <ralf.gommers@googlemail.com> wrote:
On Mon, Jan 3, 2011 at 2:46 PM, David Cournapeau <cournape@gmail.com> wrote:
Hi,
I finally took the time to clean up my code to speed up prime-size FFT (which use a O(N^2) algo in both numpy and scipy). The code is there: https://github.com/cournape/numpy/tree/bluestein (most of the code is tests, because numpy.fft had almost none).
Bottom line: it is used only for prime numbers, and is faster than the current code for complex transforms > 500. Because of python + inherent bluestein overhead, this is mostly useful for "long" fft (where the speed up is significant - already 100x speed up for prime size ~ 50000).
Very nice, works like a charm for me!
Several comments: - the overhead is pretty significant (on my machine, bluestein transfrom is slower for prime size < 500) - it could be used as such for real transforms, but the overhead would be even more significant (there is no bluestein transform for real transforms, so one needs to re-rexpress real transforms in term of complex ones, multiplying the overhead by 2x). There are several alternatives to make things faster (Rader-like transform, as used by fftw), but I think this would be quite hard to do in python without significant slowdown, because the code cannot be vectorized. - one could also decide to provide a chirp-z transform, of which Bluestein transform is a special case. Maybe this is more adapted to scipy ?
This is just terminology, but according to Wikipedia the Bluestein transform is the chirp-z transform, which is a special case of the z-transform. Is that what you meant?
Right, I meant z-transform.
I'm guessing you are talking about code that allows you to use the Bluestein algorithm also for non-prime sizes where it makes sense, for example to speed up the second case of this:
In [24]: x = np.random.random(5879) # a large prime
This is for the real transform case, where the reindexing step requires to find a generator of Z/nZ. As for dealing with non prime sizes, I am not sure what to do: there is the speed issue, but the precision issue as well. I am currently doing some more thorough tests across various sizes to make sure bluestein transforms do not cause loss of precisions cheers, David
participants (3)
-
David Cournapeau
-
Neal Becker
-
Ralf Gommers