curious FFFT timing

I have an application that needs to perform close 1/2 million FFT's (actually inverse FFTs) I was stymied by the time it was taking so I created a simple benchmark program to measure raw fft speed - hoping to extrapolate and see if the FFT algorithm was the time consuming portion of the algorithm. The result I obtained surprised me Inverse FFT ( 4096) Number of seconds 5.608000 Time per FFT 0.005608 Forward FFT ( 4096) Number of seconds 3.164000 Time per FFT 0.003164 Inverse FFT ( 4095) Number of seconds 8.222000 Time per FFT 0.008222 Forward FFT ( 4095) Number of seconds 5.468000 Time per FFT 0.005468 Inverse FFT ( 8192) Number of seconds 12.578000 Time per FFT 0.012578 Forward FFT ( 8192) Number of seconds 7.551000 Time per FFT 0.007551 Inverse FFTs were taking almost twice as long as forward FFTs ! Bottom line I found that in the inverse code the multiplication of 1/N at the end was the culprit. When I removed the /n from the line return _raw_fft(a, n, axis, fftpack.cffti, fftpack.cfftb, _fft_cache)/n the code timing was more along the line of what was expected. Any one want to speculate on the timing difference for the 4095 vice 4096 long transforms ? Since 4095 is not a power of two I would have expected a greater time difference (DFT vice FFT) N*N / N*log2(N) = N/log2(N) = 4096/12 = 341.33 which is >> than the 1.72 seen above Cheers Eric ------------------------------------ Code Snippet --------------------------------------------- import time from Numeric import * from FFT import * def f_fft(number,length): a = arange(float(length)) start_time = time.time() for i in xrange(number): b = fft(a) durat = time.time() - start_time durat_per = durat / number print "Forward FFT (%10d) Number of seconds %12.6f Time per FFT %12.6f" % (length,durat,durat_per) def i_fft(number,length): a = arange(float(length)) start_time = time.time() for i in xrange(number): b = inverse_fft(a) durat = time.time() - start_time durat_per = durat / number print "Inverse FFT (%10d) Number of seconds %12.6f Time per FFT %12.6f" % (length,durat,durat_per) i_fft(1000,4096) f_fft(1000,4096) i_fft(1000,4095) f_fft(1000,4095) i_fft(1000,8192) f_fft(1000,8192)

Hi Eric,
Any one want to speculate on the timing difference for the 4095 vice 4096 long transforms ? Since 4095 is not a power of two I would have expected a greater time difference (DFT vice FFT)
You are correct in that 4095 is not a power of two, but is is the product of only small prime factors: 3 * 3 * 5 * 7 * 13 = 4095 Since the FFT code implements a N*log_2(N) algorithm for numbers that contain only small prime factors, the difference in times is rather small. FFTs that have lengths that are powers-of-two tend to be more efficient in general (since the decimation routines are cleaner for this case). If you test with 4097 (17 * 241) it will be quite a bit slower I'd guess... Scott -- Scott M. Ransom Address: Harvard-Smithsonian CfA Phone: (617) 495-4142 60 Garden St. MS 10 email: ransom@cfa.harvard.edu Cambridge, MA 02138 GPG Fingerprint: 06A9 9553 78BE 16DB 407B FFCA 9BFA B6FF FFD3 2989

Scott, Right you are. My brain fried -- I forgot about the other "small prime" stuff Inverse FFT ( 4096) Number of seconds 5.508000 Time per FFT 0.005508 Forward FFT ( 4096) Number of seconds 3.225000 Time per FFT 0.003225 Inverse FFT ( 4097) Number of seconds 29.723000 Time per FFT 0.029723 Forward FFT ( 4097) Number of seconds 26.788000 Time per FFT 0.026788 Cheers Eric ----- Original Message ----- From: "Scott Ransom" <ransom@cfa.harvard.edu> To: "Eric Hagemann" <ehagemann@home.com> Cc: <Numpy-discussion@lists.sourceforge.net> Sent: Tuesday, April 03, 2001 9:05 PM Subject: Re: [Numpy-discussion] curious FFFT timing
Hi Eric,
Any one want to speculate on the timing difference for the 4095 vice 4096 long transforms ? Since 4095 is not a power of two I would have expected a greater time difference (DFT vice FFT)
You are correct in that 4095 is not a power of two, but is is the product of only small prime factors: 3 * 3 * 5 * 7 * 13 = 4095
Since the FFT code implements a N*log_2(N) algorithm for numbers that contain only small prime factors, the difference in times is rather small. FFTs that have lengths that are powers-of-two tend to be more efficient in general (since the decimation routines are cleaner for this case).
If you test with 4097 (17 * 241) it will be quite a bit slower I'd guess...
Scott
-- Scott M. Ransom Address: Harvard-Smithsonian CfA Phone: (617) 495-4142 60 Garden St. MS 10 email: ransom@cfa.harvard.edu Cambridge, MA 02138 GPG Fingerprint: 06A9 9553 78BE 16DB 407B FFCA 9BFA B6FF FFD3 2989
_______________________________________________ Numpy-discussion mailing list Numpy-discussion@lists.sourceforge.net http://lists.sourceforge.net/lists/listinfo/numpy-discussion
participants (2)
-
Eric Hagemann
-
Scott Ransom