On Fri, Aug 18, 2006 at 07:01:17PM +0900, David Cournapeau wrote:
David M. Cooke wrote:
On Fri, Aug 18, 2006 at 05:32:14PM +0900, David Cournapeau wrote:
Hi there,
I noticed recently that when using the fft module of scipy, it is much slower (5-10 times) than numpy for complex inputs (only in the 1d case) when linked against fftw3. This problem is reported in scipy ticket #1: http://projects.scipy.org/scipy/scipy/ticket/1
I am not sure, because the code is a bit difficult to read, but it looks like in the case of complex input + fftw3, the plan is recomputed for each call to zfft (file zfft.c), whereas in the real case, or in the complex case with fftw2, the function drfft (file drfft.c), called from zrfft (file zrfft.c), uses a cached plan. I am trying to see how the caching is done, but I am not sure I will have the time to make it work for fftw3.
Well, for fftw3 it uses FFTW_ESTIMATE for the plan. So it does a cheap estimate of what it needs.

Well, it depends what you mean by cheap. Compared to FFTW_MEASURE, yes. But compared to pre-computing the plan once and then doing multiple transforms, it is not cheap: executing a transform is negligible compared to computing its plan!
The only difference between the cached and non-cached versions is that in the non-cached case the plan is recomputed on every call (as scipy.fft does now when linked with fftw3).
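To make the cached/non-cached distinction concrete, here is a minimal sketch of the caching idea, in the spirit of what drfft.c does: keep recently used plans keyed by transform length, so repeated transforms of the same size pay the planning cost once. All identifiers are illustrative, not scipy's actual ones; make_plan() stands in for the expensive fftw planner (fftw_plan_dft_1d with FFTW_ESTIMATE or FFTW_MEASURE).

```c
#include <stdlib.h>

#define PLAN_CACHE_SIZE 4

struct plan_entry {
    int n;       /* transform length the plan was built for */
    void *plan;  /* would be an fftw_plan in the real code */
};

static struct plan_entry plan_cache[PLAN_CACHE_SIZE];
int plans_computed = 0;  /* number of times the planner actually ran */

/* Stand-in for the expensive planner call. */
static void *make_plan(int n)
{
    (void)n;
    plans_computed++;
    return malloc(1);  /* dummy handle */
}

/* Return a plan for length n, running the planner only on a cache miss. */
void *get_cached_plan(int n)
{
    int i, victim = 0;
    for (i = 0; i < PLAN_CACHE_SIZE; i++) {
        if (plan_cache[i].plan && plan_cache[i].n == n)
            return plan_cache[i].plan;   /* hit: reuse the existing plan */
        if (!plan_cache[i].plan)
            victim = i;                  /* remember an empty slot */
    }
    /* miss: fill an empty slot, or evict (a real cache would use LRU) */
    if (plan_cache[victim].plan)
        free(plan_cache[victim].plan);   /* fftw_destroy_plan in real code */
    plan_cache[victim].n = n;
    plan_cache[victim].plan = make_plan(n);
    return plan_cache[victim].plan;
}
```

With this in place, a loop that transforms thousand-point arrays over and over runs the planner once; the non-cached version pays the planning cost on every single call, which is where the 5-10x slowdown would come from.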
The problem is that the plan depends on the input arrays! Caching it won't help with Python, unless you can guarantee that the same arrays are passed to successive calls. Getting around that will mean digging into the guru interface, I think (ugh). I'll have a clearer idea of what we can and cannot do once I dig into fftw3.

--
David M. Cooke
http://arbutus.physics.mcmaster.ca/dmc/
cookedm@physics.mcmaster.ca
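A sketch of why the naive cache breaks with fftw3, and what the cache key would have to contain: an fftw3 plan is only valid for the buffers it was created with, so reuse is safe only when the size, the direction, and the actual array addresses all match. The identifiers below are illustrative, not fftw's API; the real workaround would go through fftw3's guru interface, as suggested above.

```c
#include <stdlib.h>

struct plan_slot {
    int n, sign;     /* transform length and direction */
    void *in, *out;  /* the buffers the plan was created for */
    void *plan;      /* would be an fftw_plan */
    int valid;
};

static struct plan_slot slot;  /* one-entry cache, for brevity */
int replans = 0;               /* counts planner invocations */

void *plan_for(int n, int sign, void *in, void *out)
{
    if (slot.valid && slot.n == n && slot.sign == sign &&
        slot.in == in && slot.out == out)
        return slot.plan;   /* same arrays as last call: safe to reuse */

    /* Different arrays -- the common case from Python, where temporary
     * arrays come and go -- so the old plan cannot be applied; re-plan. */
    if (slot.valid)
        free(slot.plan);    /* fftw_destroy_plan in the real code */
    replans++;
    slot.n = n; slot.sign = sign; slot.in = in; slot.out = out;
    slot.plan = malloc(1);  /* fftw_plan_dft_1d(n, in, out, sign, ...) here */
    slot.valid = 1;
    return slot.plan;
}
```

Since Python hands a fresh temporary array to almost every call, the pointer comparison fails nearly every time and the cache degenerates to re-planning per call, which is exactly the behaviour reported in the ticket.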