David M. Cooke wrote:
The problem is that the plan depends on the input arrays! Caching it won't help with Python, unless you can guarantee that the same arrays are passed to successive calls.
I know that, but it really makes no sense to use fftw3 the way it is used now... For moderate sizes, it is more than 10 times slower!
Getting around that will mean digging into the guru interface, I think (ugh).
I tried a dirty hack using the function fftw_execute_dft, which executes a given plan on different arrays, provided they have the same properties (size, strides, alignment) as the ones the plan was created with. The problem is that, because of the obfuscated way fftpack is coded right now, it is difficult to track what is going on; I have a small test program which calls zfft directly from the module _fftpack.so, and running it under valgrind shows no problem... So there is something going on in fftpack that I don't understand.

The other obvious approach is to copy the content of the input array into a cached buffer, compute the fft on that buffer, and copy the result back. This is stupid, but I think it is still better than the current situation. I implemented this solution; the speed is much better, and the tests succeed.

Do you know how I am supposed to build a patch (I am not familiar with patch...)?

David