On Thu, Dec 11, 2014 at 10:41 AM, Todd <toddrjen@gmail.com> wrote:
On Tue, Oct 28, 2014 at 5:28 AM, Nathaniel Smith <njs@pobox.com> wrote:
On 28 Oct 2014 04:07, "Matthew Brett" <matthew.brett@gmail.com> wrote:
Hi,
On Mon, Oct 27, 2014 at 8:07 PM, Sturla Molden <sturla.molden@gmail.com> wrote:
Sturla Molden <sturla.molden@gmail.com> wrote:
If we really need a kick-ass fast FFT we need to go to libraries like FFTW, Intel MKL or Apple's Accelerate Framework.
I should perhaps also mention FFTS here, which claims to be faster than FFTW and has a BSD licence:
Nice. And a funny New Zealand name too.
Is this an option for us? Aren't we a little behind the performance curve on FFT after we lost FFTW?
It's definitely attractive. Some potential issues that might need dealing with, based on a quick skim:
- seems to have a hard requirement for a processor supporting SSE, AVX, or NEON. No fallback for old CPUs or other architectures. (I'm not even sure whether it has x86-32 support.)
- no runtime CPU detection; e.g. SSE vs. AVX appears to be a compile-time decision
- not sure if it can handle non-power-of-two problems at all, or whether it handles them efficiently. (FFTPACK isn't great here either, but major regressions would be bad.)
- not sure if it supports all the modes we care about (e.g. rfft; see the sketch below)
This stuff is all probably solvable, though, so if someone has a hankering to make numpy's (or scipy's) fft dramatically faster, they should get in touch with the author and see what they think.
-n
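For concreteness, here is a minimal sketch of the modes and sizes that list refers to, using only the existing numpy.fft API (nothing FFTS-specific; the array lengths are arbitrary illustration values):

import numpy as np

x_pow2 = np.random.randn(1024)   # power-of-two length: the easy case everywhere
x_odd  = np.random.randn(1000)   # 1000 = 2**3 * 5**3: not a power of two

# Full complex transform.
X = np.fft.fft(x_pow2)

# Real-input transform: returns only the non-redundant half of the spectrum.
R_pow2 = np.fft.rfft(x_pow2)     # length 513 = 1024//2 + 1
R_odd  = np.fft.rfft(x_odd)      # length 501 = 1000//2 + 1

# Round-trip check that the inverse real transform recovers the input.
assert np.allclose(np.fft.irfft(R_pow2, n=1024), x_pow2)
assert np.allclose(np.fft.irfft(R_odd,  n=1000), x_odd)

Any FFTS-backed numpy.fft would need to cover at least these cases, or fall back to FFTPACK where it can't.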
I recently became aware of another C library for doing FFTs (and other things):
https://github.com/arrayfire/arrayfire
They claim FFT performance comparable to MKL when run on the CPU (they also support running on the GPU, but that is probably outside the scope of numpy or scipy). It used to be proprietary, but it is now under a BSD-3-Clause license. It seems to support non-power-of-2 FFT operations as well (although those are slower). I don't know much beyond that, but it is probably worth looking into.
AFAICT the CPU backend is an FFTW wrapper.
Eric
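To make the non-power-of-2 point concrete, here is a rough timing sketch against the FFTPACK-based numpy.fft we ship today (no ArrayFire, MKL or FFTW involved; the sizes and repeat count are arbitrary):

import timeit
import numpy as np

for n in (4096,    # 2**12: the fast path for every FFT library
          4100,    # 2**2 * 5**2 * 41: composite but not a power of two
          4099):   # prime: the slow case for FFTPACK-style codes (direct O(n**2) fallback)
    x = np.random.randn(n)
    t = timeit.timeit(lambda: np.fft.fft(x), number=200)
    print("n = %4d: %.4f s for 200 transforms" % (n, t))

As noted earlier in the thread, FFTPACK isn't great on the composite and prime sizes either, so the practical requirement for any replacement is mostly to avoid major regressions there.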