<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Thu, Dec 11, 2014 at 10:41 AM, Todd <span dir="ltr"><<a href="mailto:toddrjen@gmail.com" target="_blank">toddrjen@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><div>On Tue, Oct 28, 2014 at 5:28 AM, Nathaniel Smith <span dir="ltr"><<a href="mailto:njs@pobox.com" target="_blank">njs@pobox.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><p dir="ltr">On 28 Oct 2014 04:07, "Matthew Brett" <<a href="mailto:matthew.brett@gmail.com" target="_blank">matthew.brett@gmail.com</a>> wrote:<br>

><br>

> Hi,<br>

><br>

> On Mon, Oct 27, 2014 at 8:07 PM, Sturla Molden <<a href="mailto:sturla.molden@gmail.com" target="_blank">sturla.molden@gmail.com</a>> wrote:<br>

> > Sturla Molden <<a href="mailto:sturla.molden@gmail.com" target="_blank">sturla.molden@gmail.com</a>> wrote:<br>

> ><br>

> >> If we really need a<br>

> >> kick-ass fast FFT we need to go to libraries like FFTW, Intel MKL or<br>

> >> Apple's Accelerate Framework,<br>

> ><br>

> > I should perhaps also mention FFTS here, which claims to be faster than FFTW<br>

> > and has a BSD licence:<br>

> ><br>

> > <a href="http://anthonix.com/ffts/index.html" target="_blank">http://anthonix.com/ffts/index.html</a><br>

><br>

> Nice.  And a funny New Zealand name too.<br>

><br>

> Is this an option for us?  Aren't we a little behind the performance<br>

> curve on FFT after we lost FFTW?</p>

<p dir="ltr">It's definitely attractive. Some potential issues that might need dealing with, based on a quick skim:</p>

<p dir="ltr">- seems to have a hard requirement for a processor supporting SSE, AVX, or NEON. No fallback for old CPUs or other architectures. (I'm not even sure whether it has x86-32 support.)</p>

<p dir="ltr">- no runtime CPU detection, e.g. SSE vs AVX appears to be a compile-time decision</p>

<p dir="ltr">- not sure if it can handle non-power-of-two problems at all, or at all efficiently. (FFTPACK isn't great here either but major regressions would be bad.)</p>

<p dir="ltr">- not sure if it supports all the modes we care about (e.g. rfft)</p>
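
<p dir="ltr">For anyone evaluating a candidate library against the last two points, here is a minimal sketch of a reference check, assuming nothing about FFTS itself. The helper names (naive_dft, naive_rfft, is_power_of_two) are hypothetical and written only for illustration. It uses a slow O(n^2) DFT that works for any length, and shows that a real-input transform keeps only the n//2 + 1 non-redundant outputs, which is the shape numpy.fft.rfft returns.</p>

```python
import cmath


def naive_dft(x):
    """O(n^2) reference DFT; valid for any length, not just powers of two."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]


def naive_rfft(x):
    """Real-input transform: keep only the n//2 + 1 non-redundant outputs."""
    return naive_dft(x)[: len(x) // 2 + 1]


def is_power_of_two(n):
    """True for 1, 2, 4, 8, ...; used to flag sizes a library may not support."""
    return n > 0 and (n & (n - 1)) == 0


# Hypothetical test signal of length 6, deliberately not a power of two.
signal = [0.0, 1.0, 0.0, -1.0, 0.5, 0.25]
spectrum = naive_rfft(signal)
```

<p dir="ltr">Comparing a fast library's output to a reference like this over a range of sizes (powers of two, primes, odd composites) would be one way to find out which lengths and modes it really handles.</p>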

<p dir="ltr">This stuff is all probably solvable though, so if someone has a hankering to make numpy (or scipy) fft dramatically faster then you should get in touch with the author and see what they think.</p><span><font color="#888888">

<p dir="ltr">-n</p></font></span></blockquote><div><br></div></div></div><div>I recently became aware of another C library for doing FFTs (and other things):<br><br><a href="https://github.com/arrayfire/arrayfire" target="_blank">https://github.com/arrayfire/arrayfire</a><br><br></div><div>They claim to have comparable FFT performance to MKL when run on a CPU (they also support running on the GPU but that is probably outside the scope of numpy or scipy).  It used to be proprietary but now it is under a BSD-3-Clause license.  It seems it supports non-power-of-2 FFT operations as well (although those are slower).  I don't know much beyond that, but it is probably worth looking into.<br></div></div></div></div></blockquote><div><br><br><div>AFAICT the CPU backend is an FFTW wrapper.<br><br></div>Eric <br></div></div><br></div></div>