[SciPy-User] Single precision FFT insufficiently accurate.
Anne Archibald
aarchiba at physics.mcgill.ca
Mon Jun 28 02:05:15 EDT 2010
On 28 June 2010 00:40, David Cournapeau <cournape at gmail.com> wrote:
> On Mon, Jun 28, 2010 at 12:16 PM, Anne Archibald
> <aarchiba at physics.mcgill.ca> wrote:
>
>>
>> I think falling back to double in this case is perfectly acceptable -
>> after all, any user of the FFT in general has to know that the
>> behaviour is severely data-dependent. In fact, since our FFT for those
>> sizes seems to be O(n**2), they will almost certainly find that speed
>> impels them to switch long before memory becomes an issue: the
>> smallest array where I can imagine a user caring about the usage of a
>> temporary double array is in the tens of millions of elements.
>
> Or if you run many 1-D FFTs on a 2-D array - a typical example is in
> audio processing, where each row would be a window of a few hundred
> samples, and you have as many rows as you can fit in memory.
I guess the question here is where we do the conversion - of course it
is easiest to do it outside all the FFT loops. But it seems that
iterating over all-but-one dimension of the array will often need to
copy the data anyway to get a contiguous 1-D array; I'd say that's the
natural place to do the upconversion when necessary. (Though, does
FFTPACK provide strided/iterated FFTs? My only compiled-language
experience is with FFTW, which does.)
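To make that concrete, here's a minimal sketch of the idea in pure
Python (the name fft_along_last_axis is mine, and a real
implementation would of course do the loop in compiled code):

import numpy as np

def fft_along_last_axis(a):
    """Apply 1-D FFTs along the last axis, upconverting row by row."""
    out = np.empty(a.shape, dtype=np.complex128)
    # Iterating over all-but-one dimension has to copy each row into
    # a contiguous 1-D buffer anyway, so the float32 -> float64
    # conversion rides along with that copy.
    for idx in np.ndindex(*a.shape[:-1]):
        row = np.ascontiguousarray(a[idx], dtype=np.float64)
        out[idx] = np.fft.fft(row)
    return out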
Failing that, I still think we're better off providing single-precision
FFTs for easy factorizations than not providing them at all; my basic
point is that the FFT's behaviour already depends very strongly on the
size of the input array, so this is not a new form of "data dependence".
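And to spell out the kind of fallback I have in mind (names
hypothetical; numpy's own fft always computes in double internally, so
the fast branch here is just a stand-in for a true float32 kernel):

import numpy as np

def _has_small_factors(n, primes=(2, 3, 5)):
    """True if n factors entirely into the given small primes."""
    for p in primes:
        while n % p == 0:
            n //= p
    return n == 1

def fft_single(a):
    """Single-precision FFT for easy sizes, double fallback otherwise."""
    a = np.asarray(a, dtype=np.float32)
    if _has_small_factors(a.shape[-1]):
        # Stand-in for a genuine single-precision code path.
        return np.fft.fft(a).astype(np.complex64)
    # Awkward factorization: upconvert, compute in double, downcast,
    # so the caller gets complex64 either way.
    return np.fft.fft(a.astype(np.float64)).astype(np.complex64)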
>> That said, even with FFTW3, which is pretty good about using the best
>> algorithm for your particular case, it often pays to pad rather than
>> use an awkward size (though the best padding is not necessarily
>> power-of-two, according to my time trials):
>> http://lighthouseinthesky.blogspot.com/2010/03/flops-and-fft.html
>> So weird-size FFTs don't completely take the burden of padding off the
>> user.
>
> Oh, definitely. Any FFT user should know that a power of two should
> be used whenever possible/feasible. But generally, if you care about
> memory, padding has a significant cost in some cases (like the one I
> mentioned above).
This fact, which I also "knew", turns out to be false. (For FFTW, at
least; I can't speak for FFTPACK.)
The performance numbers in the blog post I linked to show that padding
to the next larger power of two is substantially slower than padding
to a number with a more complex factorization, though less padding and
more complexity slows it back down a little. This is solely for the
FFT, and it's for quite a large FFT. Any subsequent processing will of
course also be affected by the size you pad it to.
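If you want to check this on your own machine, here's a quick-and-dirty
comparison (sizes chosen by me, and numpy.fft here rather than FFTW;
the actual numbers are machine- and library-dependent):

import timeit
import numpy as np

x = np.random.randn(60000)

for n in (65536,   # next power of two
          62500,   # 2**2 * 5**6, a 5-smooth size
          60000):  # original length, 2**5 * 3 * 5**4
    padded = np.zeros(n)
    padded[:x.size] = x
    t = timeit.timeit(lambda p=padded: np.fft.fft(p), number=100)
    print("n = %6d: %.3f ms per FFT" % (n, t / 100 * 1e3))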
Anyway, my point is: data size dependence is an unavoidable curse of
the FFT, and the "right" size to pad to is often not at all obvious.
Anne