FFTS for numpy's FFTs (was: Re: Choosing between NumPy and SciPy functions)
On 28 Oct 2014 04:07, "Matthew Brett" <matthew.brett@gmail.com> wrote:
Hi,
On Mon, Oct 27, 2014 at 8:07 PM, Sturla Molden <sturla.molden@gmail.com>
wrote:
Sturla Molden <sturla.molden@gmail.com> wrote:
If we really need a kick-ass fast FFT we need to go to libraries like FFTW, Intel MKL or Apple's Accelerate Framework,
I should perhaps also mention FFTS here, which claim to be faster than FFTW and has a BSD licence:
Nice. And a funny New Zealand name too.
Is this an option for us? Aren't we a little behind the performance curve on FFT after we lost FFTW?
It's definitely attractive. Some potential issues that might need dealing with, based on a quick skim: - seems to have a hard requirement for a processor supporting SSE, AVX, or NEON. No fallback for old CPUs or other architectures. (I'm not even sure whether it has x86-32 support.) - no runtime CPU detection, e.g. SSE vs AVX appears to be a compile time decision - not sure if it can handle non-power-of-two problems at all, or at all efficiently. (FFTPACK isn't great here either but major regressions would be bad.) - not sure if it supports all the modes we care about (e.g. rfft) This stuff is all probably solveable though, so if someone has a hankering to make numpy (or scipy) fft dramatically faster then you should get in touch with the author and see what they think. -n
On Tue, 28 Oct 2014 04:28:37 +0000 Nathaniel Smith <njs@pobox.com> wrote:
It's definitely attractive. Some potential issues that might need dealing with, based on a quick skim:
In my tests, numpy's FFTPACK isn't that bad considering * (virtually) no extra overhead for installation * (virtually) no plan creation time * not that slower for each transformation Because the plan creation was taking ages with FFTw, numpy's FFTPACK was often faster (overall) Cheers, -- Jérôme Kieffer tel +33 476 882 445
On Tue, Oct 28, 2014 at 1:32 AM, Jerome Kieffer <Jerome.Kieffer@esrf.fr> wrote:
On Tue, 28 Oct 2014 04:28:37 +0000 Nathaniel Smith <njs@pobox.com> wrote:
It's definitely attractive. Some potential issues that might need dealing with, based on a quick skim:
In my tests, numpy's FFTPACK isn't that bad considering * (virtually) no extra overhead for installation * (virtually) no plan creation time * not that slower for each transformation
Because the plan creation was taking ages with FFTw, numpy's FFTPACK was often faster (overall)
Cheers,
Ondrej says that f90 fftpack (his mod) runs faster than fftw. The main thing missing from fftpack is the handling of transform sizes that are not products of 2,3,4,5. Chuck
On Tue, Oct 28, 2014 at 9:19 AM, Charles R Harris <charlesr.harris@gmail.com
wrote:
On Tue, Oct 28, 2014 at 1:32 AM, Jerome Kieffer <Jerome.Kieffer@esrf.fr> wrote:
On Tue, 28 Oct 2014 04:28:37 +0000 Nathaniel Smith <njs@pobox.com> wrote:
It's definitely attractive. Some potential issues that might need dealing with, based on a quick skim:
In my tests, numpy's FFTPACK isn't that bad considering * (virtually) no extra overhead for installation * (virtually) no plan creation time * not that slower for each transformation
Because the plan creation was taking ages with FFTw, numpy's FFTPACK was often faster (overall)
Cheers,
Ondrej says that f90 fftpack (his mod) runs faster than fftw.
I would be interested to see the benchmarks for this. The real issue with fftw (besides the license) is the need for plan computation, which are expensive (but are not needed for each transform). Handling this in a way that is user friendly while tweakable for advanced users is not easy, and IMO more appropriate for a separate package. The main thing missing from fftpack is the handling of transform sizes that
are not products of 2,3,4,5.
Strickly speaking, it is handled, just not through an FFT (it goes back to the brute force O(N**2)). I made some experiments with the Bluestein transform to handle prime transforms on fftpack, but the precision seemed to be an issue. Maybe I should revive this work (if I still have it somewhere). David
On 28/10/14 09:41, David Cournapeau wrote:
The real issue with fftw (besides the license) is the need for plan computation, which are expensive (but are not needed for each transform). Handling this in a way that is user friendly while tweakable for advanced users is not easy, and IMO more appropriate for a separate package.
Just on this, I like to think I've largely solved the issue with: https://github.com/hgomersall/pyFFTW If you have suggestions on how it can be improved, I'm all ears (there are a few things in the pipeline, like creating FFTW objects for different types of transform more explicit, which is likely to be the main difference for the next major version). Cheers, Henry
David Cournapeau <cournape@gmail.com> wrote:
The real issue with fftw (besides the license) is the need for plan computation, which are expensive (but are not needed for each transform).
This is not a problem if you thell FFTW to guess a plan instead of making measurements. FFTPACK needs to set up a look-up table too.
I made some experiments with the Bluestein transform to handle prime transforms on fftpack, but the precision seemed to be an issue. Maybe I should revive this work (if I still have it somewhere).
You have it in a branch on Github. Sturla
If I may 'hyjack' the discussion back to the meta-point: should we be having this discussion on the numpy mailing list at all? Perhaps the 'batteries included' philosophy made sense in the early days of numpy; but given that there are several fft libraries with their own pros and cons, and that most numpy projects will use none of them at all, why should numpy bundle any of them? To have a scipy.linalg and scipy.fft makes sense to me, although import pyfftw or import pyFFTPACK would arguably be better still. Just as in the case of linear algebra, those different libraries represent meaningful differences, and if the user wants to paper over those differences with a named import they are always free to do so themselves, explicitly. To be sure, the maintenance of quality fft libraries should be part of the numpy/scipy-stack in some way or another. But I would argue that the core thing that numpy should do is ndarrays alone. On Tue, Oct 28, 2014 at 11:11 AM, Sturla Molden <sturla.molden@gmail.com> wrote:
David Cournapeau <cournape@gmail.com> wrote:
The real issue with fftw (besides the license) is the need for plan computation, which are expensive (but are not needed for each transform).
This is not a problem if you thell FFTW to guess a plan instead of making measurements. FFTPACK needs to set up a look-up table too.
I made some experiments with the Bluestein transform to handle prime transforms on fftpack, but the precision seemed to be an issue. Maybe I should revive this work (if I still have it somewhere).
You have it in a branch on Github.
Sturla
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
On 28 Oct 2014 14:48, "Eelco Hoogendoorn" <hoogendoorn.eelco@gmail.com> wrote:
If I may 'hyjack' the discussion back to the meta-point:
should we be having this discussion on the numpy mailing list at all?
Perhaps the 'batteries included' philosophy made sense in the early days of numpy; but given that there are several fft libraries with their own
To have a scipy.linalg and scipy.fft makes sense to me, although import
Of course we should. pros and cons, and that most numpy projects will use none of them at all, why should numpy bundle any of them? Certainly there's a place for fancy 3rd-party fft libraries. But fft is such a basic algorithm that it'd be silly to ask people who just need a quick one-off fft to go evaluate a bunch of third-party libraries. For many users, downloading one of these libraries will take longer than just doing their Fourier transform with an O(N**2) algorithm :-). And besides that there's tons of existing code that uses np.fft. So np.fft will continue to exist, and given that it exists we should make it as good as we can. pyfftw or import pyFFTPACK would arguably be better still. Just as in the case of linear algebra, those different libraries represent meaningful differences, and if the user wants to paper over those differences with a named import they are always free to do so themselves, explicitly. To be sure, the maintenance of quality fft libraries should be part of the numpy/scipy-stack in some way or another. But I would argue that the core thing that numpy should do is ndarrays alone. According to some sort of abstract project planning aesthetics, perhaps. But I don't see how fractionating numpy into lots of projects would provide any benefit for users. (If we split numpy into 10 subprojects then probably 7 of them would never release, because we barely have the engineering to do release management now.) CS courses often teach that more modular = more better. That's because they're desperate to stop newbies from creating balls of mush, though, not because it's the whole truth :-). It's always true that an organized codebase is better than a ball of mush, but abstraction barriers, decoupling, etc. have real and important costs, and this needs to be taken into account. (See e.g. the Torvalds/Tenenbaum debate.) And in any case, this ship sailed a long time ago. -n
Eelco Hoogendoorn <hoogendoorn.eelco@gmail.com> wrote:
Perhaps the 'batteries included' philosophy made sense in the early days of numpy; but given that there are several fft libraries with their own pros and cons, and that most numpy projects will use none of them at all, why should numpy bundle any of them?
Because sometimes we just need to compute a DFT, just like we sometimes need to compute a sine or an exponential. It does that job perfectly well. It is not always about speed. Just typing np.fft.fft(x) is convinient. Sturla
My point isn't about speed; its about the scope of numpy. typing np.fft.fft isn't more or less convenient than using some other symbol from the scientific python stack. Numerical algorithms should be part of the stack, for sure; but should they be part of numpy? I think its cleaner to have them in a separate package. Id rather have us discuss how to facilitate the integration of as many possible fft libraries with numpy behind a maximally uniform interface, rather than having us debate which fft library is 'best'. On Tue, Oct 28, 2014 at 6:21 PM, Sturla Molden <sturla.molden@gmail.com> wrote:
Eelco Hoogendoorn <hoogendoorn.eelco@gmail.com> wrote:
Perhaps the 'batteries included' philosophy made sense in the early days of numpy; but given that there are several fft libraries with their own pros and cons, and that most numpy projects will use none of them at all, why should numpy bundle any of them?
Because sometimes we just need to compute a DFT, just like we sometimes need to compute a sine or an exponential. It does that job perfectly well. It is not always about speed. Just typing np.fft.fft(x) is convinient.
Sturla
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
On 29 October 2014 10:48, Eelco Hoogendoorn <hoogendoorn.eelco@gmail.com> wrote:
My point isn't about speed; its about the scope of numpy. typing np.fft.fft isn't more or less convenient than using some other symbol from the scientific python stack.
The problem is in distribution. For many users, installing a new library is not easy (computing cluster, company regulations...). And this assuming the alternative library is held to the same quality standards as Numpy. Another argument is that this should only be living in Scipy, that is, after all, quite standard; but it requires a FORTRAN compiler, that is quite a big dependency. /David.
On Wed, Oct 29, 2014 at 9:48 AM, Eelco Hoogendoorn < hoogendoorn.eelco@gmail.com> wrote:
My point isn't about speed; its about the scope of numpy. typing np.fft.fft isn't more or less convenient than using some other symbol from the scientific python stack.
Numerical algorithms should be part of the stack, for sure; but should they be part of numpy? I think its cleaner to have them in a separate package. Id rather have us discuss how to facilitate the integration of as many possible fft libraries with numpy behind a maximally uniform interface, rather than having us debate which fft library is 'best'.
I would agree if it were not already there, but removing it (like Blas/Lapack) is out of the question for backward compatibility reason. Too much code depends on it. David
On Tue, Oct 28, 2014 at 6:21 PM, Sturla Molden <sturla.molden@gmail.com> wrote:
Eelco Hoogendoorn <hoogendoorn.eelco@gmail.com> wrote:
Perhaps the 'batteries included' philosophy made sense in the early days of numpy; but given that there are several fft libraries with their own pros and cons, and that most numpy projects will use none of them at all, why should numpy bundle any of them?
Because sometimes we just need to compute a DFT, just like we sometimes need to compute a sine or an exponential. It does that job perfectly well. It is not always about speed. Just typing np.fft.fft(x) is convinient.
Sturla
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Id rather have us discuss how to facilitate the integration of as many possible fft libraries with numpy behind a maximally uniform interface, rather than having us debate which fft library is 'best'.
I agree with the above.
I would agree if it were not already there, but removing it (like Blas/Lapack) is out of the question for backward compatibility reason. Too much code depends on it.
And I definitely agree with that too. I think that numpy.fft should be left there in its current state (although perhaps as deprecated). Now scipy.fft should have a good generic algorithm as default, and easily allow for different implementations to be accessed through the same interface. Pierre-André On 10/29/2014 03:33 AM, David Cournapeau wrote:
On Wed, Oct 29, 2014 at 9:48 AM, Eelco Hoogendoorn <hoogendoorn.eelco@gmail.com <mailto:hoogendoorn.eelco@gmail.com>> wrote:
My point isn't about speed; its about the scope of numpy. typing np.fft.fft isn't more or less convenient than using some other symbol from the scientific python stack.
Numerical algorithms should be part of the stack, for sure; but should they be part of numpy? I think its cleaner to have them in a separate package. Id rather have us discuss how to facilitate the integration of as many possible fft libraries with numpy behind a maximally uniform interface, rather than having us debate which fft library is 'best'.
I would agree if it were not already there, but removing it (like Blas/Lapack) is out of the question for backward compatibility reason. Too much code depends on it.
David
On Tue, Oct 28, 2014 at 6:21 PM, Sturla Molden <sturla.molden@gmail.com <mailto:sturla.molden@gmail.com>> wrote:
Eelco Hoogendoorn <hoogendoorn.eelco@gmail.com <mailto:hoogendoorn.eelco@gmail.com>> wrote:
> Perhaps the 'batteries included' philosophy made sense in the early days of > numpy; but given that there are several fft libraries with their own pros > and cons, and that most numpy projects will use none of them at all, why > should numpy bundle any of them?
Because sometimes we just need to compute a DFT, just like we sometimes need to compute a sine or an exponential. It does that job perfectly well. It is not always about speed. Just typing np.fft.fft(x) is convinient.
Sturla
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org <mailto:NumPy-Discussion@scipy.org> http://mail.scipy.org/mailman/listinfo/numpy-discussion
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org <mailto:NumPy-Discussion@scipy.org> http://mail.scipy.org/mailman/listinfo/numpy-discussion
On 29.10.2014 18:03, Pierre-Andre Noel wrote:
Id rather have us discuss how to facilitate the integration of as many possible fft libraries with numpy behind a maximally uniform interface, rather than having us debate which fft library is 'best'.
I agree with the above.
Absolutely. I think the NumPy/Scipy interfaces are easy and convenient enough to serve as a front-end to different FFT codes.
I would agree if it were not already there, but removing it (like Blas/Lapack) is out of the question for backward compatibility reason. Too much code depends on it.
And I definitely agree with that too.
I think that numpy.fft should be left there in its current state (although perhaps as deprecated). Now scipy.fft should have a good generic algorithm as default, and easily allow for different implementations to be accessed through the same interface.
Definitely. My attempt at streamlining the use of pyfftw even further can be found here: https://github.com/aeberspaecher/transparent_pyfftw This small package does nothing more than to automatically load fftw wisdom on export and add a keyword that gives the number of threads to use to NumPy/Scipy style FFT calls. I think similar attempts could be made with other FFT libraries. The mission statement would be to map each library's interface to the simple and convenient SciPy/NumPy-style interface, and also to wisely choose default parameters (such as e.g. pyfftw's planner_effort). Also, I think it's obvious that a generic and easy-to-use implementation cannot deliver exactly the same performance as hand-tuned code, but anyway I see plenty room for improvement for SciPy/NumPy's FFT modules. Alex
On 29/10/14 18:23, Alexander Eberspächer wrote:
Definitely. My attempt at streamlining the use of pyfftw even further can be found here:
There could be an argument that this sort of capability should be added to the pyfftw package, as a package level config. Something like: import pyfftw pyfftw.default_threads = 4 import pyfftw.interfaces.numpy_fft as fft The wisdom code can be added at the package import level too, but that doesn't need anything more. Cheers, Henry
On 29.10.2014 19:40, Henry Gomersall wrote:
There could be an argument that this sort of capability should be added to the pyfftw package, as a package level config.
Something like:
import pyfftw pyfftw.default_threads = 4
I think that would be great, though probably slightly off-topic here.
import pyfftw.interfaces.numpy_fft as fft
The wisdom code can be added at the package import level too, but that doesn't need anything more.
If NumPy/SciPy provided interfaces to different FFT implementations, implementation specific routines (e.g. wisdom load/save or creation of byte-aligned empty arrays in the pyfftw case) could be made available through a subpackage, e.g. np.fft.implementation_specific. That subpackage then exposed routines specific to the implementation that lives below the simple interfaces. For implementation-specific configuration, perhaps a user-level configuration file or set of environment variables could be read on import of the specific implementation. At the very heart of allowing NumPy to use different FFT implementations could be a definition of an intermediate layer, much like LAPACK is for linear algebra. This probably would have to happen at the C-level. I'm only wildly speculating here as I don't have enough experience with interfaces to different FFT libraries, so I don't know whether the individual interfaces are close enough to be able to define a suitable "common interface". Alex
I think that numpy.fft should be left there in its current state (although perhaps as deprecated). Now scipy.fft should have a good generic algorithm as default, and easily allow for different implementations to be accessed through the same interface.
I also agree with the above. But I want to add that in this case it would be wise to include some (sophisticated) testing suite to ensure that all possible libraries implement the DFT with high accuracy. numpy's fftpack (or scipy's) has the advantage that it is old and well tested. FFTW also provides extensive benchmarks of speed and accuracy. Other libraries do not. Most users just want an fft function that works and not bother with details like numerical accuracy. When I encountered such an issue (https://github.com/hgomersall/pyFFTW/issues/51) it took me really a long time to track it down to the fft function. One remark to FFTS: does it implement double precision yet? The corresponding issue (https://github.com/anthonix/ffts/issues/24) seems to be still open but I did not look into it. If it does not it is not suited as a fftpack replacement (I hope I did not overlook some comment about that in the thread). Cheers Nils PS: although I am a long time user of numpy, I am fairly new to the list. So hello!
On 29/10/14 10:48, Eelco Hoogendoorn wrote:
Id rather have us discuss how to facilitate the integration of as many possible fft libraries with numpy behind a maximally uniform interface, rather than having us debate which fft library is 'best'.
I am happy with the NumPy interface. There are minor differences between the SciPy and NumPy FFT interfaces (e.g. for rfft, see below). Personally I prefer the NumPy interface because it makes it easier to map Fourier coeffs to frequencies. One thing we could do, without too much hassle, is to use FFTs from MKL or Accelerate (vDSP) if we link with these libararies for BLAS/LAPACK. MKL has an API compatible with FFTW, so FFTW and MKL can be supported with the same C code. FFTW and MKL also have a Fortran 77 API which we could wrap with f2py (no Fortran compiler are needed). It is actually possible to use the FFTs in FFTW and MKL from Python without any C coding at all. We just need to add a Python interface on top of the f2py wrappers, which is similar to what we do for scipy.linalg. The FFTs in Accelerate have a very simple C interface, but only support power-of-two array sizes, so we would need to use them with Bluestein's algorithm. Again, because of their simplicity, it is possible to wrap these FFT functions with f2py. We cannot bundle NumPy or SciPy binaries with FFTW due to GPL [*], but as I understand it we already have permission from Intel to bundle binary wheels linked with MKL. Accelerate is a system library, so that does not pose a license problem. [*] Actually, we could, but the binaries would be tainted with a viral license.
a = np.random.rand(8)
scipy.fftpack.rfft(a)[:,None]
array([[ 3.47756851], [-0.45869926], [-0.21730867], [-0.43763425], [-0.67338213], [-0.28799 ], [ 0.17321793], [-0.31514119]])
np.fft.rfft(a)[:,None]
array([[ 3.47756851+0.j ], [-0.45869926-0.21730867j], [-0.43763425-0.67338213j], [-0.28799000+0.17321793j], [-0.31514119+0.j ]]) Sturla
On 30 Oct 2014 03:58, "Sturla Molden" <sturla.molden@gmail.com> wrote: [...]
We cannot bundle NumPy or SciPy binaries with FFTW due to GPL [*], but as I understand it we already have permission from Intel to bundle binary wheels linked with MKL. Accelerate is a system library, so that does not pose a license problem.
[*] Actually, we could, but the binaries would be tainted with a viral license.
And binaries linked with MKL are tainted by a proprietary license... They have very similar effects, in both cases you can use the binary just fine for whatever you want, but if you redistribute it as part of a larger work, then you must (follow the terms of the GPL/follow the terms of Intel's license). -n
Nathaniel Smith <njs@pobox.com> wrote:
[*] Actually, we could, but the binaries would be tainted with a viral license.
And binaries linked with MKL are tainted by a proprietary license... They have very similar effects,
The MKL license is proprietary but not viral. Sturla
On Thu, Oct 30, 2014 at 11:11 AM, Sturla Molden <sturla.molden@gmail.com> wrote:
Nathaniel Smith <njs@pobox.com> wrote:
[*] Actually, we could, but the binaries would be tainted with a viral license.
And binaries linked with MKL are tainted by a proprietary license... They have very similar effects,
The MKL license is proprietary but not viral.
For our purposes, it has the same effect, though. As a project policy, we only want to put out official binaries that can be used in both proprietary and GPLed projects. Since proprietary licenses and GPL licenses are mutually incompatible, we cannot use components that are either proprietary or GPLed in our official binaries. -- Robert Kern
On 30 Oct 2014 11:12, "Sturla Molden" <sturla.molden@gmail.com> wrote:
Nathaniel Smith <njs@pobox.com> wrote:
[*] Actually, we could, but the binaries would be tainted with a viral license.
And binaries linked with MKL are tainted by a proprietary license...
They
have very similar effects,
The MKL license is proprietary but not viral.
If you like, but I think you are getting confused by the vividness of anti-GPL rhetoric. GPL and proprietary software are identical in that you have to pay some price if you want to legally redistribute derivative works (e.g. numpy + MKL/FFTW + other software). For proprietary software the price is money and other random more or less onerous conditions (e.g. anti-benchmarking and anti-reverse-engineering clauses are common). For GPL software the price is that you have to let people reuse your source code for free. That's literally all that "viral" means. Which of these prices you find more affordable will depend on your circumstances. Either way it's just something to take into account before redistributing "tainted" binaries. -n
On Thu, Oct 30, 2014 at 4:28 AM, Nathaniel Smith <njs@pobox.com> wrote:
On 30 Oct 2014 11:12, "Sturla Molden" <sturla.molden@gmail.com> wrote:
Nathaniel Smith <njs@pobox.com> wrote:
[*] Actually, we could, but the binaries would be tainted with a viral license.
And binaries linked with MKL are tainted by a proprietary license... They have very similar effects,
The MKL license is proprietary but not viral.
If you like, but I think you are getting confused by the vividness of anti-GPL rhetoric. GPL and proprietary software are identical in that you have to pay some price if you want to legally redistribute derivative works (e.g. numpy + MKL/FFTW + other software). For proprietary software the price is money and other random more or less onerous conditions (e.g. anti-benchmarking and anti-reverse-engineering clauses are common). For GPL software the price is that you have to let people reuse your source code for free. That's literally all that "viral" means.
I wrote a summary of the MKL license problems here: https://github.com/numpy/numpy/wiki/Numerical-software-on-Windows#blas--lapa... In summary, if you distribute something with the MKL you have to: * require your users to agree to a license forbidding them from reverse-engineering the MKL * indemnify Intel against being sued as a result of using MKL in your binaries I think the users are not allowed to further distribute any part of the MKL libraries, but I am happy to be corrected on that. Cheers, Matthew
On Thu, Oct 30, 2014 at 10:24 AM, Matthew Brett <matthew.brett@gmail.com> wrote:
On Thu, Oct 30, 2014 at 4:28 AM, Nathaniel Smith <njs@pobox.com> wrote:
On 30 Oct 2014 11:12, "Sturla Molden" <sturla.molden@gmail.com> wrote:
Nathaniel Smith <njs@pobox.com> wrote:
[*] Actually, we could, but the binaries would be tainted with a viral license.
And binaries linked with MKL are tainted by a proprietary license... They have very similar effects,
The MKL license is proprietary but not viral.
If you like, but I think you are getting confused by the vividness of anti-GPL rhetoric. GPL and proprietary software are identical in that you have to pay some price if you want to legally redistribute derivative works (e.g. numpy + MKL/FFTW + other software). For proprietary software the price is money and other random more or less onerous conditions (e.g. anti-benchmarking and anti-reverse-engineering clauses are common). For GPL software the price is that you have to let people reuse your source code for free. That's literally all that "viral" means.
I wrote a summary of the MKL license problems here:
https://github.com/numpy/numpy/wiki/Numerical-software-on-Windows#blas--lapa...
In summary, if you distribute something with the MKL you have to:
* require your users to agree to a license forbidding them from reverse-engineering the MKL * indemnify Intel against being sued as a result of using MKL in your binaries
Sorry - I should point out that this last 'indemnify' clause is "including attorney's fees". Meaning that, if someone sues Intel because of your software, you have to pay Intel's attorney's fees. Matthew
On 30/10/14 03:58, Sturla Molden wrote:
MKL has an API compatible with FFTW, so FFTW and MKL can be supported with the same C code. Compatible with big caveats:
Jerome Kieffer <Jerome.Kieffer@esrf.fr> wrote:
Because the plan creation was taking ages with FFTw, numpy's FFTPACK was often faster (overall)
Matlab switched from FFTPACK to FFTW because the latter was faster in general. If FFTW guesses a plan it does not take very long. Actual measurements can be slow, however, but those are not needed. Sturla
On 28 Oct 2014 07:32, "Jerome Kieffer" <Jerome.Kieffer@esrf.fr> wrote:
On Tue, 28 Oct 2014 04:28:37 +0000 Nathaniel Smith <njs@pobox.com> wrote:
It's definitely attractive. Some potential issues that might need
dealing
with, based on a quick skim:
In my tests, numpy's FFTPACK isn't that bad considering * (virtually) no extra overhead for installation * (virtually) no plan creation time * not that slower for each transformation
Well, this is what makes FFTS intriguing :-). It's BSD licensed, so we could distribute it by default like we do fftpack, it uses cache-oblivious algorithms so it has no planning step, and even without planning it benchmarks as faster than FFTW's most expensive planning mode (in the cases that FFTS supports, i.e. power-of-two transforms). The paper has lots of benchmark graphs, including measurements of setup time: http://anthonix.com/ffts/preprints/tsp2013.pdf -n
I On Tue, Oct 28, 2014 at 2:31 PM, Nathaniel Smith <njs@pobox.com> wrote:
On 28 Oct 2014 07:32, "Jerome Kieffer" <Jerome.Kieffer@esrf.fr> wrote:
On Tue, 28 Oct 2014 04:28:37 +0000 Nathaniel Smith <njs@pobox.com> wrote:
It's definitely attractive. Some potential issues that might need
dealing
with, based on a quick skim:
In my tests, numpy's FFTPACK isn't that bad considering * (virtually) no extra overhead for installation * (virtually) no plan creation time * not that slower for each transformation
Well, this is what makes FFTS intriguing :-). It's BSD licensed, so we could distribute it by default like we do fftpack, it uses cache-oblivious algorithms so it has no planning step, and even without planning it benchmarks as faster than FFTW's most expensive planning mode (in the cases that FFTS supports, i.e. power-of-two transforms).
The paper has lots of benchmark graphs, including measurements of setup time: http://anthonix.com/ffts/preprints/tsp2013.pdf
Nice. In this case, the solution may be to implement the Bluestein transform to deal with prime/near-prime numbers on top of FFTS. I did not look much, but it did not obviously support building on windows as well ? David
-n
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
On Tue, Oct 28, 2014 at 3:06 PM, David Cournapeau <cournape@gmail.com> wrote:
I
On Tue, Oct 28, 2014 at 2:31 PM, Nathaniel Smith <njs@pobox.com> wrote:
On 28 Oct 2014 07:32, "Jerome Kieffer" <Jerome.Kieffer@esrf.fr> wrote:
On Tue, 28 Oct 2014 04:28:37 +0000 Nathaniel Smith <njs@pobox.com> wrote:
It's definitely attractive. Some potential issues that might need
dealing
with, based on a quick skim:
In my tests, numpy's FFTPACK isn't that bad considering * (virtually) no extra overhead for installation * (virtually) no plan creation time * not that slower for each transformation
Well, this is what makes FFTS intriguing :-). It's BSD licensed, so we could distribute it by default like we do fftpack, it uses cache-oblivious algorithms so it has no planning step, and even without planning it benchmarks as faster than FFTW's most expensive planning mode (in the cases that FFTS supports, i.e. power-of-two transforms).
The paper has lots of benchmark graphs, including measurements of setup time: http://anthonix.com/ffts/preprints/tsp2013.pdf
Nice. In this case, the solution may be to implement the Bluestein transform to deal with prime/near-prime numbers on top of FFTS.
I did not look much, but it did not obviously support building on windows as well ?
Ok, I took a quick look at it, and it will be a significant effort to be able to make FFTS work at all with MSVC on windows: - the code is not C89 compatible - it uses code generation using POSIX library. One would need to port that part to using Win32 API as well. - the test suite looks really limited (roundtripping only). The codebase does not seem particularly well written either (but neither is FFTPACK to be fair). Nothing impossible (looks like Sony at least uses this code on windows: https://github.com/anthonix/ffts/issues/27#issuecomment-40204403), but not a 2 hours thing either. David
On 28/10/14 16:50, David Cournapeau wrote:
Nothing impossible (looks like Sony at least uses this code on windows: https://github.com/anthonix/ffts/issues/27#issuecomment-40204403), but not a 2 hours thing either.
One of the downsides of the BSD license :) Cheers, Daniele
On 2014-10-28 19:37:17, Daniele Nicolodi <daniele@grinta.net> wrote:
On 28/10/14 16:50, David Cournapeau wrote:
Nothing impossible (looks like Sony at least uses this code on windows: https://github.com/anthonix/ffts/issues/27#issuecomment-40204403), but not a 2 hours thing either.
One of the downsides of the BSD license :)
Perhaps one of the upsides, as they may be willing to contribute back if asked nicely. Stéfan
On 28/10/14 18:44, Stefan van der Walt wrote:
On 2014-10-28 19:37:17, Daniele Nicolodi <daniele@grinta.net> wrote:
On 28/10/14 16:50, David Cournapeau wrote:
Nothing impossible (looks like Sony at least uses this code on windows: https://github.com/anthonix/ffts/issues/27#issuecomment-40204403), but not a 2 hours thing either.
One of the downsides of the BSD license :)
Perhaps one of the upsides, as they may be willing to contribute back if asked nicely.
If it would be GPL or similar the would have to, and there would not be need to ask nicely. Cheers, Daniele
On 2014-10-28 19:55:57, Daniele Nicolodi <daniele@grinta.net> wrote:
On 28/10/14 18:44, Stefan van der Walt wrote:
On 2014-10-28 19:37:17, Daniele Nicolodi <daniele@grinta.net> wrote:
On 28/10/14 16:50, David Cournapeau wrote:
Nothing impossible (looks like Sony at least uses this code on windows: https://github.com/anthonix/ffts/issues/27#issuecomment-40204403), but not a 2 hours thing either.
One of the downsides of the BSD license :)
Perhaps one of the upsides, as they may be willing to contribute back if asked nicely.
If it would be GPL or similar the would have to, and there would not be need to ask nicely.
But then they would not have written the code to start off with, so that point is moot. Stéfan
On 28/10/14 04:28, Nathaniel Smith wrote:
- not sure if it can handle non-power-of-two problems at all, or at all efficiently. (FFTPACK isn't great here either but major regressions would be bad.)
From my reading, this seems to be the biggest issue with FFTS (from my reading as well) and where FFTW really wins. Having a faster algorithm used when it will work, with fallback to fftpack (or something else) is a good solution IMO. Henry
On Tue, Oct 28, 2014 at 5:28 AM, Nathaniel Smith <njs@pobox.com> wrote:
On 28 Oct 2014 04:07, "Matthew Brett" <matthew.brett@gmail.com> wrote:
Hi,
On Mon, Oct 27, 2014 at 8:07 PM, Sturla Molden <sturla.molden@gmail.com>
wrote:
Sturla Molden <sturla.molden@gmail.com> wrote:
If we really need a kick-ass fast FFT we need to go to libraries like FFTW, Intel MKL or Apple's Accelerate Framework,
I should perhaps also mention FFTS here, which claim to be faster than FFTW and has a BSD licence:
Nice. And a funny New Zealand name too.
Is this an option for us? Aren't we a little behind the performance curve on FFT after we lost FFTW?
It's definitely attractive. Some potential issues that might need dealing with, based on a quick skim:
- seems to have a hard requirement for a processor supporting SSE, AVX, or NEON. No fallback for old CPUs or other architectures. (I'm not even sure whether it has x86-32 support.)
- no runtime CPU detection, e.g. SSE vs AVX appears to be a compile time decision
- not sure if it can handle non-power-of-two problems at all, or at all efficiently. (FFTPACK isn't great here either but major regressions would be bad.)
- not sure if it supports all the modes we care about (e.g. rfft)
This stuff is all probably solveable though, so if someone has a hankering to make numpy (or scipy) fft dramatically faster then you should get in touch with the author and see what they think.
-n
I recently became aware of another C-library for doing FFTs (and other things): https://github.com/arrayfire/arrayfire They claim to have comparable FFT performance to MKL when run on a CPU (they also support running on the GPU but that is probably outside the scope of numpy or scipy). It used to be proprietary but now it is under a BSD-3-Clause license. It seems it supports non-power-of-2 FFT operations as well (although those are slower). I don't know much beyond that, but it is probably worth looking in to.
On Thu, Dec 11, 2014 at 10:41 AM, Todd <toddrjen@gmail.com> wrote:
On Tue, Oct 28, 2014 at 5:28 AM, Nathaniel Smith <njs@pobox.com> wrote:
On 28 Oct 2014 04:07, "Matthew Brett" <matthew.brett@gmail.com> wrote:
Hi,
On Mon, Oct 27, 2014 at 8:07 PM, Sturla Molden <sturla.molden@gmail.com>
Sturla Molden <sturla.molden@gmail.com> wrote:
If we really need a kick-ass fast FFT we need to go to libraries like FFTW, Intel MKL or Apple's Accelerate Framework,
I should perhaps also mention FFTS here, which claim to be faster
wrote: than FFTW
and has a BSD licence:
Nice. And a funny New Zealand name too.
Is this an option for us? Aren't we a little behind the performance curve on FFT after we lost FFTW?
It's definitely attractive. Some potential issues that might need dealing with, based on a quick skim:
- seems to have a hard requirement for a processor supporting SSE, AVX, or NEON. No fallback for old CPUs or other architectures. (I'm not even sure whether it has x86-32 support.)
- no runtime CPU detection, e.g. SSE vs AVX appears to be a compile time decision
- not sure if it can handle non-power-of-two problems at all, or at all efficiently. (FFTPACK isn't great here either but major regressions would be bad.)
- not sure if it supports all the modes we care about (e.g. rfft)
This stuff is all probably solveable though, so if someone has a hankering to make numpy (or scipy) fft dramatically faster then you should get in touch with the author and see what they think.
-n
I recently became aware of another C-library for doing FFTs (and other things):
https://github.com/arrayfire/arrayfire
They claim to have comparable FFT performance to MKL when run on a CPU (they also support running on the GPU but that is probably outside the scope of numpy or scipy). It used to be proprietary but now it is under a BSD-3-Clause license. It seems it supports non-power-of-2 FFT operations as well (although those are slower). I don't know much beyond that, but it is probably worth looking in
AFAICT the cpu backend is a FFTW wrapper. Eric
On Thu, Dec 11, 2014 at 3:53 PM, Eric Moore <ewm@redtetrahedron.org> wrote:
On Thu, Dec 11, 2014 at 10:41 AM, Todd <toddrjen@gmail.com> wrote:
I recently became aware of another C-library for doing FFTs (and other
things):
https://github.com/arrayfire/arrayfire
They claim to have comparable FFT performance to MKL when run on a CPU
(they also support running on the GPU but that is probably outside the scope of numpy or scipy). It used to be proprietary but now it is under a BSD-3-Clause license. It seems it supports non-power-of-2 FFT operations as well (although those are slower). I don't know much beyond that, but it is probably worth looking in
AFAICT the cpu backend is a FFTW wrapper.
Indeed. https://github.com/arrayfire/arrayfire/blob/devel/src/backend/cpu/fft.cpp#L1... -- Robert Kern
On Thu, Dec 11, 2014 at 4:55 PM, Robert Kern <robert.kern@gmail.com> wrote:
On Thu, Dec 11, 2014 at 3:53 PM, Eric Moore <ewm@redtetrahedron.org> wrote:
On Thu, Dec 11, 2014 at 10:41 AM, Todd <toddrjen@gmail.com> wrote:
I recently became aware of another C-library for doing FFTs (and other things):
https://github.com/arrayfire/arrayfire
They claim to have comparable FFT performance to MKL when run on a CPU (they also support running on the GPU but that is probably outside the scope of numpy or scipy). It used to be proprietary but now it is under a BSD-3-Clause license. It seems it supports non-power-of-2 FFT operations as well (although those are slower). I don't know much beyond that, but it is probably worth looking in
AFAICT the cpu backend is a FFTW wrapper.
Indeed. https://github.com/arrayfire/arrayfire/blob/devel/src/backend/cpu/fft.cpp#L1...
Oh, nevermind then.
participants (17)
-
Alexander Eberspächer
-
Charles R Harris
-
Daniele Nicolodi
-
David Cournapeau
-
Daπid
-
Eelco Hoogendoorn
-
Eric Moore
-
Henry Gomersall
-
Jerome Kieffer
-
Matthew Brett
-
Nathaniel Smith
-
Nils Becker
-
Pierre-Andre Noel
-
Robert Kern
-
Stefan van der Walt
-
Sturla Molden
-
Todd