I hope this is the right place to post this. A user on StackOverflow
<http://stackoverflow.com/questions/29170588/will-cython-speed-up-erf-calculations>
told me to report this.

I am trying to transition from MATLAB to Python. Most of my computational time
is spent calling erf on vectors with millions or billions of elements.
Unfortunately, scipy.special.erf() takes about 3 times as long as MATLAB’s
erf(). Is there anything that can be done to speed up SciPy’s erf()?

Check for yourself if you wish:

MATLAB:

    r = rand(1, 1e7);
    tic; erf(r); toc   % repeat this line a few times

Python:

    import numpy as np
    import scipy.special as sps
    r = np.random.rand(int(1e7))
    %timeit sps.erf(r)

Thanks!

Will Adler
PhD Candidate
Ma Lab <http://www.cns.nyu.edu/malab/>
Center for Neural Science
New York University
On 20.03.2015, 20:29, Will Adler wrote:
[clip]
> Is there anything that can be done to speed up SciPy’s erf()?
Possibly.

https://github.com/scipy/scipy/blob/master/scipy/special/cephes/ndtr.c#L483

The simplest thing would probably be just to write the Padé approximant in a
form the C compiler can inline. erf and erfc are also in C99, so glibc may
have a fast implementation.
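As a quick way to check the C99 route, one can time an element-wise call into
a compiled erf by vectorizing math.erf with numba. This is only a hedged
benchmarking sketch, not a proposed SciPy change: it assumes numba is installed
and supports math.erf in nopython mode (recent versions do), and whether the
compiled code ends up in the system libm is an implementation detail.

    import math
    import numpy as np
    from numba import vectorize

    # Assumption: numba is available and lowers math.erf to a native call,
    # so there is no per-element Python overhead.
    @vectorize(['float64(float64)'], target='parallel')
    def erf_libm(x):
        return math.erf(x)

    r = np.random.rand(int(1e7))
    y = erf_libm(r)   # time this against scipy.special.erf(r)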
On 20.03.2015 22:08, Pauli Virtanen wrote:
> On 20.03.2015, 20:29, Will Adler wrote:
> [clip]
>> Is there anything that can be done to speed up SciPy’s erf()?
>
> Possibly.
>
> https://github.com/scipy/scipy/blob/master/scipy/special/cephes/ndtr.c#L483
>
> The simplest thing would probably be just to write the Padé approximant
> in a form the C compiler can inline. erf and erfc are also in C99, so
> glibc may have a fast implementation.
Using glibc is unlikely to be faster, as they focus on correctness rather than
speed, though it’s worth a try.

The two 4-coefficient evaluations can be perfectly vectorized; it just needs a
rearrangement of the static coefficient tables, which should give a decent
speedup. Also, the isnan call could be turned into a builtin instead of the
library function call gcc/glibc currently emits. In total, I guess a 40-50%
improvement should be possible with this implementation.
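To illustrate how such a polynomial evaluation vectorizes over a whole array,
here is a rough NumPy sketch. It uses the Abramowitz & Stegun 7.1.26
approximation rather than the Cephes coefficients, so it only shows the data
layout, not the accuracy SciPy needs (the absolute error of this formula is on
the order of 1.5e-7):

    import numpy as np

    # Abramowitz & Stegun 7.1.26 coefficients (illustrative only; Cephes uses
    # a different, more accurate rational approximation).
    _p = 0.3275911
    _a = np.array([0.254829592, -0.284496736, 1.421413741,
                   -1.453152027, 1.061405429])

    def erf_approx(x):
        x = np.asarray(x, dtype=np.float64)
        sign = np.sign(x)
        ax = np.abs(x)
        t = 1.0 / (1.0 + _p * ax)
        # Horner evaluation of the degree-5 polynomial in t, vectorized over x
        poly = _a[4]
        for c in _a[3::-1]:
            poly = poly * t + c
        poly = poly * t
        return sign * (1.0 - poly * np.exp(-ax * ax))

The whole evaluation is a handful of fused array operations over the input,
which is the kind of rearrangement the static coefficient tables in ndtr.c
would need.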
I have not tested this, but I suspect that the MATLAB routine is using the erf
implementation from the Intel Math Kernel Library (MKL). There is a function
in MKL called vdErf that takes a vector of doubles and is likely tuned to the
hardware. This could be linked to NumPy with similar speed benefits.

-Travis

On Fri, Mar 20, 2015 at 5:02 PM, Julian Taylor
<jtaylor.debian@googlemail.com> wrote:
> On 20.03.2015 22:08, Pauli Virtanen wrote:
>> On 20.03.2015, 20:29, Will Adler wrote:
>> [clip]
>>> Is there anything that can be done to speed up SciPy’s erf()?
>>
>> Possibly.
>>
>> https://github.com/scipy/scipy/blob/master/scipy/special/cephes/ndtr.c#L483
>>
>> The simplest thing would probably be just to write the Padé approximant
>> in a form the C compiler can inline. erf and erfc are also in C99, so
>> glibc may have a fast implementation.
>
> Using glibc is unlikely to be faster, as they focus on correctness rather
> than speed, though it’s worth a try.
>
> The two 4-coefficient evaluations can be perfectly vectorized; it just needs
> a rearrangement of the static coefficient tables, which should give a decent
> speedup. Also, the isnan call could be turned into a builtin instead of the
> library function call gcc/glibc currently emits. In total, I guess a 40-50%
> improvement should be possible with this implementation.
--
Travis Oliphant
CEO
Continuum Analytics, Inc.
http://www.continuum.io
On 21.03.2015 at 05:13, Travis Oliphant <travis@continuum.io> wrote:

> I have not tested this, but I suspect that the MATLAB routine is using the
> erf implementation from the Intel Math Kernel Library (MKL). There is a
> function in MKL called vdErf that takes a vector of doubles and is likely
> tuned to the hardware. This could be linked to NumPy with similar speed
> benefits.
>
> -Travis
The uvml module (https://github.com/geggo/uvml) exposes the fast MKL/VML erf
implementation to numpy. Unfortunately, no binaries are available.

Gregor
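For anyone who wants to try this without building uvml, below is a minimal
ctypes sketch of calling VML's vdErf directly. It assumes an MKL runtime is
installed, that the single dynamic library is named libmkl_rt.so, and that the
default LP64 interface (32-bit MKL_INT) is in use; all of those depend on the
local MKL install, so treat this as a sketch rather than a drop-in solution.

    import ctypes
    import numpy as np

    # Assumption: MKL single dynamic runtime; adjust the name/path for your
    # install (e.g. mkl_rt.dll on Windows).
    mkl = ctypes.CDLL("libmkl_rt.so")

    def mkl_erf(x):
        # vdErf(const MKL_INT n, const double a[], double r[]);
        # assumes LP64, where MKL_INT is a 32-bit int.
        x = np.ascontiguousarray(x, dtype=np.float64)
        out = np.empty_like(x)
        mkl.vdErf(ctypes.c_int(x.size),
                  x.ctypes.data_as(ctypes.POINTER(ctypes.c_double)),
                  out.ctypes.data_as(ctypes.POINTER(ctypes.c_double)))
        return out

    r = np.random.rand(int(1e7))
    y = mkl_erf(r)   # compare timings against scipy.special.erf(r)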
Integrating vector math libraries into numpy is actually a GSoC topic this
year. As erf is a C99 function, it should probably also move to numpy.

Also, the inlining Pauli suggested works great; see:
https://github.com/scipy/scipy/pull/4653

On 21.03.2015 05:13, Travis Oliphant wrote:
> I have not tested this, but I suspect that the MATLAB routine is using the
> erf implementation from the Intel Math Kernel Library (MKL).
>
> There is a function in MKL called vdErf that takes a vector of doubles and
> is likely tuned to the hardware. This could be linked to NumPy with similar
> speed benefits.
>
> -Travis
>
> On Fri, Mar 20, 2015 at 5:02 PM, Julian Taylor
> <jtaylor.debian@googlemail.com> wrote:
>
>> On 20.03.2015 22:08, Pauli Virtanen wrote:
>>> On 20.03.2015, 20:29, Will Adler wrote:
>>> [clip]
>>>> Is there anything that can be done to speed up SciPy’s erf()?
>>>
>>> Possibly.
>>>
>>> https://github.com/scipy/scipy/blob/master/scipy/special/cephes/ndtr.c#L483
>>>
>>> The simplest thing would probably be just to write the Padé approximant
>>> in a form the C compiler can inline. erf and erfc are also in C99, so
>>> glibc may have a fast implementation.
>>
>> Using glibc is unlikely to be faster, as they focus on correctness rather
>> than speed, though it’s worth a try.
>>
>> The two 4-coefficient evaluations can be perfectly vectorized; it just
>> needs a rearrangement of the static coefficient tables, which should give
>> a decent speedup. Also, the isnan call could be turned into a builtin
>> instead of the library function call gcc/glibc currently emits. In total,
>> I guess a 40-50% improvement should be possible with this implementation.
>
> --
> Travis Oliphant
> CEO
> Continuum Analytics, Inc.
> http://www.continuum.io
participants (5)
- Gregor Thalhammer
- Julian Taylor
- Pauli Virtanen
- Travis Oliphant
- Will Adler