[Numpy-discussion] UFUNC_CHECK_STATUS cpu hog
Travis E. Oliphant
oliphant at enthought.com
Sat Mar 1 22:24:04 EST 2008
Thomas Grill wrote:
> Hi all,
> I did some profiling on OS X/Intel 10.5 (numpy 1.0.4) and was
> surprised to find calls to the system function feclearexcept to be by
> far the biggest cpu hog, taking away about 30% of the cpu in my case.
> Would it be possible to change UFUNC_CHECK_STATUS in ufuncobject.h in
> a way that feclearexcept is only called when necessary (fpstatus !=
> 0), like in
>
> ufuncobject.h, line 292....
>
> #define UFUNC_CHECK_STATUS(ret) { \
> int fpstatus = (int) fetestexcept(FE_DIVBYZERO | FE_OVERFLOW | \
> FE_UNDERFLOW | FE_INVALID); \
> if(__builtin_expect(fpstatus,0)) \
Why the use of __builtin_expect here instead of fpstatus == 0?
> ret = 0; \
> else { \
> ret = ((FE_DIVBYZERO & fpstatus) ? UFUNC_FPE_DIVIDEBYZERO : 0) \
> | ((FE_OVERFLOW & fpstatus) ? UFUNC_FPE_OVERFLOW : 0) \
> | ((FE_UNDERFLOW & fpstatus) ? UFUNC_FPE_UNDERFLOW : 0) \
> | ((FE_INVALID & fpstatus) ? UFUNC_FPE_INVALID : 0); \
> (void) feclearexcept(FE_DIVBYZERO | FE_OVERFLOW | \
> FE_UNDERFLOW | FE_INVALID); \
> } \
> }
I don't see a problem with this...
-Travis O.