[Numpy-discussion] UFUNC_CHECK_STATUS cpu hog

Travis E. Oliphant oliphant at enthought.com
Sat Mar 1 22:24:04 EST 2008


Thomas Grill wrote:
> Hi all,
> I did some profiling on OS X/Intel 10.5 (numpy 1.0.4) and was
> surprised to find that calls to the system function feclearexcept were by
> far the biggest CPU hog, taking about 30% of the CPU in my case.
> Would it be possible to change UFUNC_CHECK_STATUS in ufuncobject.h in 
> a way that feclearexcept is only called when necessary (fpstatus != 
> 0), like in
>
> ufuncobject.h, line 292....
>
> #define UFUNC_CHECK_STATUS(ret) {                                       \
>     int fpstatus = (int) fetestexcept(FE_DIVBYZERO | FE_OVERFLOW |    \
>                       FE_UNDERFLOW | FE_INVALID);    \
>    if(__builtin_expect(fpstatus == 0, 1)) \

Why the use of __builtin_expect here instead of fpstatus == 0?

>        ret = 0; \
>    else { \
>        ret = ((FE_DIVBYZERO  & fpstatus) ? UFUNC_FPE_DIVIDEBYZERO : 0) \
>            | ((FE_OVERFLOW   & fpstatus) ? UFUNC_FPE_OVERFLOW : 0)    \
>            | ((FE_UNDERFLOW  & fpstatus) ? UFUNC_FPE_UNDERFLOW : 0) \
>            | ((FE_INVALID    & fpstatus) ? UFUNC_FPE_INVALID : 0);    \
>        (void) feclearexcept(FE_DIVBYZERO | FE_OVERFLOW |        \
>                     FE_UNDERFLOW | FE_INVALID);        \
>    } \
> }
I don't see a problem with this...

-Travis O.



