[Numpy-discussion] UFUNC_CHECK_STATUS cpu hog

Thomas Grill grrrr.org at gmail.com
Sat Mar 1 22:32:25 EST 2008


On 02.03.2008 at 04:24, Travis E. Oliphant wrote:

> Thomas Grill wrote:
>> Hi all,
>> I did some profiling on OS X/Intel 10.5 (numpy 1.0.4) and was
>> surprised to find calls to the system function feclearexcept to be by
>> far the biggest CPU hog, taking about 30% of the CPU time in my case.
>> Would it be possible to change UFUNC_CHECK_STATUS in ufuncobject.h in
>> a way that feclearexcept is only called when necessary (fpstatus !=
>> 0), like in
>>
>> ufuncobject.h, line 292....
>>
>> #define UFUNC_CHECK_STATUS(ret) {                                    \
>>    int fpstatus = (int) fetestexcept(FE_DIVBYZERO | FE_OVERFLOW |     \
>>                      FE_UNDERFLOW | FE_INVALID);                      \
>>   if(__builtin_expect(fpstatus,0)) \
>
> Why the use of __builtin_expect here instead of fpstatus == 0?

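It's a branch hint for gcc: fpstatus is very likely to be 0, so the
branch that calls feclearexcept is marked as unlikely to be taken.
If portability to older gcc versions is important, a plain test of
fpstatus (fpstatus != 0 in this position) is a better choice.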


greetings,
Thomas

--
Thomas Grill
http://grrrr.org

