[Numpy-discussion] UFUNC_CHECK_STATUS cpu hog
Thomas Grill
grrrr.org at gmail.com
Sat Mar 1 22:32:25 EST 2008
On 02.03.2008 at 04:24, Travis E. Oliphant wrote:
> Thomas Grill wrote:
>> Hi all,
>> I did some profiling on OS X/Intel 10.5 (numpy 1.0.4) and was
>> surprised to find that calls to the system function feclearexcept are
>> by far the biggest cpu hog, taking about 30% of the cpu in my case.
>> Would it be possible to change UFUNC_CHECK_STATUS in ufuncobject.h in
>> a way that feclearexcept is only called when necessary (fpstatus !=
>> 0), like in
>>
>> ufuncobject.h, line 292....
>>
>> #define UFUNC_CHECK_STATUS(ret) { \
>>     int fpstatus = (int) fetestexcept(FE_DIVBYZERO | FE_OVERFLOW | \
>>                                       FE_UNDERFLOW | FE_INVALID); \
>>     if(__builtin_expect(fpstatus,0)) \
>
> Why the use of __builtin_expect here instead of fpstatus == 0?
It's a branch hint for gcc: fpstatus is very likely to be 0, so the
branch is marked as unlikely to be taken.
If portability to older gcc versions is important, a plain comparison
(fpstatus != 0) is a better choice.
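
If both the hint and portability are wanted, one option would be to hide
the builtin behind a small wrapper macro. What follows is only a minimal
sketch: UNLIKELY and fp_slow_path_needed are hypothetical names that do
not exist in numpy, and the __GNUC__ >= 3 test is a conservative
assumption about which compilers provide __builtin_expect.

```c
#include <fenv.h>   /* FE_* constants; the status word would come from fetestexcept() */

/* Hypothetical wrapper: use the gcc branch hint where it is available,
   otherwise fall back to the plain expression. */
#if defined(__GNUC__) && (__GNUC__ >= 3)
#  define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#  define UNLIKELY(x) (x)
#endif

/* Hypothetical helper: returns 1 when the rare slow path (clearing the
   flags and reporting the error) has to run, 0 otherwise. */
static int fp_slow_path_needed(int fpstatus)
{
    if (UNLIKELY(fpstatus != 0)) {
        return 1;   /* some exception flag is set */
    }
    return 0;       /* common case: nothing to clear */
}
```

Inside the macro, the call to feclearexcept would then sit in the branch
guarded by UNLIKELY(fpstatus != 0), so it is skipped in the common case.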
greetings,
Thomas
--
Thomas Grill
http://grrrr.org