Terrible FPU performance
Mihai Badoiu
mbadoiu at gmail.com
Tue Apr 26 09:40:36 EDT 2011
Hi,
I'm seeing terrible multiplication performance when one operand gets very
close to zero. I'm using Cython, with the following code:
    cdef int i
    cdef double x = 1.0
    for 0 <= i < 10000000:
        x *= 0.8
        #x += 0.01
    print x
This code runs much, much slower (20+ times slower) with the line x += 0.01
commented out, as shown, so that x decays toward zero. I looked at the
disassembled code and it looks correct.
Moreover, the loop is only a few lines, and when I write the same thing in
plain C (without Python on top) I get the same machine code, yet it runs much
faster. I've also tried using SSE, but I get exactly the same behavior. The
best explanation I have so far is that Python sets up the FPU in a different
state than C does.
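For reference, here is a minimal sketch in plain Python (not the Cython
module, and not a timing test) of where x ends up in each variant. It assumes
IEEE 754 doubles with the default round-to-nearest mode; the helper name
decay and the shorter iteration count are only for illustration. Without the
addition, x never reaches exactly 0.0 but gets stuck at a subnormal
(denormal) value, and arithmetic on subnormals is known to be much slower on
many CPUs unless flush-to-zero is enabled, so this may be related to what I'm
seeing.

```python
import sys


def decay(iterations, add_offset):
    """Repeat x *= 0.8, optionally followed by x += 0.01, and return x."""
    x = 1.0
    for _ in range(iterations):
        x *= 0.8
        if add_offset:
            x += 0.01
    return x


# Illustrative iteration count: far fewer than 10000000, but enough for
# both variants to settle on their final value.
tiny = decay(10000, add_offset=False)
steady = decay(10000, add_offset=True)

# Smallest positive *normal* double; anything positive below it is subnormal.
smallest_normal = sys.float_info.min

print("without += 0.01:", tiny)
print("  is subnormal: ", 0.0 < tiny < smallest_normal)
print("  is stuck:     ", tiny * 0.8 == tiny)
print("with += 0.01:   ", steady)
```

If the first variant reports a stuck subnormal value, then every remaining
iteration of the loop is multiplying a subnormal number.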
Any advice on how to solve this performance problem?
thanks!