Terrible FPU performance

Mihai Badoiu mbadoiu at gmail.com
Tue Apr 26 15:40:36 CEST 2011


I am seeing terrible performance for floating-point multiplication when one
operand gets very close to zero.  I'm using Cython with the following code:

    cdef int i
    cdef double x = 1.0
    for 0 <= i < 10000000:
        x *= 0.8
        #x += 0.01
    print x

This code runs much slower (20+ times slower) than it does with the line
x += 0.01 uncommented.  I looked at the disassembled code and it looks
correct.  Moreover, it's just a few instructions, and when I write the same
loop in plain C (without Python on top), I get the same assembly, but it runs
much faster.  I've also tried using SSE, but I get exactly the same behavior.
My best guess so far is that Python sets up the FPU in a different state than
C does.
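
To see which values x actually takes in each variant, here is a rough
plain-Python sketch of the same loop.  It is only an illustration: the name
trace_loop and the reduced iteration count are my own choices, not part of the
Cython module, and it checks for subnormal (denormal) values rather than
measuring time.

```python
import sys

def trace_loop(n, add_term):
    """Repeat x *= 0.8 (optionally followed by x += 0.01) n times.

    Returns the final x and the 1-based iteration at which x first
    became subnormal (nonzero but below sys.float_info.min), or None.
    """
    x = 1.0
    first_subnormal = None
    for i in range(n):
        x *= 0.8
        if add_term:
            x += 0.01
        if first_subnormal is None and 0.0 < x < sys.float_info.min:
            first_subnormal = i + 1
    return x, first_subnormal

# 10000 iterations (an arbitrary, reduced count) already show the
# long-run behavior of both variants.
for add_term in (False, True):
    x, first = trace_loop(10000, add_term)
    print("add_term=%s  final x=%r  first subnormal at iteration %r"
          % (add_term, x, first))
```

Without the addition, x drops below the smallest normal double after a few
thousand iterations and then stays at a tiny nonzero subnormal value, so
nearly all of the 10000000 multiplications have a subnormal operand.  With the
addition, x settles near 0.05 and never leaves the normal range.  Arithmetic
on subnormals is much slower than on normal values on many x86 CPUs, and
settings such as flush-to-zero change that, so this is one thing to compare
between the Cython build and the C build.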

Any advice on how to solve this performance problem?
