Terrible FPU performance
mbadoiu at gmail.com
Wed Apr 27 15:04:39 CEST 2011
I'm using an Intel Xeon Harpertown (E5450) and Python 2.6.4.
In the Cython code, when I use fpclassify in the slow case, I get 3
(FP_SUBNORMAL).
In the pure-C code, when I use fpclassify in the case that's supposed to be
slow but isn't, I get 2 (FP_ZERO).
Somehow, the FPU gives different results for exactly the same asm code.
On Tue, Apr 26, 2011 at 11:06 PM, David Cournapeau <cournape at gmail.com> wrote:
> On Wed, Apr 27, 2011 at 4:14 AM, Dan Goodman <dg.gmane at thesamovar.net>
> wrote:
> > Hi,
> > On 26/04/2011 15:40, Mihai Badoiu wrote:
> >> I have terrible performance for multiplication when one number gets very
> >> close to zero. I'm using cython by writing the following code:
> > This might be an issue with denormal numbers:
> > http://en.wikipedia.org/wiki/Denormal_number
> > I don't know much about them though, so I can't advise any further than
> > that...
> This indeed sounds like it. Mihai, which CPU are you using? Pentium 4
> CPUs are especially known to have terrible (read: an order of magnitude
> slower) performance with denormal numbers.
> There is unfortunately no simple way to know whether a float is
> denormal or not in Python, but since you are using Cython, if you are
> under POSIX you should be able to use fpclassify to check this.
> From there, if you see a difference between cython/python and C, it
> will be easier to debug.
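
For a rough cross-check from the Python side, without going through
fpclassify, here is a minimal sketch. It assumes IEEE 754 doubles and
Python 2.6 or later (where sys.float_info, math.isnan and math.isinf are
available). The names classify and halve_until_zero are made up for
illustration, and the string labels only loosely mirror the C FP_* categories;
they are not the integer values fpclassify returns.

```python
import math
import sys


def classify(x):
    """Return a label loosely mirroring C's fpclassify categories."""
    if math.isnan(x):
        return "nan"
    if math.isinf(x):
        return "infinite"
    if x == 0.0:
        return "zero"
    # sys.float_info.min is the smallest positive *normal* double, so any
    # nonzero magnitude below it is a subnormal (denormal) number.
    if abs(x) < sys.float_info.min:
        return "subnormal"
    return "normal"


def halve_until_zero(x):
    """Halve x until it underflows to zero, recording each value's class."""
    labels = []
    while x != 0.0:
        labels.append(classify(x))
        x *= 0.5
    labels.append(classify(x))  # the final 0.0
    return labels


if __name__ == "__main__":
    labels = halve_until_zero(1.0)
    print("%d normal, %d subnormal, last=%s"
          % (labels.count("normal"), labels.count("subnormal"), labels[-1]))
```

Running the same repeated multiplication in the Cython and pure-C versions and
comparing where the values stop being "normal" may help narrow down where the
two diverge.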