Bug in floating point multiplication
Oscar Benjamin
oscar.j.benjamin at gmail.com
Fri Jul 3 11:13:58 EDT 2015
On 2 July 2015 at 18:29, Jason Swails <jason.swails at gmail.com> wrote:
>
> As others have suggested, this is almost certainly a 32-bit vs. 64-bit
> issue. Consider the following C program:
>
> // maths.c
> #include <math.h>
> #include <stdio.h>
>
> int main() {
>     double x;
>     int i;
>     x = 1-pow(0.5, 53);
>
>     for (i = 1; i < 1000000; i++) {
>         if ((int)(i*x) == i) {
>             printf("%d\n", i);
>             break;
>         }
>     }
>
>     return 0;
> }
>
> For the most part, this should be as close to an exact transliteration of
> your Python code as possible.
>
> Here's what I get when I try compiling and running it on my 64-bit (Gentoo)
> Linux machine with 32-bit compatible libs:
>
> swails@batman ~/test $ gcc maths.c
> swails@batman ~/test $ ./a.out
> swails@batman ~/test $ gcc -m32 maths.c
> swails@batman ~/test $ ./a.out
> 2049
I was unable to reproduce this on my system: in both the 64-bit and
the 32-bit build the loop runs to completion without printing
anything. A look at the assembly generated by gcc shows that the two
builds do compute the product differently, though.
The loop in the 64-bit one (in the main function) looks like:
$ objdump -d a.out | less
...
400555: pxor %xmm0,%xmm0
400559: cvtsi2sdl -0xc(%rbp),%xmm0
40055e: mulsd -0x8(%rbp),%xmm0
400563: cvttsd2si %xmm0,%eax
400567: cmp -0xc(%rbp),%eax
40056a: jne 400582 <main+0x4c>
40056c: mov -0xc(%rbp),%eax
40056f: mov %eax,%esi
400571: mov $0x400624,%edi
400576: mov $0x0,%eax
40057b: callq 400410 <printf@plt>
400580: jmp 40058f <main+0x59>
400582: addl $0x1,-0xc(%rbp)
400586: cmpl $0xf423f,-0xc(%rbp)
40058d: jle 400555 <main+0x1f>
...
Whereas the 32-bit one looks like:
$ objdump -d a.out.32 | less
...
804843e: fildl -0x14(%ebp)
8048441: fmull -0x10(%ebp)
8048444: fnstcw -0x1a(%ebp)
8048447: movzwl -0x1a(%ebp),%eax
804844b: mov $0xc,%ah
804844d: mov %ax,-0x1c(%ebp)
8048451: fldcw -0x1c(%ebp)
8048454: fistpl -0x20(%ebp)
8048457: fldcw -0x1a(%ebp)
804845a: mov -0x20(%ebp),%eax
804845d: cmp -0x14(%ebp),%eax
8048460: jne 8048477 <main+0x5c>
8048462: sub $0x8,%esp
8048465: pushl -0x14(%ebp)
8048468: push $0x8048520
804846d: call 80482f0 <printf@plt>
8048472: add $0x10,%esp
8048475: jmp 8048484 <main+0x69>
8048477: addl $0x1,-0x14(%ebp)
804847b: cmpl $0xf423f,-0x14(%ebp)
8048482: jle 804843e <main+0x23>
...
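So the 64-bit one is using SSE instructions (mulsd, cvttsd2si) and the
32-bit one is using x87 (fmull, fistpl). x87 registers hold 80-bit
extended-precision values whereas SSE works directly on 64-bit
doubles, so that could explain the difference you see at the C level,
but I don't see it on this CPU (/proc/cpuinfo says Intel(R) Core(TM)
i5-3427U CPU @ 1.80GHz).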
--
Oscar