Bug in floating point multiplication
Jason Swails
jason.swails at gmail.com
Fri Jul 3 21:12:39 EDT 2015
On Fri, Jul 3, 2015 at 11:13 AM, Oscar Benjamin <oscar.j.benjamin at gmail.com>
wrote:
> On 2 July 2015 at 18:29, Jason Swails <jason.swails at gmail.com> wrote:
> >
> > As others have suggested, this is almost certainly a 32-bit vs. 64-bit
> > issue. Consider the following C program:
> >
> > // maths.c
> > #include <math.h>
> > #include <stdio.h>
> >
> > int main() {
> >     double x;
> >     int i;
> >     x = 1-pow(0.5, 53);
> >
> >     for (i = 1; i < 1000000; i++) {
> >         if ((int)(i*x) == i) {
> >             printf("%d\n", i);
> >             break;
> >         }
> >     }
> >
> >     return 0;
> > }
> >
> > For the most part, this should be as close to an exact transliteration of
> > your Python code as possible.
> >
> > Here's what I get when I try compiling and running it on my 64-bit (Gentoo)
> > Linux machine with 32-bit compatible libs:
> >
> > swails@batman ~/test $ gcc maths.c
> > swails@batman ~/test $ ./a.out
> > swails@batman ~/test $ gcc -m32 maths.c
> > swails@batman ~/test $ ./a.out
> > 2049
>
> I was unable to reproduce this on my system. In both cases the loops
> run to completion. A look at the assembly generated by gcc shows that
> something different goes on there though.
>
> The loop in the 64 bit one (in the main function) looks like:
>
> $ objdump -d a.out | less
> ...
> 400555: pxor %xmm0,%xmm0
> 400559: cvtsi2sdl -0xc(%rbp),%xmm0
> 40055e: mulsd -0x8(%rbp),%xmm0
> 400563: cvttsd2si %xmm0,%eax
> 400567: cmp -0xc(%rbp),%eax
> 40056a: jne 400582 <main+0x4c>
> 40056c: mov -0xc(%rbp),%eax
> 40056f: mov %eax,%esi
> 400571: mov $0x400624,%edi
> 400576: mov $0x0,%eax
> 40057b: callq 400410 <printf@plt>
> 400580: jmp 40058f <main+0x59>
> 400582: addl $0x1,-0xc(%rbp)
> 400586: cmpl $0xf423f,-0xc(%rbp)
> 40058d: jle 400555 <main+0x1f>
> ...
>
> Whereas the 32 bit one looks like:
>
> $ objdump -d a.out.32 | less
> ...
> 804843e: fildl -0x14(%ebp)
> 8048441: fmull -0x10(%ebp)
> 8048444: fnstcw -0x1a(%ebp)
> 8048447: movzwl -0x1a(%ebp),%eax
> 804844b: mov $0xc,%ah
> 804844d: mov %ax,-0x1c(%ebp)
> 8048451: fldcw -0x1c(%ebp)
> 8048454: fistpl -0x20(%ebp)
> 8048457: fldcw -0x1a(%ebp)
> 804845a: mov -0x20(%ebp),%eax
> 804845d: cmp -0x14(%ebp),%eax
> 8048460: jne 8048477 <main+0x5c>
> 8048462: sub $0x8,%esp
> 8048465: pushl -0x14(%ebp)
> 8048468: push $0x8048520
> 804846d: call 80482f0 <printf@plt>
> 8048472: add $0x10,%esp
> 8048475: jmp 8048484 <main+0x69>
> 8048477: addl $0x1,-0x14(%ebp)
> 804847b: cmpl $0xf423f,-0x14(%ebp)
> 8048482: jle 804843e <main+0x23>
> ...
>
> So the 64 bit one is using SSE instructions and the 32-bit one is
> using x87. That could explain the difference you see at the C level
> but I don't see it on this CPU (/proc/cpuinfo says Intel(R) Core(TM)
> i5-3427U CPU @ 1.80GHz).
>
Hmm. Well, that could explain why you don't get the same results as me.
My CPU is an AMD FX(tm)-6100 Six-Core Processor (from /proc/cpuinfo). My
objdump looks the same as yours for the 64-bit version, but for 32-bit it
looks like:
...
804843a: db 44 24 14 fildl 0x14(%esp)
804843e: dc 4c 24 18 fmull 0x18(%esp)
8048442: dd 5c 24 08 fstpl 0x8(%esp)
8048446: f2 0f 2c 44 24 08 cvttsd2si 0x8(%esp),%eax
804844c: 3b 44 24 14 cmp 0x14(%esp),%eax
8048450: 75 16 jne 8048468 <main+0x4b>
8048452: 8b 44 24 14 mov 0x14(%esp),%eax
8048456: 89 44 24 04 mov %eax,0x4(%esp)
804845a: c7 04 24 10 85 04 08 movl $0x8048510,(%esp)
8048461: e8 8a fe ff ff call 80482f0 <printf@plt>
8048466: eb 0f jmp 8048477 <main+0x5a>
8048468: 83 44 24 14 01 addl $0x1,0x14(%esp)
804846d: 81 7c 24 14 3f 42 0f cmpl $0xf423f,0x14(%esp)
8048474: 00
8048475: 7e c3 jle 804843a <main+0x1d>
...
However, I have no experience looking at raw assembler, so I can't discern
what it is I'm even looking at (nor do I know what explicit SSE
instructions look like in assembler).
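One possible reading of the two 32-bit listings: both multiply on the x87
stack (fildl/fmull), but mine stores the product to a 64-bit double (fstpl)
before truncating it with cvttsd2si, while yours truncates the x87 value
directly (fistpl). Below is a minimal sketch of those two paths in C. It
assumes long double is the 80-bit x87 format (LDBL_MANT_DIG == 64), which
holds for gcc on x86 Linux but not everywhere. The names X,
first_match_via_double and first_match_direct are made up for illustration.

```c
#include <float.h>
#include <stdio.h>

/* x = 1 - 2^-53, the same value as 1-pow(0.5, 53) in the program above. */
const double X = 1.0 - DBL_EPSILON / 2.0;

/* Path A (illustrative): compute the product in long double, round it to
 * a 64-bit double, then truncate to int. This mirrors fstpl + cvttsd2si. */
int first_match_via_double(int limit) {
    for (int i = 1; i < limit; i++) {
        long double wide = (long double)i * (long double)X;
        double narrowed = (double)wide;  /* second rounding happens here */
        if ((int)narrowed == i) {
            return i;
        }
    }
    return -1;  /* loop ran to completion */
}

/* Path B (illustrative): truncate the long double product directly,
 * with no intermediate rounding to double. This mirrors fistpl. */
int first_match_direct(int limit) {
    for (int i = 1; i < limit; i++) {
        long double wide = (long double)i * (long double)X;
        if ((int)wide == i) {
            return i;
        }
    }
    return -1;  /* loop ran to completion */
}

/* Report both paths plus the long double mantissa width on this platform. */
void report(int limit) {
    printf("long double mantissa bits: %d\n", LDBL_MANT_DIG);
    printf("via double: %d\n", first_match_via_double(limit));
    printf("direct:     %d\n", first_match_direct(limit));
}
```

Calling report(1000000) from a small main() on a machine where long double
has a 64-bit mantissa should give 2049 for the first path (the value my -m32
build printed) and -1 for the second. If that reading is right, the
difference comes from the code each gcc generated rather than from the CPU.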
I have a Mac that runs an Intel Core i5, and, like you, both 32- and 64-bit
versions run to completion. Which is at least consistent with what others
are seeing with Python.
All the best,
Jason