[Python-Dev] Optionally using GMP to implement long if available

Mon Nov 10 18:42:05 CET 2008

On Mon, Nov 10, 2008 at 4:26 PM, Nick Craig-Wood <nick at craig-wood.com> wrote:
>
> Looking at the assembler it produces (x86)
>
> mul:
>        pushl   %ebp
>        xorl    %edx, %edx
>        movl    %esp, %ebp
>        movl    12(%ebp), %eax
>        imull   8(%ebp), %eax
>        popl    %ebp
>        ret
>
> Which I'm pretty sure is a 32x32->64 bit mul (though my x86 assembler
> foo is weak).

My x86 assembler is also weak (or perhaps I should say nonexistent),
but I think this does exactly what the C standard says it should: that is,
it returns just the low 32 bits of the product.

Looking at the assembler, I think the imull does a 32-bit by
32-bit multiply and puts its (truncated) result back into the 32-bit
register eax.  I'd guess that the 64-bit return value is being passed
back to the calling routine in registers edx (high 32 bits) and eax
(low 32 bits); this explains why edx has to be zeroed with the 'xorl'
instruction: the high half of the returned value is always 0.
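
To make the difference concrete, here is a minimal C sketch. The
original source for 'mul' isn't shown in this thread, so the function
names and signatures below are hypothetical; I'm also assuming a
platform where int is 32 bits wide, so that uint32_t operands are
multiplied in 32-bit unsigned arithmetic.

```c
#include <stdint.h>

/* Hypothetical reconstruction: the product is computed in 32 bits
 * (assuming 32-bit int), so the high half is discarded *before* the
 * value is widened to the 64-bit return type. */
uint64_t mul_truncated(uint32_t a, uint32_t b)
{
    return a * b;
}

/* Widening one operand first forces a 64-bit multiply, so the full
 * 32x32->64 product is kept. */
uint64_t mul_full(uint32_t a, uint32_t b)
{
    return (uint64_t)a * b;
}
```

With a = 0xFFFFFFFF and b = 2, the first version should give
0xFFFFFFFE (high half zero, matching the zeroed edx), while the second
gives 0x1FFFFFFFE.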

And if we were really expecting a 64-bit result then there should
be an unsigned multiply (mull) there instead of a signed multiply
(imull);  of course they're the same modulo 2**32, so for a 32-bit
result it doesn't matter which is used.
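
A small C sketch of that last point, again with hypothetical helper
names: the low 32 bits of the signed and unsigned 64-bit products
agree, and only the high 32 bits differ.  The conversion of an
out-of-range uint32_t to int32_t is assumed to wrap in the usual
two's-complement way.

```c
#include <stdint.h>

/* Full signed product of the operands reinterpreted as int32_t
 * (assumes two's-complement wrapping on the uint32_t -> int32_t
 * conversion).  The 64-bit product of two 32-bit values cannot
 * overflow. */
static int64_t signed_product(uint32_t a, uint32_t b)
{
    return (int64_t)(int32_t)a * (int64_t)(int32_t)b;
}

/* Full unsigned product. */
static uint64_t unsigned_product(uint32_t a, uint32_t b)
{
    return (uint64_t)a * (uint64_t)b;
}

/* Low halves: these agree for every pair of inputs. */
uint32_t low_signed(uint32_t a, uint32_t b)
{
    return (uint32_t)signed_product(a, b);
}

uint32_t low_unsigned(uint32_t a, uint32_t b)
{
    return (uint32_t)unsigned_product(a, b);
}

/* High halves: these are where signed and unsigned multiply differ. */
uint32_t high_signed(uint32_t a, uint32_t b)
{
    return (uint32_t)((uint64_t)signed_product(a, b) >> 32);
}

uint32_t high_unsigned(uint32_t a, uint32_t b)
{
    return (uint32_t)(unsigned_product(a, b) >> 32);
}
```

For a = 0xFFFFFFFF and b = 2, both low halves are 0xFFFFFFFE, but the
signed high half is 0xFFFFFFFF (the product is -2) and the unsigned
high half is 1.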

Mark