[pypy-dev] Integer division

William ML Leslie william.leslie.ttg at gmail.com
Thu Jun 1 07:25:52 EDT 2017


On 1 June 2017 at 20:53, Armin Rigo <armin.rigo at gmail.com> wrote:

> Hi,
>
> On 31 May 2017 at 17:11, Tuom Larsen <tuom.larsen at gmail.com> wrote:
> >             # k = i//j # 2.12 seconds
> >             k = int(i/j) # 0.98 seconds
>
> Note first that if you don't do anything with 'k', it might be optimized
> away.
>
> I just wrote a pure C example doing the same thing, and indeed
> converting the integers to float, dividing, and then converting back
> to integer... is 2.2 times faster there too.
>
> Go figure it out.  I have no idea why the CPU behaves like that.
> Maybe Neal can provide a clue.
>

I took a look at this earlier today.  It turns out that, on Skylake (and I
think it's similar on other recent x86_64 implementations), 80-bit floating
point division has a latency of 14 cycles, whereas 32-bit integer division
has a latency of 26 cycles.  I expect this is because there are only two
hardware division units, and both are on the floating point path.
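
For anyone who wants to reproduce the comparison, here is a minimal sketch
of the two variants from Tuom's snippet.  It is not his original benchmark:
the function names, loop bounds and use of timeit are my own assumptions,
and it assumes Python 3 semantics for '/' (true division).  The result is
accumulated so that, as Armin points out, the division cannot simply be
optimized away.

```python
import timeit


def floor_div(n):
    # Integer floor division, as in `k = i//j`.
    total = 0
    for i in range(1, n):
        for j in range(1, 50):
            total += i // j
    return total


def float_div(n):
    # Convert to float, divide, convert back, as in `k = int(i/j)`.
    total = 0
    for i in range(1, n):
        for j in range(1, 50):
            total += int(i / j)
    return total


if __name__ == "__main__":
    n = 2000  # hypothetical size; raise it for more stable timings
    # For small positive operands both variants agree.
    assert floor_div(n) == float_div(n)
    for fn in (floor_div, float_div):
        seconds = timeit.timeit(lambda: fn(n), number=5)
        print(fn.__name__, round(seconds, 3), "seconds")
```

Note that the two expressions are only interchangeable for non-negative
operands small enough to be represented exactly as a double: int() truncates
towards zero where '//' rounds down, and integers above 2**53 lose precision
when converted to float.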

-- 
William Leslie

Notice:
Likely much of this email is, by the nature of copyright, covered under
copyright law.  You absolutely MAY reproduce any part of it in accordance
with the copyright law of the nation you are reading this in.  Any attempt
to DENY YOU THOSE RIGHTS would be illegal without prior contractual
agreement.