
Dear list! I noticed that integer division (`//`) is slower than floating-point division followed by an integer cast. For example:

    from __future__ import division
    from time import time

    t = time()
    for i in range(1, 10000):
        for j in range(1, 10000):
            # k = i//j      # 2.12 seconds
            k = int(i/j)    # 0.98 seconds
    print time() - t

I know integer division should be slower than floating-point division, but I thought the extra `int()` call would make the floating-point version slower overall. Please, can someone explain what is going on? Is this expected behaviour? Thank you!

Hi,

On 31 May 2017 at 17:11, Tuom Larsen <tuom.larsen@gmail.com> wrote:
    # k = i//j      # 2.12 seconds
    k = int(i/j)    # 0.98 seconds
Note first that if you don't do anything with 'k', it might be optimized away. I just wrote a pure C example doing the same thing, and indeed converting the integers to float, dividing, and then converting back to an integer... is about 2.2x faster there too. Go figure. I have no idea why the CPU behaves like that. Maybe Neal can provide a clue.

A bientôt,

Armin.
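For context, here is a minimal sketch of the kind of C comparison described above. Armin's actual code is not shown in the thread, so this is only a reconstruction, and absolute timings will depend on the compiler, flags, and CPU:

    #include <stdio.h>
    #include <time.h>

    int main(void) {
        /* volatile sink so the compiler cannot optimize the divisions away */
        volatile long sink = 0;

        /* plain integer division */
        clock_t t0 = clock();
        for (int i = 1; i < 10000; i++)
            for (int j = 1; j < 10000; j++)
                sink = i / j;
        double t_int = (double)(clock() - t0) / CLOCKS_PER_SEC;

        /* convert to double, divide, truncate back to an integer */
        t0 = clock();
        for (int i = 1; i < 10000; i++)
            for (int j = 1; j < 10000; j++)
                sink = (long)((double)i / (double)j);
        double t_flt = (double)(clock() - t0) / CLOCKS_PER_SEC;

        printf("integer division: %.2fs, float division + cast: %.2fs\n", t_int, t_flt);
        (void)sink;
        return 0;
    }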

On 1 June 2017 at 20:53, Armin Rigo <armin.rigo@gmail.com> wrote:
I took a look at this earlier today. It turns out that, on Skylake (and I think it is similar on other recent x86_64 implementations), 80-bit floating-point division has a latency of 14 cycles, whereas 32-bit integer division has a latency of 26 cycles. I expect this is because there are only two hardware division units, and both are on the floating-point path.

--
William Leslie
participants (4)

- Armin Rigo
- Neal Becker
- Tuom Larsen
- William ML Leslie