# Division and right shift in python

Dave Angel davea at ieee.org
Wed Aug 26 14:18:15 CEST 2009

```Mark Tolonen wrote:
> <div class="moz-text-flowed" style="font-family: -moz-fixed">
> "Cevahir Demirkiran" <cevomed at gmail.com> wrote in message
> news:3f74e020908251648k7b391a09g78b155507b2f23c4 at mail.gmail.com...
>> Hi,
>>
>> I would like to do a floor division by a power of 2 in python to make it
>> faster than / (modular division in 2.x).
>> However, it is slower.
>> What is the reason of that?
>> I checked them via:
>>
>> def f2(x,n):
>>    t1 = clock()
>>    r = x/pow(2,n)
>>    t2 = clock()
>>    print (t2-t1)
>>    print r
>>    t2 = clock()
>>    r = x>>n
>>    t3 = clock()
>>    print (t3-t2)
>>    print r
>>
>
> It's not slower on my system, but note the inconsistent results also:
>
>>>> f2(1024,5)
> 3.47396033483e-06
> 32
> 2.19077375482e-06
> 32
>>>> f2(1024,5)
> 4.84135603429e-06
> 32
> 3.08499440393e-06
> 32
>>>> f2(1024,5)
> 4.6782844052e-06
> 32
> 3.77604384028e-06
> 32
>
> Time it with timeit...
>
> C:\>python -m timeit -n 10000000 -s x=1024 "x>>5"
> 10000000 loops, best of 3: 0.113 usec per loop
>
> C:\>python -m timeit -n 10000000 -s x=1024 "x/pow(2,5)"
> 10000000 loops, best of 3: 0.468 usec per loop
>
> Right-shift is over 4x faster.
>
> -Mark
>
>
>
> </div>
>
There have been languages and processors in which right shift has been
notably faster than divide.  But with Python on a Pentium, you're both
just measuring overhead.  The actual divide or shift is probably less
than 5% of the time measured.

And that overhead will vary between versions of the interpreter.  I
measured with Python 2.6.2 on Windows XP, and got

C:\>python -m timeit -n 10000000 -s x=1024 "x>>5"
10000000 loops, best of 3: 0.112 usec per loop

C:\>python -m timeit -n 10000000 -s x=1024 "x/pow(2,5)"
10000000 loops, best of 3: 0.384 usec per loop

But the real overhead is in the function call to pow().  If that had
been precalculated, then the time for divide is nearly as fast as for
right shift.

C:\>python -m timeit -n 10000000 -s x=1024 "x/32"
10000000 loops, best of 3: 0.135 usec per loop

And I suspect that some other implementation might well show this
quicker than the right shift.

I would use whichever code most clearly represents the problem being
solved.  There are undoubtedly other ways to speed up the code, without
trying to micro-optimize one expression.

DaveA

```