Division and right shift in python

Dave Angel davea at ieee.org
Wed Aug 26 08:18:15 EDT 2009


Mark Tolonen wrote:
> <div class="moz-text-flowed" style="font-family: -moz-fixed">
> "Cevahir Demirkiran" <cevomed at gmail.com> wrote in message 
> news:3f74e020908251648k7b391a09g78b155507b2f23c4 at mail.gmail.com...
>> Hi,
>>
>> I would like to do a floor division by a power of 2 in python to make it
>> faster than / (modular division in 2.x).
>> However, it is slower.
>> What is the reason of that?
>> I checked them via:
>>
>> def f2(x,n):
>>    t1 = clock()
>>    r = x/pow(2,n)
>>    t2 = clock()
>>    print (t2-t1)
>>    print r
>>    t2 = clock()
>>    r = x>>n
>>    t3 = clock()
>>    print (t3-t2)
>>    print r
>>
>
> It's not slower on my system, but note the inconsistent results also:
>
>>>> f2(1024,5)
> 3.47396033483e-06
> 32
> 2.19077375482e-06
> 32
>>>> f2(1024,5)
> 4.84135603429e-06
> 32
> 3.08499440393e-06
> 32
>>>> f2(1024,5)
> 4.6782844052e-06
> 32
> 3.77604384028e-06
> 32
>
> Time it with timeit...
>
> C:\>python -m timeit -n 10000000 -s x=1024 "x>>5"
> 10000000 loops, best of 3: 0.113 usec per loop
>
> C:\>python -m timeit -n 10000000 -s x=1024 "x/pow(2,5)"
> 10000000 loops, best of 3: 0.468 usec per loop
>
> Right-shift is over 4x faster.
>
> -Mark
>
>
>
> </div>
>
There have been languages and processors in which right shift has been 
notably faster than divide.  But with Python on a Pentium, you're both 
just measuring overhead.  The actual divide or shift is probably less 
than 5% of the time measured.

And that overhead will vary between versions of the interpreter.  I 
measured with Python 2.6.2 on Windows XP, and got

C:\>python -m timeit -n 10000000 -s x=1024 "x>>5"
10000000 loops, best of 3: 0.112 usec per loop

C:\>python -m timeit -n 10000000 -s x=1024 "x/pow(2,5)"
10000000 loops, best of 3: 0.384 usec per loop

But the real overhead is in the function call to pow().  If that had 
been precalculated, then the time for divide is nearly as fast as for 
right shift.

C:\>python -m timeit -n 10000000 -s x=1024 "x/32"
10000000 loops, best of 3: 0.135 usec per loop

And I suspect that some other implementation might well show this 
quicker than the right shift.

I would use whichever code most clearly represents the problem being 
solved.  There are undoubtedly other ways to speed up the code, without 
trying to micro-optimize one expression.

DaveA



More information about the Python-list mailing list