Speed-up for loops

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Wed Sep 8 15:13:09 CEST 2010

```On Tue, 07 Sep 2010 11:00:03 +0100, BartC wrote:

>> for i in xrange(100000000):
>>    a = a + f(i)
>>
>> then unrolling the loop is even less useful. The overhead of the loop
>> itself is likely to be trivial compared to the cost of calling f() 100
>> million times -- the added complexity to shave 3 seconds off a four
>> minute calculation simply isn't worth it.
>
> With Python 3 and def f(x): return x+1, unrolling this loop 4x improved
> speed by 15%; 4.00 minutes reduces to 3.30 minutes.

I'm afraid that I can't replicate those figures. In my test, unrolling
the loop causes a massive SLOWDOWN of 37%, not a speed up. Here is my
test code:

def f(x):
return x+1

def test_loop(n):
a = 0
for i in range(n):
a += f(i)
return a

def test_unrolled(n):
a = 0
for i in range(n//4):
a += f(4*i)
a += f(4*i+1)
a += f(4*i+2)
a += f(4*i+3)
return a

from timeit import Timer
n = 10000000
assert test_loop(n) == test_unrolled(n)
t1 = Timer('test_loop(n)',
'from __main__ import test_loop; n=%d' % n)
t2 = Timer('test_unrolled(n)',
'from __main__ import test_unrolled; n=%d' % n)

And here are the results using Python 3.1. Times are the best of 5 for 10
calls each to the loop_* functions, results given in seconds:

>>> min(t1.repeat(number=10, repeat=5))/10
5.97409288883209
>>> min(t2.repeat(number=10, repeat=5))/10
8.25656900405883

I repeated it with a larger loop variable. Since the time per test was so
large (over ten minutes per call on my machine!), I didn't see the need
to run multiple trials:

n *= 100
assert test_loop(n) == test_unrolled(n)
t3 = Timer('test_loop(n)',
'from __main__ import test_loop; n=%d' % n)
t4 = Timer('test_unrolled(n)',
'from __main__ import test_unrolled; n=%d' % n)

And the results:

>>> t3.timeit(number=1)
653.3572518825531
>>> t4.timeit(number=1)
864.6454808712006

That's slightly better (32% slower instead of 37% slower), but still a
massive performance hit. Given these results, I'm prepared to say that
loop unrolling in Python is almost certainly going to be a pessimation,
not an optimization, no matter what you have inside the loop.

--
Steven

```