Speed-up for loops
Steven D'Aprano
steve at REMOVE-THIS-cybersource.com.au
Wed Sep 8 09:13:09 EDT 2010
On Tue, 07 Sep 2010 11:00:03 +0100, BartC wrote:
>> for i in xrange(100000000):
>> a = a + f(i)
>>
>> then unrolling the loop is even less useful. The overhead of the loop
>> itself is likely to be trivial compared to the cost of calling f() 100
>> million times -- the added complexity to shave 3 seconds off a four
>> minute calculation simply isn't worth it.
>
> With Python 3 and def f(x): return x+1, unrolling this loop 4x improved
> speed by 15%; 4.00 minutes reduces to 3.30 minutes.
I'm afraid that I can't replicate those figures. In my test, unrolling
the loop causes a massive SLOWDOWN of 37%, not a speed up. Here is my
test code:
def f(x):
return x+1
def test_loop(n):
a = 0
for i in range(n):
a += f(i)
return a
def test_unrolled(n):
a = 0
for i in range(n//4):
a += f(4*i)
a += f(4*i+1)
a += f(4*i+2)
a += f(4*i+3)
return a
from timeit import Timer
n = 10000000
assert test_loop(n) == test_unrolled(n)
t1 = Timer('test_loop(n)',
'from __main__ import test_loop; n=%d' % n)
t2 = Timer('test_unrolled(n)',
'from __main__ import test_unrolled; n=%d' % n)
And here are the results using Python 3.1. Times are the best of 5 for 10
calls each to the loop_* functions, results given in seconds:
>>> min(t1.repeat(number=10, repeat=5))/10
5.97409288883209
>>> min(t2.repeat(number=10, repeat=5))/10
8.25656900405883
I repeated it with a larger loop variable. Since the time per test was so
large (over ten minutes per call on my machine!), I didn't see the need
to run multiple trials:
n *= 100
assert test_loop(n) == test_unrolled(n)
t3 = Timer('test_loop(n)',
'from __main__ import test_loop; n=%d' % n)
t4 = Timer('test_unrolled(n)',
'from __main__ import test_unrolled; n=%d' % n)
And the results:
>>> t3.timeit(number=1)
653.3572518825531
>>> t4.timeit(number=1)
864.6454808712006
That's slightly better (32% slower instead of 37% slower), but still a
massive performance hit. Given these results, I'm prepared to say that
loop unrolling in Python is almost certainly going to be a pessimation,
not an optimization, no matter what you have inside the loop.
--
Steven
More information about the Python-list
mailing list