[Tutor] Efficiency of while versus (x)range
Steven D'Aprano
steve at pearwood.info
Fri Mar 18 06:43:57 CET 2011
Shane O'Connor wrote:
> Hi,
>
> First-time poster here. I've a question about loop efficiency - I was
> wondering whether this code:
>
> i = 0
> while i < 1000:
> do something
> i+=1
>
> is more efficient than:
>
> for i in range(1000):
> do something
>
> or:
>
> for i in xrange(1000):
> do something
You can test this with the timeit module. xrange is about 50% faster
than range for a million items, due to the overhead of building the full
list:
[steve at sylar ~]$ python -m timeit "for n in range(1000000): pass"
10 loops, best of 3: 156 msec per loop
[steve at sylar ~]$ python -m timeit "for n in xrange(1000000): pass"
10 loops, best of 3: 106 msec per loop
(The "10 loops" part means that timeit repeats the entire code snippet
ten times, in order to minimize the effect of other processes and
background tasks.)
However, if you move building the list into setup code, which doesn't
get timed, you will see that there's hardly any difference in time for
iterating over a list and an xrange object:
[steve at sylar ~]$ python -m timeit -s "items = range(1000000)" "for n in
items: pass"
10 loops, best of 3: 96.1 msec per loop
[steve at sylar ~]$ python -m timeit -s "items = xrange(1000000)" "for n in
items: pass"
10 loops, best of 3: 99.3 msec per loop
As for the while loop, it doesn't even come close!
[steve at sylar ~]$ python -m timeit "n = 0
> while n < 1000000: n += 1"
10 loops, best of 3: 282 msec per loop
The problem is that in a while loop, the looping itself is handled in
slow Python code, while in the for loop, it's handled by fast C code (or
whatever language the Python virtual machine is written in).
We can see this by disassembling the code into virtual machine instructions:
[steve at sylar ~]$ python
Python 2.5 (r25:51908, Nov 6 2007, 16:54:01)
[GCC 4.1.2 20070925 (Red Hat 4.1.2-27)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>
>>> from dis import dis
>>> dis(compile("for x in xrange(1000000): pass", "", "exec"))
1 0 SETUP_LOOP 20 (to 23)
3 LOAD_NAME 0 (xrange)
6 LOAD_CONST 0 (1000000)
9 CALL_FUNCTION 1
12 GET_ITER
>> 13 FOR_ITER 6 (to 22)
16 STORE_NAME 1 (x)
19 JUMP_ABSOLUTE 13
>> 22 POP_BLOCK
>> 23 LOAD_CONST 1 (None)
26 RETURN_VALUE
The for-loop is three virtual instructions: FOR_ITER, STORE_NAME and
JUMP_ABSOLUTE. Compare that to the while loop:
>>> dis(compile("n = 0\nwhile n < 1000000: n += 1", "", "exec"))
1 0 LOAD_CONST 0 (0)
3 STORE_NAME 0 (n)
2 6 SETUP_LOOP 28 (to 37)
>> 9 LOAD_NAME 0 (n)
12 LOAD_CONST 1 (1000000)
15 COMPARE_OP 0 (<)
18 JUMP_IF_FALSE 14 (to 35)
21 POP_TOP
22 LOAD_NAME 0 (n)
25 LOAD_CONST 2 (1)
28 INPLACE_ADD
29 STORE_NAME 0 (n)
32 JUMP_ABSOLUTE 9
>> 35 POP_TOP
36 POP_BLOCK
>> 37 LOAD_CONST 3 (None)
40 RETURN_VALUE
So the for loop pushes more of the actual effort into the underlying
engine, which is faster.
Of course, in real-life code, most of this is just theoretical. I see
from your comment below that you understand this, but for the benefit of
others reading, most of the time, the overhead of the loop body will be
much bigger than the loop itself. Who cares whether a loop takes 30
seconds + 100 milliseconds, or 30 seconds + 150 milliseconds? You aren't
going to notice the difference.
The general advice is best summed up with a quote from the famous
computer scientist Donald Knuth:
"We should forget about small efficiencies, say about 97% of the time:
premature optimization is the root of all evil. Yet we should not pass
up our opportunities in that critical 3%.A good programmer will not be
lulled into complacency by such reasoning, he will be wise to look
carefully at the critical code; but only after that code has been
identified."
http://en.wikipedia.org/wiki/Program_optimization
Other quotes of note:
"More computing sins are committed in the name of efficiency (without
necessarily achieving it) than for any other single reason - including
blind stupidity." - W.A. Wulf
"The First Rule of Program Optimization: Don't do it. The Second Rule of
Program Optimization (for experts only!): Don't do it yet." - Michael A.
Jackson
> In my mind, the while loop should not allocate as much memory as range or
> have the overhead of the iterator of xrange (although aesthetically, I
> prefer the x/range style). Is this the case or does the compiler do
> something clever here?
As a general rule, Python's compiler rarely does anything clever.
"Clever" optimizations are remarkable for the number of difficult to
notice and even more difficult to fix bugs they introduce.
Nevertheless, the PyPy project is experimenting with optimizing
just-in-time compilers, and so far are already running about twice as
fast as the standard CPython compiler. Similarly, the experimental
WPython compiler also does a lot more optimizations than CPython.
If you're interested in optimization, PyPy is good to look at: for a
handful of carefully selected examples, Python code in PyPy is faster
than optimized C code!
http://morepypy.blogspot.com/2011/02/pypy-faster-than-c-on-carefully-crafted.html
> In particular, I'm using Python 2.4.3 on a web server which needs to run as
> fast as possible using as little memory as possible (no surprises there!).
> I'm aware that there are more significant optimizations than the above and I
> will profile the code rather than prematurely optimize loops at the sake of
> readability/errors but I'm still curious about the answer.
--
Steven
More information about the Tutor
mailing list