[Tutor] List Splicing
Luke Paireepinart
rabidpoobear at gmail.com
Thu Jun 18 07:34:19 CEST 2009
Robert Berman wrote:
> Emille,
>
> Thank you for the example of list splicing. Do you know if this is
> faster than a more conventional loop statement as in my code for
> primearray which is in my original post (reprinted here)
As has been mentioned, you will want to profile your code to know what
is truly faster.
however, the main idea in speeding up your code is to move as much of
the actual operations to C functions as possible.
For example,
x = []
for i in range(10000):
x.append(0)
is going to be quite a bit slower than
x = [0] * 10000
>>> t = Timer("""x = [0] * 1000""")
>>> t.timeit()
6.623384202829115
>>> t = Timer("""x = []
for i in range(1000):
x.append(0)""")
>>> t.timeit()
141.14222554321785
As you can see, the for loop version is about 20 times slower than the
other version.
But they both accomplish the same thing.
The key here is that the first one, using the "[] * var" syntax, does
not iterate over items individually.
This is a gross oversimplification, but think of the python interpreter
like this:
you take your code to a copy shop (a la Kinkos). however, they do not
have any copy machines... instead they will rewrite it by hand, line by
line.
and say they're going to charge you 20 cents per line.
Therefore you'd want to minimize the lines of code to save money, right?
However, in Python, the interpreter isn't executing just line-by-line
(otherwise we'd have no loops)
so what it's essentially seeing in the two examples are, first:
x = [0] * 1000
and for the second example:
x = []
x.append(0)
x.append(0)
x.append(0)
x.append(0)
x.append(0)
..... about 992 more times ...
x.append(0)
x.append(0)
x.append(0)
So you are getting charged to "copy" each of these lines. So what
originally looked like only 2 or 3 more lines of code is in fact many
hundreds more instructions.
This analogy refers to the overhead of the interpreter - just to call
functions and do other basic things, the overhead is much higher than it
would be in a compiled language such as C.
Now you may be thinking "but behind the scenes, x = [0] * 1000 is a lot
of instructions as well!" and you're correct.
However, here's where the key of Python comes in: that "behind the
scenes" stuff is written over a period of years by very experienced
coders, and has been optimized to heck. Also, it's typically going to
be written in C and compiled to native code, which is *a lot* faster
than interpreted Python is.
So basically when you're trying to optimize Python, remember that the
cost of executing each individual line is not that high, for the most
part (of course there are exceptions).
What you want to do is minimize the amount of lines that are interpreted
and maximize the amount of code that is executed by the interpreter's
libraries, which are highly efficient C code. While in another language
you may think "I can do this faster if I do it myself, because the
language's libraries are probably too generic to be faster than a custom
implementation" but if you want to write fast Python, you have to think
"I want to try to do as little as possible most of the time, so I can
transfer most of the computational load over to the libraries and away
from my code."
Just some generic speculation on making stuff more efficient...
More information about the Tutor
mailing list