Speed-up for loops

Carl Banks pavlovevidence at gmail.com
Thu Sep 2 16:01:40 EDT 2010


On Sep 2, 5:55 am, Tim Wintle <tim.win... at teamrubber.com> wrote:
> On Thu, 2010-09-02 at 12:02 +0200, Michael Kreim wrote:
> > Hi,
>
> > I was comparing the speed of a simple loop program between Matlab and
> > Python.
> > Unfortunately my Python Code was much slower and I do not understand why.
>
> The main reason is that, under the hood, CPython does something like
> this (in pseudo-code):
>
> iterator = iter(xrange(imax))
> while 1:
>   next_attribute = iterator.next
>   try:
>     i = next_attribute()
>   except StopIteration:
>     break
>   a = a + 10
>
> where C (and I'm assuming Matlab) does this:
>
> while 1:
>   i = i + 1
>   if (i > imax):
>     break
>   a = a + 10

Not really.  Someone already posted timings of the while-loop version
in Python and it's much slower than the for loop.  The iterator stuff
is a minor overhead.
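
For what it's worth, here's a rough sketch (mine, not from the earlier
posts; imax is an arbitrary size) that compares the two forms under
CPython 2.x:

import timeit

imax = 10000000

def for_loop():
    a = 0
    for i in xrange(imax):
        a = a + 10
    return a

def while_loop():
    a = 0
    i = 0
    while i < imax:
        a = a + 10
        i = i + 1
    return a

# The for loop drives the iterator in C; the while loop re-executes the
# comparison and an extra integer increment as bytecode on every pass,
# which is why it usually comes out slower despite skipping the iterator.
print timeit.timeit(for_loop, number=3)
print timeit.timeit(while_loop, number=3)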

The real reason is simple and boring: many languages optimize loops
like this; Python doesn't.

Matlab has a hundred paid engineers whose job is to optimize it, and
its focus is mathematics, so of course they're going to pull out all
the stops to get simple loops like the above as fast as possible.


> And the function call in Python is fairly expensive on its own.
> Plus it has to do all the standard interpreter stuff for memory
> management and deciding when to give up the GIL etc.

Matlab has all that stuff too (its memory management is much, much
worse than Python's, in fact, but memory management usually doesn't
play into tight loop timings).


> > Are there any ways to speed up the for/xrange loop?
>
> Leaving it in Python, no. (Well, "range" is faster in 2.x, but once you
> get some cache misses due to increased memory usage it's much slower.)
>
> Avoiding iteration by using list comprehensions can help a lot, though,
> as it avoids most of the function calls.

List comprehensions use iteration and don't avoid function calls
relative to equivalent for-loops.  I think the main reason they're a
little faster is that they compile to tighter bytecode.
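
If you want to see what I mean, disassemble both forms (my own example,
not from the thread):

import dis

def with_loop(xs):
    out = []
    for x in xs:
        out.append(x * 10)   # attribute lookup + call on every pass
    return out

def with_comprehension(xs):
    return [x * 10 for x in xs]   # dedicated LIST_APPEND opcode

# Comparing the output shows the comprehension body is noticeably tighter.
dis.dis(with_loop)
dis.dis(with_comprehension)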

> If you really need to optimise it then you can convert that module to
> cython by adding a cdef, and then compile it:
>
> cdef int i
> for i in xrange(imax):
>      a = a + 10
> print a
>
> Or you can write it in C; it'll run a lot faster.

numpy is terrific when you can use it, and I've found that it can do a
lot more than most people expect.  The hard part is figuring out how.

In particular, numpy will trounce Matlab's performance for large
amounts of data, because of the aforementioned memory management
problem.
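
To give a flavor of it, here's my own sketch (not from the original
benchmarks) of how the loop being discussed collapses to one vectorized
pass in numpy:

import numpy as np

imax = 10000000

# The pure-Python version increments a scalar imax times:
#     a = 0
#     for i in xrange(imax):
#         a = a + 10
# In numpy the whole pass runs in C over one contiguous array.  (This
# particular loop is of course just 10 * imax; the point is the pattern,
# which carries over to real per-element work.)
data = np.zeros(imax, dtype=np.int64) + 10   # one big array of tens
a = data.sum()                               # single vectorized reduction
print a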


Carl Banks


