Speed-up for loops

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Sun Sep 5 17:15:49 CEST 2010

On Sun, 05 Sep 2010 12:28:47 +0100, BartC wrote:

>> Getting the above kind of code fast requires the interpreter to be
>> clever enough so that it will use native machine operations on a int
>> type instead of converting back and forth between internal
>> representations.
> Writing for i in xrange(1000000000) you'd think would give it a clue,
> but it makes no difference.

CPython takes a very conservative view towards runtime optimizations. 
Optimizations don't happen for free, you know; they have costs. Memory is 
one cost, but human time and effort is another.

But if you want a JIT compiler, see Psyco, or try PyPy, which is very 
exciting and I hope will one day be ready to take over from CPython as 
the first choice for production use.

> One order of magnitude (say 10-20x slower) wouldn't be so bad. That's
> what you might expect for a dynamically typed, interpreted language.
> But on my machine this code was more like 50-200x slower than C, for
> unaccelerated Python.

I'd say that 50-200 times slower than C is exactly what I'd expect from a 
dynamically typed language like Python without any fancy JIT tricks. 
Getting such a language to within an order of magnitude of C is quite an 
achievement.

>> Generally, you use matlab's vectorized operations, and in that case,
>> numpy gives you similar performances (sometimes faster, sometimes
>> slower, but in the same ballpark in general).
> That would simply be delegating Python to a scripting language. 

That's sheer unadulterated nonsense.

In any case, Python was designed as a glue language, specifically to be a 
high-level user-friendly language for gluing components written in C 
together. That's what Python *is* -- it provides a bunch of primitives, 
written in C (or Java, or dot-Net, pick whichever implementation you 
like) and manipulated in a friendly, safe language. Calling numpy for 
fast vectorized operations is *exactly* the right solution, if you need 
high-performance maths calculations.

Use the right tool for the job, don't insist that your spanner should 
double as a screwdriver.
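
To make the "primitives written in C" point concrete, here is a minimal 
sketch that uses only the standard library rather than numpy. It assumes 
CPython 3 (so range instead of xrange), and the function names and the 
size N are made up for illustration. The same sum is computed twice: once 
with an explicit Python-level loop, and once by handing the loop to the 
built-in sum(), which CPython implements in C.

```python
import timeit

N = 1_000_000  # hypothetical size, chosen only to keep the run short


def python_loop_sum(n):
    """Sum 0..n-1 with an explicit Python-level for loop."""
    total = 0
    for i in range(n):
        total += i
    return total


def builtin_sum(n):
    """Same sum, but the iteration happens inside the built-in sum()."""
    return sum(range(n))


def time_both(n=N, repeat=3):
    """Return (loop_seconds, builtin_seconds), best of `repeat` runs each."""
    loop_t = min(timeit.repeat(lambda: python_loop_sum(n),
                               number=1, repeat=repeat))
    builtin_t = min(timeit.repeat(lambda: builtin_sum(n),
                                  number=1, repeat=repeat))
    return loop_t, builtin_t


if __name__ == "__main__":
    loop_t, builtin_t = time_both()
    print("python loop: %.4fs   built-in sum: %.4fs" % (loop_t, builtin_t))
```

The actual numbers depend on the machine and the interpreter version, so 
none are quoted here. numpy applies the same idea to whole arrays of 
numbers.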

> It would
> be nice if you could directly code low-level algorithms in it without
> relying on accelerators, and not have to wait two and a half minutes (or
> whatever) for a simple test to complete.

Yes, and it would be nice if my clothes washed and ironed themselves, but 
they don't.

Somebody has to do the work. Are you volunteering to write the JIT 
compiler for CPython? Will you contribute to the PyPy project, or help 
maintain Psyco, or are you just bitching?

The simple fact is that there are far more important things for Python 
developers to spend their time and effort on than optimizations like this.

If such an optimization comes out of the PyPy project, I'll be cheering 
them on -- but it's a lot of effort for such a trivial gain.

The example given by the Original Poster is contrived. Nobody sensibly 
writes an integer multiplication as a repeated addition like that, and 
any real code that would need to be put in a for loop is almost certainly 
going to be too complicated to benefit greatly from a JIT compiler. The 
loop overhead itself will almost certainly be overwhelmed by the work 
done in the loop:

[steve at sylar ~]$ time python -c "a = 0
> for i in xrange(10000000):
>     a += 10
> "

real    0m6.906s
user    0m5.820s
sys     0m0.022s

which is about double the time for an empty loop of the same size.
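
For anyone who wants to repeat the comparison without a shell, here is a 
rough sketch using the standard timeit module. It assumes CPython 3 (range 
instead of xrange), and the function names and default loop size are 
invented for the example. Note that the loops here run inside functions, 
where local variable access is faster than the module-level code in the 
one-liner above, so the ratio may not match the timings shown.

```python
import timeit


def empty_loop(n):
    """Loop n times doing nothing, to measure the bare loop overhead."""
    for i in range(n):
        pass


def add_loop(n):
    """Loop n times adding 10 each time, as in the original example."""
    a = 0
    for i in range(n):
        a += 10
    return a


def compare(n=1_000_000, repeat=3):
    """Return (empty_seconds, add_seconds), best of `repeat` runs each."""
    empty_t = min(timeit.repeat(lambda: empty_loop(n),
                                number=1, repeat=repeat))
    add_t = min(timeit.repeat(lambda: add_loop(n),
                              number=1, repeat=repeat))
    return empty_t, add_t


if __name__ == "__main__":
    empty_t, add_t = compare()
    print("empty loop: %.4fs   loop with += 10: %.4fs" % (empty_t, add_t))
```

Taking the minimum of several runs is the usual way to reduce noise from 
whatever else the machine happens to be doing.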

