Speed-up for loops
Steven D'Aprano
steve at REMOVE-THIS-cybersource.com.au
Sun Sep 5 11:15:49 EDT 2010
On Sun, 05 Sep 2010 12:28:47 +0100, BartC wrote:
>> Getting the above kind of code fast requires the interpreter to be
>> clever enough so that it will use native machine operations on an int
>> type instead of converting back and forth between internal
>> representations.
>
> Writing for i in xrange(1000000000) you'd think would give it a clue,
> but it makes no difference.
CPython takes a very conservative view towards runtime optimizations.
Optimizations don't happen for free, you know; they have costs. Memory is
one cost, but human time and effort is another.
But if you want a JIT compiler, see Psyco, or try PyPy, which is very
exciting and I hope will one day be ready to take over from CPython as
the first choice for production use.
[...]
> One order of magnitude (say 10-20x slower) wouldn't be so bad. That's
> what you might expect for a dynamically typed, interpreted language.
>
> But on my machine this code was more like 50-200x slower than C, for
> unaccelerated Python.
I'd say that 50-200 times slower than C is exactly what I'd expect from a
dynamically typed language like Python without any fancy JIT tricks.
Getting such a language to within an order of magnitude of C is quite an
achievement.
>> Generally, you use matlab's vectorized operations, and in that case,
>> numpy gives you similar performances (sometimes faster, sometimes
>> slower, but in the same ballpark in general).
>
> That would simply be delegating Python to a scripting language.
That's sheer unadulterated nonsense.
In any case, Python was designed as a glue language, specifically to be a
high-level user-friendly language for gluing components written in C
together. That's what Python *is* -- it provides a bunch of primitives,
written in C (or Java, or .NET, pick whichever implementation you
like) and manipulated in a friendly, safe language. Calling numpy for
fast vectorized operations is *exactly* the right solution, if you need
high-performance maths calculations.
Use the right tool for the job; don't insist that your spanner should
double as a screwdriver.
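To illustrate the "primitives written in C" point without requiring numpy, here is a minimal sketch that uses only the standard library. The built-in sum() stands in for a vectorized numpy call; the function names and the value of N are made up for this example, and it is written for Python 3 (range rather than xrange). Both functions compute the same total, but in CPython the second one runs its loop inside C code rather than in the bytecode interpreter.

```python
import timeit

N = 1000000  # arbitrary size, chosen only for illustration


def python_loop(n):
    """Add up 0..n-1 one interpreted iteration at a time."""
    total = 0
    for i in range(n):
        total += i
    return total


def builtin_sum(n):
    """Same result, but the iteration happens inside the built-in sum()."""
    return sum(range(n))


if __name__ == "__main__":
    # Both approaches must agree before the timings mean anything.
    assert python_loop(N) == builtin_sum(N)
    t_loop = timeit.timeit(lambda: python_loop(N), number=5)
    t_sum = timeit.timeit(lambda: builtin_sum(N), number=5)
    print("pure-Python loop: %.3fs" % t_loop)
    print("built-in sum():   %.3fs" % t_sum)
```

The timings depend on the machine and the Python version, so none are quoted here. The same pattern applies to numpy: keep the Python code as the glue and let the compiled primitive do the heavy looping.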
> It would
> be nice if you could directly code low-level algorithms in it without
> relying on accelerators, and not have to wait two and a half minutes (or
> whatever) for a simple test to complete.
Yes, and it would be nice if my clothes washed and ironed themselves, but
they don't.
Somebody has to do the work. Are you volunteering to write the JIT
compiler for CPython? Will you contribute to the PyPy project, or help
maintain Psyco, or are you just bitching?
The simple fact is that there are far more important things for Python
developers to spend their time and effort on than optimizations like this.
If such an optimization comes out of the PyPy project, I'll be cheering
them on -- but it's a lot of effort for such a trivial gain.
The example given by the Original Poster is contrived. Nobody sensibly
writes an integer multiplication as a repeated addition like that, and
any real code that would need to be put in a for loop is almost certainly
going to be too complicated for the JIT compiler to benefit greatly. The
loop overhead itself will almost certainly be overwhelmed by the work
done in the loop:
[steve at sylar ~]$ time python -c "a = 0
> for i in xrange(10000000):
>     a += 10
> "
real 0m6.906s
user 0m5.820s
sys 0m0.022s
which is about double the time for an empty loop of the same size.
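For anyone who wants to repeat the comparison without the shell's time command, the following is a rough sketch using the standard timeit module. It assumes Python 3 (so range instead of xrange), uses a smaller N than the transcript above to keep the run short, and wraps each loop in a function, which is not how the one-liner above ran (that code executed at module level, where variable access is slower). The function names are invented for this example.

```python
import timeit

N = 1000000  # smaller than the 10000000 used in the transcript


def empty_loop(n):
    """Loop overhead only: iterate n times and do nothing."""
    for i in range(n):
        pass


def adding_loop(n):
    """The repeated addition from the transcript; returns 10 * n."""
    a = 0
    for i in range(n):
        a += 10
    return a


if __name__ == "__main__":
    # Take the best of three runs to reduce noise from other processes.
    t_empty = min(timeit.repeat(lambda: empty_loop(N), number=1, repeat=3))
    t_add = min(timeit.repeat(lambda: adding_loop(N), number=1, repeat=3))
    print("empty loop:  %.3fs" % t_empty)
    print("adding loop: %.3fs" % t_add)
    print("ratio:       %.2f" % (t_add / t_empty))
```

The ratio it prints will vary between machines and interpreter versions, so it should be read as a way to check the claim locally, not as a confirmation of the figures above.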
--
Steven