[Python-Dev] Re: Python 2.1 slower than 2.0

Tim Peters tim.one@home.com
Mon, 29 Jan 2001 22:57:07 -0500


Note that optimizing compilers use a pile of linear-time heuristics to
attempt to solve exponential-time optimization problems (from optimal
register assignment to optimal instruction scheduling, they're all formally
intractable even in isolation).

When code gets non-trivial, not even a compiler's chief designer can
reliably outguess what optimization may do.  It's really not unusual for a
higher optimization level to yield slower code, and especially not when the
source code is pushing or exceeding machine limits (# of registers, # of
instruction pipes, size of branch-prediction buffers; I-cache structure;
dynamic restrictions on execution units; ...).

[Jeremy]
> ...
> One of the differences between -O2 and -O3, according to the man page,
> is that -O3 will perform optimizations that involve a space-speed
tradeoff.  It also includes -finline-functions.  I can imagine that
> some of these optimizations hurt memory performance enough to make a
> difference.

One of the time-consuming ongoing tasks at my last employer was running
profiles and using them to override counterproductive compiler inlining
decisions (in both directions).  It's not just memory that excessive
inlining can screw up, but also things like running out of registers and so
inserting gobs of register spill/restore code, and inlining so much code
that the instruction scheduler effectively gives up (under many compilers, a
sure sign of this is when you look at the generated code for a function, and
it looks beautiful "at the top" but terrible "at the bottom"; some clever
optimizers tried to get around that by optimizing "bottom-up", and then it
looks beautiful at the bottom but terrible at the top <0.5 wink>; others
work middle-out or burn the candle at both ends, with visible consequences
you should be able to recognize now!).

optimization-is-easier-than-speech-recog-but-the-latter-doesn't-work-
    all-that-well-either-ly y'rs  - tim