[Python-Dev] Benchmarking "fun" (was Re: Python 2.1 slower than 2.0)
Wed, 31 Jan 2001 22:00:33 +0100
On Wed, Jan 31, 2001 at 03:34:19PM +0100, M.-A. Lemburg wrote:
> I have made similar experience with -On with n>3 compared to -O2
> using pgcc (gcc optimized for PC processors). BTW, the Linux
> kernel uses "-Wall -Wstrict-prototypes -O3 -fomit-frame-pointer"
> as CFLAGS -- perhaps Python should too on Linux ?!
Maybe, but the Linux kernel can be quite specific about which version of gcc
you need, and knows in advance what platform you are using it on :) The
stability and actual speedup of gcc's optimization options can and do vary
across platforms. In the above example, -Wall and -Wstrict-prototypes only
enable warnings, and -O3 is the same as "-O2 -finline-functions". As for
-fomit-frame-pointer:
> Does anybody know about the effect of -fomit-frame-pointer ?
> Would it cause problems or produce code which is not compatible
> with code compiled without this flag ?
The effect of -fomit-frame-pointer is that the compilation of frame-pointer
handling code is avoided. It doesn't have any effect on compatibility, since
it doesn't matter that other parts/functions/libraries do have such code,
but it does make debugging impossible (on most machines, in any case.) From
GCC's info docs:
    Don't keep the frame pointer in a register for functions that
    don't need one. This avoids the instructions to save, set up and
    restore frame pointers; it also makes an extra register available
    in many functions. *It also makes debugging impossible on some
    machines.*

    On some machines, such as the Vax, this flag has no effect, because
    the standard calling sequence automatically handles the frame
    pointer and nothing is saved by pretending it doesn't exist. The
    machine-description macro `FRAME_POINTER_REQUIRED' controls
    whether a target machine supports this flag. *Note Registers::.
Obviously, for the Linux kernel this is a very good thing: you don't debug
the Linux kernel like a normal program anyway (contrary to some other UNIX
kernels, I might add.) I believe -g turns off -fomit-frame-pointer itself,
but the docs for -g or -fomit-frame-pointer don't mention it.
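If you want to see what the flag does on your own machine, something like the following should be enough. This is only a rough sketch: the function names and the file name fp.c are made up for illustration, and what the generated assembly looks like depends on the gcc version and the target.

```c
/* fp.c: hypothetical test file, not part of the Python source tree. */

/* A leaf function: it calls nothing and has no real need for a
   frame pointer of its own. */
int square(int x)
{
    return x * x;
}

/* A non-leaf function. It does not care whether square() was compiled
   with or without -fomit-frame-pointer; the call works either way,
   which is why mixing the two kinds of code is not a compatibility
   problem. */
int sum_of_squares(int n)
{
    int total = 0;
    int i;

    for (i = 1; i <= n; i++)
        total += square(i);
    return total;
}
```

Compile it twice, once with "gcc -O2 -fno-omit-frame-pointer -S fp.c -o with-fp.s" and once with "gcc -O2 -fomit-frame-pointer -S fp.c -o without-fp.s", and diff the two .s files. On a target where the flag matters, the second should be missing the instructions that save, set up and restore the frame pointer. On a target like the Vax, I would expect the two to be the same.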
One other thing I noted in the gcc docs is that gcc doesn't do loop
unrolling even with -O3, though I thought it would at -O2. You need to add
-funroll-loops to enable loop unrolling, and that might squeeze out some more
performance. This only works for loops whose number of iterations is known
in advance, though, so I'm not sure if it matters.
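For those who haven't seen loop unrolling spelled out, here is roughly what the transformation amounts to, done by hand. It is an illustration only, assuming an unroll factor of 4; the names N, sample, sum_rolled and sum_unrolled are made up, and the compiler picks its own factor and does this on its intermediate code, not on the C source.

```c
#define N 8   /* fixed trip count, known at compile time */

/* Hypothetical sample data, only here so the functions can be tried. */
static const int sample[N] = { 1, 2, 3, 4, 5, 6, 7, 8 };

/* Rolled version: one addition plus one compare-and-branch per element. */
int sum_rolled(const int *a)
{
    int total = 0;
    int i;

    for (i = 0; i < N; i++)
        total += a[i];
    return total;
}

/* Unrolled by a factor of 4: same result, but the loop test and the
   branch are executed only N/4 times. This relies on N being a
   multiple of 4, which is the kind of thing the compiler can only
   know when the iteration count is fixed. */
int sum_unrolled(const int *a)
{
    int total = 0;
    int i;

    for (i = 0; i < N; i += 4) {
        total += a[i];
        total += a[i + 1];
        total += a[i + 2];
        total += a[i + 3];
    }
    return total;
}
```

Both functions return the same sum for the same input; the only difference is how much loop overhead is paid per element, in exchange for larger code.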
Thomas Wouters <email@example.com>
Hi! I'm a .signature virus! copy me into your .signature file to help me spread!