[Python-Dev] Bytecode idea
Jeremy Hylton
jeremy@zope.com
Wed, 26 Feb 2003 11:55:29 -0500
Chris Tismer wrote:
> Oh, that was not what I meant. I also did this
> two years ago and tossed it. Function calls
> are too expensive.
> What I mean was to fold opcodes by common patterns.
> Unfortunately this is slower, too.
>
> Anyway, I didn't want to get too deep into this.
> Stopping wasting time now :-)
Chris already knows this, but it's worth repeating for people who don't. A
function call isn't always too expensive; it depends on how much work the
opcode is doing. It also depends on lots of other hard-to-predict effects of
the generated code and its interaction with the memory system.
The various function call opcodes regularly call out to separate functions.
I recall benchmarking various options and finding that moving big chunks of
code out of the mainloop and into functions often improved performance
slightly. Except when it didn't <0.3 wink>.
If you are benchmarking various opcode effects, I'd recommend trying to
revive the simple cycle counter instrumentation I did for Python 2.2. The
idea is to use the Pentium cycle counter to measure the number of cycles
spent on each trip through the mainloop. A rough conclusion from the
previous measurements was that trivial opcodes like POP_TOP can execute in
fewer than 100 cycles, including opcode dispatch. An opcode that involves
calling out to a C function never executes in fewer than 100 cycles, and
often takes hundreds of cycles.
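To make the per-trip measurement concrete, here is a minimal sketch of the
idea on a toy dispatch loop written in Python. It is not the patch mentioned
in this message and it does not touch the real C mainloop. Everything in it
is an assumption for illustration: the opcode set (only the name POP_TOP
comes from this message), the helper() function standing in for an opcode
that calls out to a separate function, and the use of time.perf_counter_ns()
in place of the Pentium cycle counter. The numbers it reports are therefore
nanoseconds, not cycles, and timer overhead will dominate for trivial
opcodes.

```python
import time
from collections import defaultdict

# Hypothetical toy opcodes; these are not CPython's real opcode numbers.
PUSH, ADD, POP_TOP, CALL_HELPER, HALT = range(5)
NAMES = {PUSH: "PUSH", ADD: "ADD", POP_TOP: "POP_TOP",
         CALL_HELPER: "CALL_HELPER", HALT: "HALT"}


def helper(x):
    # Stands in for an opcode body that lives in a separate function.
    return x * 2


def run(program):
    """Execute (opcode, arg) pairs, timing each trip through the loop."""
    stack = []
    ticks = defaultdict(int)   # total nanoseconds spent per opcode
    counts = defaultdict(int)  # number of times each opcode ran
    pc = 0
    while True:
        start = time.perf_counter_ns()  # start of this trip (incl. dispatch)
        op, arg = program[pc]
        pc += 1
        if op == PUSH:
            stack.append(arg)
        elif op == ADD:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == POP_TOP:
            stack.pop()
        elif op == CALL_HELPER:
            stack.append(helper(stack.pop()))
        elif op == HALT:
            break  # HALT itself is not timed
        else:
            raise ValueError("unknown opcode %r" % (op,))
        ticks[op] += time.perf_counter_ns() - start
        counts[op] += 1
    return stack, ticks, counts


def report(ticks, counts):
    # Print the average time per execution for each opcode that ran.
    for op in sorted(counts):
        print("%-12s n=%d avg=%.0f ns"
              % (NAMES[op], counts[op], ticks[op] / counts[op]))


PROGRAM = [(PUSH, 2), (PUSH, 3), (ADD, None), (CALL_HELPER, None),
           (PUSH, 9), (POP_TOP, None), (HALT, None)]

if __name__ == "__main__":
    stack, ticks, counts = run(PROGRAM)
    report(ticks, counts)
```

A run this short only shows the bookkeeping. To compare opcodes in any
meaningful way you would want to execute each one many times and look at the
distribution, not a single average.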
There's a patch floating around SourceForge somewhere.
Jeremy