[Python-Dev] Re: new bytecode results

Damien Morton newsgroups1@bitfurnace.com
Thu, 27 Feb 2003 17:22:45 -0500


"M.-A. Lemburg" <mal@lemburg.com> wrote
> Damien Morton wrote:
> > Conclusions:
> >
> > While reducing the size of compiled bytecodes by about 1%, the proposed
> > modifications at best increase performance by 2%, and at worst reduce
> > performance by 3%.
> >
> > Enabling all of the proposed opcodes results in a 1% performance loss.
> >
> > In general, it would seem that adding opcodes in bulk, even if many opcodes
> > switch to the same labels, results in a minor performance loss.
>
> The general problem with the ceval switch statement is that it
> is too big. Adding new opcodes will only make it bigger, so I doubt
> that much can be gained in general by trying to come up with new
> do-everything-in-one-opcode cases.

Each of the LOAD_FAST_N, LOAD_CONST_N, etc. opcodes I added contributed only
one line of code to the inner loop:

case LOAD_FAST_0:
    ...
case LOAD_FAST_15:
    oparg = opcode - LOAD_FAST_0;
    /* fall through */
case LOAD_FAST:
    <body of load_fast>
    break;
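
As a rough, self-contained sketch of the same pattern: the opcode numbers
and the function name load_fast_slot below are made up for illustration and
are not taken from ceval.c. All sixteen specialised labels share a single
statement that derives oparg, then fall through into the generic case.

```c
#include <assert.h>

/* Hypothetical opcode numbering, chosen only for this sketch. */
enum {
    LOAD_FAST    = 1,                /* generic form, takes an explicit oparg */
    LOAD_FAST_0  = 16,               /* LOAD_FAST_0 .. LOAD_FAST_15 are contiguous */
    LOAD_FAST_15 = LOAD_FAST_0 + 15
};

/* Returns the value in the local slot selected by one instruction.
 * For the specialised opcodes the incoming oparg is ignored and
 * recomputed from the opcode number itself. */
int load_fast_slot(const int *fastlocals, int opcode, int oparg)
{
    switch (opcode) {
    case LOAD_FAST_0 + 0:  case LOAD_FAST_0 + 1:
    case LOAD_FAST_0 + 2:  case LOAD_FAST_0 + 3:
    case LOAD_FAST_0 + 4:  case LOAD_FAST_0 + 5:
    case LOAD_FAST_0 + 6:  case LOAD_FAST_0 + 7:
    case LOAD_FAST_0 + 8:  case LOAD_FAST_0 + 9:
    case LOAD_FAST_0 + 10: case LOAD_FAST_0 + 11:
    case LOAD_FAST_0 + 12: case LOAD_FAST_0 + 13:
    case LOAD_FAST_0 + 14: case LOAD_FAST_15:
        /* The only line the sixteen new opcodes add to the loop. */
        oparg = opcode - LOAD_FAST_0;
        /* fall through */
    case LOAD_FAST:
        assert(oparg >= 0);
        return fastlocals[oparg];    /* stands in for <body of load_fast> */
    default:
        return -1;                   /* unknown opcode in this sketch */
    }
}
```

The source cost is one statement, but the compiler still needs a dispatch
entry for every one of the sixteen new opcode values.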

It is the growth of the switch jumptable that I suspect caused the slowdown.

I'm told that, under MSVC, the switch is implemented as two tables - the
first a table of bytes, and the second a table of addresses.

If that's the case, adding non-code-bearing opcodes that direct to the same
label should only increase the switch jumptables by 1 byte for each opcode
added. If not, then the switch jumptables would increase in size by 4 bytes
for each opcode added.
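
To make the arithmetic concrete, here is a small sketch of the two layouts.
It assumes 1-byte index entries and 4-byte code addresses (a 32-bit build);
the function names are invented for illustration, and the two-table layout
is only what I have been told MSVC does, not something I have verified.

```c
#include <stddef.h>

/* Assumed entry sizes: 1-byte indices, 4-byte addresses (32-bit target). */
enum { INDEX_ENTRY_BYTES = 1, ADDRESS_BYTES = 4 };

/* Two-table layout: one byte per opcode value selects an entry in a
 * second table holding one address per distinct label. */
size_t two_table_bytes(size_t n_opcodes, size_t n_labels)
{
    return n_opcodes * INDEX_ENTRY_BYTES + n_labels * ADDRESS_BYTES;
}

/* Flat layout: one address per opcode value, shared label or not. */
size_t flat_table_bytes(size_t n_opcodes)
{
    return n_opcodes * ADDRESS_BYTES;
}

/* Growth in bytes when n_aliases opcodes are added that all jump to
 * labels which already exist (no new entries in the address table). */
size_t two_table_growth(size_t n_opcodes, size_t n_labels, size_t n_aliases)
{
    return two_table_bytes(n_opcodes + n_aliases, n_labels)
         - two_table_bytes(n_opcodes, n_labels);
}

size_t flat_table_growth(size_t n_opcodes, size_t n_aliases)
{
    return flat_table_bytes(n_opcodes + n_aliases)
         - flat_table_bytes(n_opcodes);
}
```

Under these assumptions, sixteen aliases that share existing labels grow
the two-table layout by 16 bytes and the flat layout by 64 bytes,
whatever the starting opcode count.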

Either way, there doesn't seem to be any advantage in this approach.