[Python-Dev] new bytecode results

Damien Morton newsgroups1@bitfurnace.com
Thu, 27 Feb 2003 13:54:05 -0500


I tried adding a variety of new instructions to the PVM, initially with a
code compression goal for the bytecodes, and later with a performance goal.

Definitions:
USING_LOAD_FAST_N
    loads of locals with an index < 16 use a one-byte instruction (no oparg)
USING_LOAD_CONST_N
    loads of consts with an index < 16 use a one-byte instruction (no oparg)
USING_STORE_FAST_N
    stores to locals with an index < 16 use a one-byte instruction (no oparg)
USING_SHORT_CMP
    compare ops use a one-byte instruction (no oparg)
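
To make the size effect of the short forms concrete, here is a minimal sketch of the encoding arithmetic. It assumes the 2.3 layout, where an instruction with an argument takes three bytes (one opcode byte plus a two-byte oparg), and it treats the short forms as one byte each, as in the definitions above. The helper names (encoded_size, total_size) and the sample instruction list are made up for illustration and do not come from the modified sources.

```python
# Opcodes that get a one-byte short form when their index is small.
SHORT_FORMS = {"LOAD_FAST", "LOAD_CONST", "STORE_FAST"}
SHORT_LIMIT = 16  # index < 16, as in the definitions above


def encoded_size(op, arg, compact):
    """Return the size in bytes of one instruction (hypothetical helper)."""
    if arg is None:
        return 1  # opcode without an argument
    if compact and op in SHORT_FORMS and arg < SHORT_LIMIT:
        return 1  # index folded into the opcode, no oparg
    return 3  # opcode byte plus a two-byte oparg


def total_size(code, compact):
    """Sum the encoded size of a list of (opname, arg) pairs."""
    return sum(encoded_size(op, arg, compact) for op, arg in code)


# Illustrative sequence only, not taken from real compiler output.
SAMPLE = [
    ("LOAD_FAST", 0),
    ("LOAD_CONST", 1),
    ("BINARY_ADD", None),
    ("STORE_FAST", 2),
    ("LOAD_FAST", 20),  # index too large for the short form
]

if __name__ == "__main__":
    print("original:", total_size(SAMPLE, compact=False), "bytes")
    print("compact: ", total_size(SAMPLE, compact=True), "bytes")
```

The toy sequence is dominated by the affected opcodes, so it overstates the saving compared with real compiled code.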

PyStone scores, best of 10 runs:

unmodified 2.3a2 22200

using enum 22200 (opcode numbers compacted into an enum instead of
#defines)

USING_LOAD_FAST_N 22700
USING_LOAD_CONST_N 22400
USING_STORE_FAST_N 22400
USING_LOAD_FAST_N, USING_LOAD_CONST_N 22350
USING_LOAD_FAST_N, USING_STORE_FAST_N 22000
USING_LOAD_FAST_N, USING_LOAD_CONST_N, USING_STORE_FAST_N 22200

USING_SHORT_CMP 21500

USING_LOAD_FAST_N, USING_LOAD_CONST_N, USING_STORE_FAST_N, USING_SHORT_CMP
22000
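
For reference, the scores above can be turned into percentage changes against the unmodified build with a few lines of Python. The scores are copied from the list above; the shortened labels and the percent_change helper are my own naming.

```python
BASELINE = 22200  # unmodified 2.3a2

# Scores copied from the results list; labels shortened for display.
RESULTS = {
    "LOAD_FAST_N": 22700,
    "LOAD_CONST_N": 22400,
    "STORE_FAST_N": 22400,
    "LOAD_FAST_N + LOAD_CONST_N": 22350,
    "LOAD_FAST_N + STORE_FAST_N": 22000,
    "LOAD_FAST_N + LOAD_CONST_N + STORE_FAST_N": 22200,
    "SHORT_CMP": 21500,
    "all four": 22000,
}


def percent_change(score, baseline=BASELINE):
    """Relative change of a PyStone score against the baseline, in percent."""
    return 100.0 * (score - baseline) / baseline


if __name__ == "__main__":
    for label, score in RESULTS.items():
        print("%-45s %6d  %+5.2f%%" % (label, score, percent_change(score)))
```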


Conclusions:

While reducing the size of compiled bytecodes by about 1%, the proposed
modifications at best increase PyStone performance by about 2%
(USING_LOAD_FAST_N alone, 22700 vs. 22200), and at worst reduce it by about
3% (USING_SHORT_CMP alone, 21500).

Enabling all of the proposed opcodes results in a performance loss of about
1% (22000).

In general, it would seem that adding opcodes in bulk, even if many of the
new opcodes dispatch to the same case labels in the interpreter's switch,
results in a minor performance loss.

Running PyStone under Windows gives a fairly large variation in results from
run to run.

A zip file containing the source files I modified can be found at
http://www.bitfurnace.com/python/modified-source.zip.

If someone would like to try this code on their systems, I would be grateful
to know what kind of results they achieve.

The various proposed opcodes are controlled by a set of #defines in the file
opcode.h.


Next steps:

The results of my static analysis indicate that the indices used on
LOAD_FAST, LOAD_CONST, STORE_FAST are almost always small. There may be some
benefit to optimising these instructions to use single byte opargs.
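
A static count of this kind can be sketched with the standard dis module. This is not the tooling used for the analysis above: it assumes a modern Python 3 (dis.get_instructions), where the opcode set differs from 2.3 and some releases replace LOAD_FAST or LOAD_CONST with specialised variants, so the counts for a given function vary by version. The index_histogram helper and the sample function are hypothetical.

```python
import dis
from collections import Counter

TRACKED = ("LOAD_FAST", "LOAD_CONST", "STORE_FAST")
SMALL_LIMIT = 16  # same threshold as the index < 16 short forms


def index_histogram(func):
    """Count small and large opargs for the tracked opcodes in one function."""
    counts = Counter()
    for instr in dis.get_instructions(func):
        # Only exact opname matches are counted; newer interpreters may
        # emit other variants, which this sketch deliberately ignores.
        if instr.opname in TRACKED and instr.arg is not None:
            bucket = "small" if instr.arg < SMALL_LIMIT else "large"
            counts[(instr.opname, bucket)] += 1
    return counts


def sample(a, b):
    """Small function used only as input for the histogram."""
    total = a + b
    scaled = total * 1000
    return scaled


if __name__ == "__main__":
    for (opname, bucket), n in sorted(index_histogram(sample).items()):
        print(opname, bucket, n)
```

Running the same loop over every code object in the standard library would give a rough picture of how often the indices fit in a single byte.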

The results of my static and dynamic analysis indicate that the (COMPARE_OP,
JUMP_IF_FALSE, POP_TOP) pattern is very common. I'm looking at what changes
would need to be made to the compiler to remove the need for this sequence
of instructions.
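
As an illustration of how such a pattern can be counted, here is a minimal sketch that scans a list of opcode names for the three-instruction sequence. The sequence arises because JUMP_IF_FALSE in 2.3 leaves the tested value on the stack, so a POP_TOP follows it. The count_pattern helper and the hand-written opcode list are assumptions for the example, not output from the actual analysis.

```python
PATTERN = ("COMPARE_OP", "JUMP_IF_FALSE", "POP_TOP")


def count_pattern(opnames, pattern=PATTERN):
    """Count occurrences of `pattern` as a contiguous run in `opnames`."""
    n = len(pattern)
    return sum(
        1
        for i in range(len(opnames) - n + 1)
        if tuple(opnames[i:i + n]) == pattern
    )


# Hand-written opcode stream resembling two `if a < b:` tests in 2.3 style.
SAMPLE = [
    "LOAD_FAST", "LOAD_FAST", "COMPARE_OP", "JUMP_IF_FALSE", "POP_TOP",
    "LOAD_CONST", "RETURN_VALUE",
    "POP_TOP",
    "LOAD_FAST", "LOAD_CONST", "COMPARE_OP", "JUMP_IF_FALSE", "POP_TOP",
    "LOAD_CONST", "RETURN_VALUE",
]

if __name__ == "__main__":
    print("occurrences:", count_pattern(SAMPLE))
```

The same function works for a dynamic count if it is fed the stream of opcode names recorded while a program runs.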