[Python-Dev] [ANN] superinstructions (VPython 0.1)
steve at holdenweb.com
Thu Oct 23 17:44:03 CEST 2008
Antoine Pitrou wrote:
> J. Sievers <cadr4u <at> gmail.com> writes:
>> A sequence of code such as LOAD_CONST LOAD_FAST BINARY_ADD will, in
>> CPython, push some constant onto the stack, push some local onto the
>> stack, then pop both off the stack, add them and push the result back
>> onto the stack.
>> Turning this into a superinstruction means inlining LOAD_CONST and
>> LOAD_FAST, modifying them to store the values they'd otherwise push
>> onto the stack in local variables and adding a version of BINARY_ADD
>> which reads its arguments from those local variables rather than the
>> stack (this reduces dispatch time in addition to pops and pushes).
> The problem is that this only optimizes code like "x + 1" but not "1 + x" or
> "x + y". To make this generic, a first step would be to try to fuse LOAD_CONST and
> LOAD_FAST into a single opcode (and check it doesn't slow down the VM). This
> could be possible by copying the constants table into the start of the frame's
> variables array when the frame is created, so that the LOAD_FAST code still does
> a single indexed array dereference. Since constants are constants, they don't
> need to be copied again when the frame is re-used by a subsequent call of the
> same function (but this would slow down recursive functions a bit, since those
> have to create new frames each time they are called).
It would seem redundant, though, to create multiple copies of constant
structures. Wouldn't there be some way to optimize this to allow each
call to access the data from the same place?
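
To make the layout under discussion concrete, here is a minimal sketch of
a frame whose slot array starts with the constants table, so that one
indexed load serves both constants and locals. It is a toy model in plain
Python, not CPython's actual frame layout; make_frame_slots and load_fast
are hypothetical names chosen for illustration.

```python
# Toy model only: names and layout are assumptions, not CPython internals.

def make_frame_slots(consts, nlocals):
    """Build one slot array: constants first, then empty local slots."""
    # list(consts) copies the references, not the constant objects.
    return list(consts) + [None] * nlocals

def load_fast(slots, index):
    """A single indexed dereference, whether the slot is a constant or a local."""
    return slots[index]

consts = (1, "spam")
frame_a = make_frame_slots(consts, nlocals=2)
frame_b = make_frame_slots(consts, nlocals=2)   # e.g. a recursive call

frame_a[len(consts) + 0] = 41                   # local variable x in frame_a

const_value = load_fast(frame_a, 0)             # reads the constant 1
local_value = load_fast(frame_a, 2)             # reads the local x
same_object = frame_a[1] is frame_b[1]          # both frames refer to one "spam"
```

In this toy model only the table of references is duplicated per frame;
the constant objects themselves are still reached from the same place.
Whether that per-frame copy is cheap enough in a real VM is exactly the
open question.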
> Then fusing e.g. LOAD_FAST LOAD_FAST BINARY_ADD into ADD_FAST_FAST would cover
> many more cases than the optimization you are writing about, without any
> explosion in the number of opcodes.
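
As a rough illustration of the fusion proposed above, here is a toy stack
interpreter that counts opcode dispatches with and without a fused
ADD_FAST_FAST. It is only a sketch: the run function and the tuple-based
instruction encoding are assumptions made for this example, and the
constant is assumed to already live in the slot array, as suggested
earlier in the thread.

```python
# Toy interpreter: the instruction encoding is an assumption, not CPython bytecode.

def run(code, slots):
    """Interpret (opname, *args) tuples; return (result, dispatch_count)."""
    stack = []
    dispatches = 0
    for op, *args in code:
        dispatches += 1
        if op == "LOAD_FAST":
            stack.append(slots[args[0]])
        elif op == "BINARY_ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "ADD_FAST_FAST":
            # Hypothetical fused opcode: read both operands straight from
            # the slot array, skipping two pushes and two pops.
            stack.append(slots[args[0]] + slots[args[1]])
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack.pop(), dispatches

slots = [1, 41]   # slot 0 holds the constant 1, slot 1 holds the local x

plain = [("LOAD_FAST", 1), ("LOAD_FAST", 0), ("BINARY_ADD",)]   # x + 1
fused = [("ADD_FAST_FAST", 1, 0)]                               # x + 1, fused

plain_result, plain_dispatches = run(plain, slots)
fused_result, fused_dispatches = run(fused, slots)
```

Both sequences compute the same sum; the fused form goes through the
dispatch loop once instead of three times. Swapping the two slot indices
gives "1 + x" with the same single opcode, which is the generality the
quoted proposal is after.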
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/