[Python-Dev] JUMP_IF_X opcodes
Neal Norwitz
neal@metaslash.com
Mon, 03 Mar 2003 21:56:26 -0500
On Mon, Mar 03, 2003 at 06:55:47PM -0500, Damien Morton wrote:
> I have been reviewing the compile.c module with respect to the use of
> JUMP_IF_XXX opcodes, and the frequency with which these opcodes are followed
> by a POP_TOP instruction.
>
> It seems to me that there are two kinds of uses cases for these opcodes,
>
> The first use case could be expressed as POP_THEN_JUMP_IF_XXXX
> The second use case could be expressed as JUMP_IF_XXX_ELSE_POP
>
> Comments, suggestions, etc, appreciated.
I think you won't get much of a benefit by adding the 2+ instructions
necessary for this scheme. I think it would be best to have
JUMP_IF_XXX always do a POP_TOP and never jump to a jump. Below is an
example of some code and the disassembly.
>>> def f(a, b):
... if a and b:
... print 'nope'
...
>>> dis.dis(f)
2 0 LOAD_FAST 0 (a)
3 JUMP_IF_FALSE 4 (to 10)
6 POP_TOP
7 LOAD_FAST 1 (b)
>> 10 JUMP_IF_FALSE 9 (to 22)
13 POP_TOP
3 14 LOAD_CONST 1 ('no')
17 PRINT_ITEM
18 PRINT_NEWLINE
19 JUMP_FORWARD 1 (to 23)
>> 22 POP_TOP
>> 23 LOAD_CONST 0 (None)
26 RETURN_VALUE
Note the first JUMP_IF_FALSE jumps to the second JUMP_IF_FALSE which
then jumps to POP_TOP. An optimized version of this code where the
POP is performed as part of the JUMP_IF_XXX could be:
>>> dis.dis(f)
2 0 LOAD_FAST 0 (a)
3 JUMP_IF_FALSE 11 (to 17)
6 LOAD_FAST 1 (b)
>> 9 JUMP_IF_FALSE 5 (to 17)
3 12 LOAD_CONST 1 ('no')
15 PRINT_ITEM
16 PRINT_NEWLINE
>> 17 LOAD_CONST 0 (None)
20 RETURN_VALUE
In the optimized version, there are at least 2 less iterations around
the eval_frame loop (when a is false). 1 POP_TOP, 1 JUMP_IF_FALSE.
If both a and b are true, the if body is executed and there are 3
iterations less. 2 POP_TOPs, 1 JUMP_FORWARD. With more conditions,
the savings should be better.
The problem is that it's difficult to get the compiler to output this
code AFAIK. I believe Skip's peephole optimizer did the
transformation to prevent a jump to a jump, but that was another pass.
The new compiler Jeremy is working on should make these sorts of
transformations easier.
All that said, the scheme you propose could provide a decent speed up.
The only way to know is to try. :-)
Neal