
I have been reviewing the compile.c module with respect to the use of JUMP_IF_XXX opcodes, and the frequency with which these opcodes are followed by a POP_TOP instruction. It seems to me that there are two kinds of uses cases for these opcodes, The first use case could be expressed as POP_THEN_JUMP_IF_XXXX The second use case could be expressed as JUMP_IF_XXX_ELSE_POP Listed below are the use cases for these instructions, and the functions in compile.c that they apear in. The form is JUMP_IF_XXX(top-of-stack-if-no-jump, top-of-stack-if-jump) com_assert_stmt - JUMP_IF_TRUE(-,-) com_if_stmt - JUMP_IF_FALSE(-,-) com_while_stmt - JUMP_IF_FALSE(-,-) com_try_except - JUMP_IF_FALSE(-,-) com_list_if - JUMP_IF_FALSE(-, -) com_comparison - JUMP_IF_FALSE(-, 0) com_and_test - JUMP_IF_FALSE(-, 0) com_test - JUMP_IF_TRUE(-, 1) Below is a minimally intrusive implementation of the expansion of JUMP_IF_FALSE into two opcodes for handling the two use cases. case JUMP_IF_FALSE_ELSE_POP: case POP_THEN_JUMP_IF_FALSE: NEXTARG(oparg); err = PyObject_IsTrue(TOP()); if (err > 0) { err = 0; POP(); } else if (err == 0) { if (opcode == POP_THEN_JUMP_IF_FALSE) POP(); JUMPBY(oparg); } else break; continue; Comments, suggestions, etc, appreciated.

On Mon, Mar 03, 2003 at 06:55:47PM -0500, Damien Morton wrote:
I think you won't get much of a benefit by adding the 2+ instructions necessary for this scheme. I think it would be best to have JUMP_IF_XXX always do a POP_TOP and never jump to a jump. Below is an example of some code and the disassembly. >>> def f(a, b): ... if a and b: ... print 'nope' ... >>> dis.dis(f) 2 0 LOAD_FAST 0 (a) 3 JUMP_IF_FALSE 4 (to 10) 6 POP_TOP 7 LOAD_FAST 1 (b) >> 10 JUMP_IF_FALSE 9 (to 22) 13 POP_TOP 3 14 LOAD_CONST 1 ('no') 17 PRINT_ITEM 18 PRINT_NEWLINE 19 JUMP_FORWARD 1 (to 23) >> 22 POP_TOP >> 23 LOAD_CONST 0 (None) 26 RETURN_VALUE Note the first JUMP_IF_FALSE jumps to the second JUMP_IF_FALSE which then jumps to POP_TOP. An optimized version of this code where the POP is performed as part of the JUMP_IF_XXX could be: >>> dis.dis(f) 2 0 LOAD_FAST 0 (a) 3 JUMP_IF_FALSE 11 (to 17) 6 LOAD_FAST 1 (b) >> 9 JUMP_IF_FALSE 5 (to 17) 3 12 LOAD_CONST 1 ('no') 15 PRINT_ITEM 16 PRINT_NEWLINE >> 17 LOAD_CONST 0 (None) 20 RETURN_VALUE In the optimized version, there are at least 2 less iterations around the eval_frame loop (when a is false). 1 POP_TOP, 1 JUMP_IF_FALSE. If both a and b are true, the if body is executed and there are 3 iterations less. 2 POP_TOPs, 1 JUMP_FORWARD. With more conditions, the savings should be better. The problem is that it's difficult to get the compiler to output this code AFAIK. I believe Skip's peephole optimizer did the transformation to prevent a jump to a jump, but that was another pass. The new compiler Jeremy is working on should make these sorts of transformations easier. All that said, the scheme you propose could provide a decent speed up. The only way to know is to try. :-) Neal

On Mon, Mar 03, 2003 at 06:55:47PM -0500, Damien Morton wrote:
I think you won't get much of a benefit by adding the 2+ instructions necessary for this scheme. I think it would be best to have JUMP_IF_XXX always do a POP_TOP and never jump to a jump. Below is an example of some code and the disassembly. >>> def f(a, b): ... if a and b: ... print 'nope' ... >>> dis.dis(f) 2 0 LOAD_FAST 0 (a) 3 JUMP_IF_FALSE 4 (to 10) 6 POP_TOP 7 LOAD_FAST 1 (b) >> 10 JUMP_IF_FALSE 9 (to 22) 13 POP_TOP 3 14 LOAD_CONST 1 ('no') 17 PRINT_ITEM 18 PRINT_NEWLINE 19 JUMP_FORWARD 1 (to 23) >> 22 POP_TOP >> 23 LOAD_CONST 0 (None) 26 RETURN_VALUE Note the first JUMP_IF_FALSE jumps to the second JUMP_IF_FALSE which then jumps to POP_TOP. An optimized version of this code where the POP is performed as part of the JUMP_IF_XXX could be: >>> dis.dis(f) 2 0 LOAD_FAST 0 (a) 3 JUMP_IF_FALSE 11 (to 17) 6 LOAD_FAST 1 (b) >> 9 JUMP_IF_FALSE 5 (to 17) 3 12 LOAD_CONST 1 ('no') 15 PRINT_ITEM 16 PRINT_NEWLINE >> 17 LOAD_CONST 0 (None) 20 RETURN_VALUE In the optimized version, there are at least 2 less iterations around the eval_frame loop (when a is false). 1 POP_TOP, 1 JUMP_IF_FALSE. If both a and b are true, the if body is executed and there are 3 iterations less. 2 POP_TOPs, 1 JUMP_FORWARD. With more conditions, the savings should be better. The problem is that it's difficult to get the compiler to output this code AFAIK. I believe Skip's peephole optimizer did the transformation to prevent a jump to a jump, but that was another pass. The new compiler Jeremy is working on should make these sorts of transformations easier. All that said, the scheme you propose could provide a decent speed up. The only way to know is to try. :-) Neal
participants (2)
-
Damien Morton
-
Neal Norwitz