[New-bugs-announce] [issue44505] loop invariant code motion

zcpara report at bugs.python.org
Thu Jun 24 11:36:41 EDT 2021


New submission from zcpara <zhangchaospecial at gmail.com>:

It is a common scenario to call functions in loop, such as
def foo():
  for i in range(10):
    len("abc")
Currently, we generate bytecode like this;
2           0 SETUP_LOOP              24 (to 26)
              2 LOAD_GLOBAL              0 (range)
              4 LOAD_CONST               1 (10)
              6 CALL_FUNCTION            1
              8 GET_ITER
        >>   10 FOR_ITER                12 (to 24)
             12 STORE_FAST               0 (i)

  3          14 LOAD_GLOBAL              1 (len)
             16 LOAD_CONST               2 ('abc')
             18 CALL_FUNCTION            1
             20 POP_TOP
             22 JUMP_ABSOLUTE           10
        >>   24 POP_BLOCK
        >>   26 LOAD_CONST               0 (None)
             28 RETURN_VALUE
If we can make sure len is a loop invariant, we can generate code as follows. Only one LOAD_GLOBAL len is executed before the loop. LOAD_GLOBAL len is replaced with LOAD_FAST __local__len in the loop.
2           0 LOAD_GLOBAL              0 (range)
              2 LOAD_CONST               1 (10)
              4 CALL_FUNCTION            1
              6 GET_ITER
              8 LOAD_GLOBAL              1 (len)
             10 STORE_FAST               1 (__local__len)
        >>   12 FOR_ITER                 6 (to 26)
             14 STORE_FAST               0 (i)

  3          16 LOAD_FAST                1 (__local__len)
             18 LOAD_CONST               2 ('abc')
             20 CALL_FUNCTION            1
             22 POP_TOP
             24 JUMP_ABSOLUTE            6 (to 12)

  2     >>   26 LOAD_CONST               0 (None)
             28 RETURN_VALU

I prototyped this optimization and gain 4.6% performance improvement on regex_v8 of PyPerformance. Currently, only the global invariants are moved out of the loop. There are some other loop invariants can be moved. Other optimizations such as function inline can provide more opportunities for LICM.
I meet a problem with the following case;
1 def no_pop_blocks():
2    y = 1
3    while not y:
4        bla
5    x = 1

bla is not defined anywhere. With LICM, LOAD_GLOBAL bla will be move out of the loop. Then it executes with an error: bla is not defined.

----------
components: Interpreter Core
messages: 396496
nosy: zhangchaospecial
priority: normal
severity: normal
status: open
title: loop invariant code motion
type: performance
versions: Python 3.11

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue44505>
_______________________________________


More information about the New-bugs-announce mailing list