[Python-ideas] Disable all peephole optimizations

Wed May 21 13:05:49 CEST 2014

** The problem

A long-standing problem with CPython is that the peephole optimizer 
cannot be completely disabled.  Normally, peephole optimization is a 
good thing because it improves execution speed.  But in some situations, like 
coverage testing, it's more important to be able to reason about the 
code's execution.  I propose that we add a way to completely disable the 
optimizer.

To demonstrate the problem, here is continue.py:

    a = b = c = 0
    for n in range(100):
        if n % 2:
            if n % 4:
                a += 1
            continue
        else:
            b += 1
        c += 1
    assert a == 50 and b == 50 and c == 50

If you execute "python3.4 -m trace -c -m continue.py", it produces this 
continue.cover file:

         1: a = b = c = 0
       101: for n in range(100):
       100:     if n % 2:
        50:         if n % 4:
        50:             a += 1
     >>>>>>         continue
                else:
        50:         b += 1
        50:     c += 1
         1: assert a == 50 and b == 50 and c == 50

This indicates that the continue line is not executed.  It's true: the 
byte code for that statement is not executed, because the peephole 
optimizer has turned a jump to that jump into a direct jump to its 
target.  But in reasoning about the 
code, the continue statement is clearly part of the semantics of this 
program.  If you remove the statement, the program will run 
differently.  If you had to explain this code to a learner, you would of 
course describe the continue statement as part of the execution.  So the 
trace output does not match our (correct) understanding of the program.
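
To see where counts like these come from, here is a minimal sketch of a 
line counter built on sys.settrace, the hook the trace module is built 
on.  The count_lines helper and its report format are my own 
illustration, not code taken from trace.py, and the count it reports for 
the continue line depends on the interpreter version and its optimizer.

```python
import sys
from collections import Counter

# The continue.py example, embedded so the sketch is self-contained.
SOURCE = """\
a = b = c = 0
for n in range(100):
    if n % 2:
        if n % 4:
            a += 1
        continue
    else:
        b += 1
    c += 1
assert a == 50 and b == 50 and c == 50
"""


def count_lines(source, filename="continue.py"):
    """Run source under a trace function and count 'line' events per line.

    Hypothetical helper for illustration only.
    """
    counts = Counter()

    def tracer(frame, event, arg):
        # Ignore frames that do not belong to the code under test.
        if frame.f_code.co_filename != filename:
            return None
        if event == "line":
            counts[frame.f_lineno] += 1
        return tracer

    code = compile(source, filename, "exec")
    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        exec(code, {})
    finally:
        # Always restore whatever trace function was active before.
        sys.settrace(previous)
    return counts


counts = count_lines(SOURCE)
for lineno, text in enumerate(SOURCE.splitlines(), start=1):
    # Lines with no recorded event get the same marker trace uses.
    print("%6s: %s" % (counts.get(lineno, ">>>>>>"), text))
```

A tracer like this only sees the lines whose bytecode actually runs, so 
any statement the optimizer jumps past gets no count at all.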

The reason we are running trace (or coverage.py) in the first place is 
to learn something about our code, but it is misleading us. The peephole 
optimizer is interfering with our ability to reason about the code.  We 
need a way to disable the optimizer so that this won't happen.  This 
type of control is well-known in C compilers, for the same reasons: when 
running code, optimization is good for speed; when reasoning about code, 
optimization gets in the way.

More details are in http://bugs.python.org/issue2506, which also 
includes previous discussion of the idea.

This has come up on Python-Dev, and Guido seemed supportive: 
https://mail.python.org/pipermail/python-dev/2012-December/123099.html .

** Implementation

Although it may seem like a big change to be able to disable the 
optimizer, the heart of it is quite simple.  The only call to 
PyCode_Optimize is in compile.c.  That function takes a string of 
bytecode and 
returns another.  If we skip that call, the peephole optimizer is disabled.
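
As a toy illustration of both points (what removing a jump to a jump 
does, and how skipping one call disables it), here is a sketch in 
Python.  It is not CPython's implementation: the (opcode, argument) 
tuples, retarget_jumps, assemble, and the peephole flag are all 
hypothetical stand-ins for the real bytecode, PyCode_Optimize, and its 
call site in compile.c.

```python
def retarget_jumps(instructions):
    """Toy peephole pass: point every jump at its final destination."""
    optimized = []
    for op, arg in instructions:
        if op == "JUMP":
            target = arg
            seen = set()
            # Follow chains of jumps; 'seen' guards against jump cycles.
            while instructions[target][0] == "JUMP" and target not in seen:
                seen.add(target)
                target = instructions[target][1]
            optimized.append((op, target))
        else:
            optimized.append((op, arg))
    return optimized


def assemble(instructions, peephole=True):
    """Stand-in for the compiler: the optimizer is one skippable call."""
    if peephole:
        return retarget_jumps(instructions)
    return list(instructions)


def executed_indexes(instructions):
    """Run the toy program and return the indexes that were executed."""
    pc = 0
    executed = []
    while pc < len(instructions):
        executed.append(pc)
        op, arg = instructions[pc]
        if op == "JUMP":
            pc = arg
        elif op == "HALT":
            break
        else:
            pc += 1
    return executed


PROGRAM = [
    ("JUMP", 2),    # 0: a branch that lands on the "continue"
    ("NOP", None),  # 1
    ("JUMP", 4),    # 2: the "continue" itself, a jump to the loop top
    ("NOP", None),  # 3
    ("HALT", None), # 4: stands in for the loop top
]

print(executed_indexes(assemble(PROGRAM, peephole=False)))
print(executed_indexes(assemble(PROGRAM, peephole=True)))
```

With peephole=False the instruction at index 2 (the stand-in for 
continue) runs; with the pass enabled it is skipped, which is the effect 
a line tracer sees.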

** User Interface

Unfortunately, the -O command-line switch does not lend itself to a new 
value that means, "less optimization than the default."  I propose a new 
switch -P, to control the peephole optimizer, with a value of -P0 
meaning no optimization at all.  The PYTHONPEEPHOLE environment variable 
would also control the option.

There are about a dozen places internal to CPython where optimization 
level is indicated with an integer, for example, in 
Py_CompileStringObject.  Those uses also don't allow for new values 
indicating less optimization than the default: 0 and -1 already have 
meanings, unless we want to start using -2 for less than the default.  
I'm not sure whether we need to provide for those values, or whether the 
PYTHONPEEPHOLE environment variable provides enough control.

** Ramifications

This switch makes no changes to the semantics of Python programs, 
although clearly, if you are tracing a program, the exact sequence of 
lines and bytecodes will be different (this is the whole point).

In the ticket, one objection raised is that providing this option will 
complicate testing, and that optimization is a difficult enough thing to 
get right as it is.  I disagree: I think providing this option will help 
test the optimizer, because it will give us a way to test that code runs 
the same with and without the optimizer. This gives us a tool to use to 
demonstrate that the optimizer isn't changing the behavior of programs.
