Disable all peephole optimizations

** The problem

A long-standing problem with CPython is that the peephole optimizer cannot be completely disabled. Normally, peephole optimization is a good thing: it improves execution speed. But in some situations, like coverage testing, it's more important to be able to reason about the code's execution. I propose that we add a way to completely disable the optimizer.

To demonstrate the problem, here is continue.py:

    a = b = c = 0
    for n in range(100):
        if n % 2:
            if n % 4:
                a += 1
            continue
        else:
            b += 1
        c += 1
    assert a == 50 and b == 50 and c == 50

If you execute "python3.4 -m trace -c -m continue.py", it produces this continue.cover file:

        1: a = b = c = 0
      101: for n in range(100):
      100:     if n % 2:
       50:         if n % 4:
       50:             a += 1
    >>>>>>         continue
                else:
       50:         b += 1
       50:     c += 1
        1: assert a == 50 and b == 50 and c == 50

This indicates that the continue line is not executed. It's true: the byte code for that statement is not executed, because the peephole optimizer has removed the jump to the jump. But in reasoning about the code, the continue statement is clearly part of the semantics of this program. If you remove the statement, the program will run differently. If you had to explain this code to a learner, you would of course describe the continue statement as part of the execution. So the trace output does not match our (correct) understanding of the program.

The reason we are running trace (or coverage.py) in the first place is to learn something about our code, but it is misleading us. The peephole optimizer is interfering with our ability to reason about the code. We need a way to disable the optimizer so that this won't happen. This type of control is well-known in C compilers, for the same reasons: when running code, optimization is good for speed; when reasoning about code, optimization gets in the way.

More details are in http://bugs.python.org/issue2506, which also includes previous discussion of the idea. This has come up on Python-Dev, and Guido seemed supportive: https://mail.python.org/pipermail/python-dev/2012-December/123099.html

** Implementation

Although it may seem like a big change to be able to disable the optimizer, the heart of it is quite simple. In compile.c is the only call to PyCode_Optimize. That function takes a string of bytecode and returns another. If we skip that call, the peephole optimizer is disabled.

** User Interface

Unfortunately, the -O command-line switch does not lend itself to a new value that means "less optimization than the default." I propose a new switch -P, to control the peephole optimizer, with a value of -P0 meaning no optimization at all. The PYTHONPEEPHOLE environment variable would also control the option.

There are about a dozen places internal to CPython where optimization level is indicated with an integer, for example, in Py_CompileStringObject. Those uses also don't allow for new values indicating less optimization than the default: 0 and -1 already have meanings, unless we want to start using -2 for less than the default. I'm not sure we need to provide for those values, or if the PYTHONPEEPHOLE environment variable provides enough control.

** Ramifications

This switch makes no changes to the semantics of Python programs, although clearly, if you are tracing a program, the exact sequence of lines and bytecodes will be different (this is the whole point).

In the ticket, one objection raised is that providing this option will complicate testing, and that optimization is a difficult enough thing to get right as it is.
I disagree: I think providing this option will help test the optimizer, because it will give us a way to test that code runs the same with and without the optimizer. This gives us a tool to demonstrate that the optimizer isn't changing the behavior of programs.
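For instance, assuming the proposed PYTHONPEEPHOLE variable existed (it doesn't today -- that's the point of the proposal), such a test could be a sketch as simple as:

    # Sketch only: PYTHONPEEPHOLE=0 is the *proposed* off-switch, not a real
    # setting today. Run a script with and without the peephole optimizer
    # and check that the observable behavior is identical.
    import os
    import subprocess
    import sys

    def run(script, peephole):
        env = dict(os.environ)
        if not peephole:
            env['PYTHONPEEPHOLE'] = '0'   # hypothetical off-switch
        return subprocess.check_output([sys.executable, script], env=env)

    assert run('continue.py', peephole=True) == run('continue.py', peephole=False)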

On 21 May 2014 21:06, "Ned Batchelder" <ned@nedbatchelder.com> wrote:
** Implementation
Although it may seem like a big change to be able to disable the optimizer, the heart of it is quite simple. In compile.c is the only call to PyCode_Optimize. That function takes a string of bytecode and returns another. If we skip that call, the peephole optimizer is disabled.
** User Interface
Unfortunately, the -O command-line switch does not lend itself to a new value that means, "less optimization than the default." I propose a new switch -P, to control the peephole optimizer, with a value of -P0 meaning no optimization at all. The PYTHONPEEPHOLE environment variable would also control the option.
Since this is a CPython specific thing, a -X named command line option would be more appropriate.
There are about a dozen places internal to CPython where optimization level is indicated with an integer, for example, in Py_CompileStringObject. Those uses also don't allow for new values indicating less optimization than the default: 0 and -1 already have meanings, unless we want to start using -2 for less than the default. I'm not sure we need to provide for those values, or if the PYTHONPEEPHOLE environment variable provides enough control.

I assume you want the environment variable so the setting can be inherited by subprocesses?

Cheers, Nick.
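(For reference, those integer levels are already visible from pure Python through compile()'s optimize argument -- a quick illustration:)

    # The existing levels: -1 follows the interpreter's -O setting, 0 keeps
    # asserts, 1 is like -O, 2 is like -OO. There is no value that means
    # "less optimization than 0" -- that's the gap under discussion.
    src = "assert False"
    for level in (0, 1, 2):
        code = compile(src, '<demo>', 'exec', optimize=level)
        print(level, 'assert kept?', 'AssertionError' in code.co_names)
    # prints: 0 True / 1 False / 2 False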

On 5/21/14 7:41 AM, Nick Coghlan wrote:
On 21 May 2014 21:06, "Ned Batchelder" <ned@nedbatchelder.com> wrote:
** Implementation
Although it may seem like a big change to be able to disable the optimizer, the heart of it is quite simple. In compile.c is the only call to PyCode_Optimize. That function takes a string of bytecode and returns another. If we skip that call, the peephole optimizer is disabled.
** User Interface
Unfortunately, the -O command-line switch does not lend itself to a new value that means, "less optimization than the default." I propose a new switch -P, to control the peephole optimizer, with a value of -P0 meaning no optimization at all. The PYTHONPEEPHOLE environment variable would also control the option.
Since this is a CPython specific thing, a -X named command line option would be more appropriate.
I had overlooked the introduction of -X. Yes, that seems like the right way: -Xpeephole=0
There are about a dozen places internal to CPython where optimization level is indicated with an integer, for example, in Py_CompileStringObject. Those uses also don't allow for new values indicating less optimization than the default: 0 and -1 already have meanings, unless we want to start using -2 for less than the default. I'm not sure we need to provide for those values, or if the PYTHONPEEPHOLE environment variable provides enough control.
I assume you want the environment variable so the setting can be inherited by subprocesses?
It allows it to be inherited by subprocesses, yes. I was hoping it would mean the setting would be available deeper in the interpreter, but now that I think about it, environment variables are interpreted at the top of the interpreter, and then the settings passed along internally. I'll do a survey to figure out where the setting has to be plumbed through the layers to get to compile.c properly. --Ned.

On Wed, May 21, 2014 at 07:05:49AM -0400, Ned Batchelder wrote:
** The problem
A long-standing problem with CPython is that the peephole optimizer cannot be completely disabled. Normally, peephole optimization is a good thing: it improves execution speed. But in some situations, like coverage testing, it's more important to be able to reason about the code's execution. I propose that we add a way to completely disable the optimizer.
I'm not sure whether this is an argument for or against your proposal, but the continue statement shown below is *not* dead code and should not be optimized out. The assert fails if you remove the continue statement. I don't have 3.4 on this machine to test with, but using 3.3, I can see no evidence that `continue` is optimized away. Later in your post, you say:
It's true: the byte code for that statement [the continue] is not executed, because the peephole optimizer has removed the jump to the jump.
But that cannot be true, because if it were, the assertion would fail. Here's your code again:
To demonstrate the problem, here is continue.py:
    a = b = c = 0
    for n in range(100):
        if n % 2:
            if n % 4:
                a += 1
            continue
        else:
            b += 1
        c += 1
    assert a == 50 and b == 50 and c == 50
If the continue were not executed, c would equal 100 and the assertion would fail. Have I misunderstood something?

(By the way, as given, your indents are inconsistent: some are 4 spaces and some are 5.)

-- Steven

On 21.05.2014 14:13, Steven D'Aprano wrote:
On Wed, May 21, 2014 at 07:05:49AM -0400, Ned Batchelder wrote:
** The problem
A long-standing problem with CPython is that the peephole optimizer cannot be completely disabled. Normally, peephole optimization is a good thing: it improves execution speed. But in some situations, like coverage testing, it's more important to be able to reason about the code's execution. I propose that we add a way to completely disable the optimizer.
I'm not sure whether this is an argument for or against your proposal, but the continue statement shown below is *not* dead code and should not be optimized out. The assert fails if you remove the continue statement.
I don't have 3.4 on this machine to test with, but using 3.3, I can see no evidence that `continue` is optimized away.
The logical continue is still there -- what happens is that the optimizer rewrites the `else` jump at the preceding `if` condition, which would normally point at the `continue` statement, to the beginning of the loop, because it would be a jump (to the continue) to a jump (to the for loop header). Thus, the actual continue statement is not reached, but logically the code does the same, because the only way continue would have been reached was transformed to a continue itself.
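One way to see the rewritten jump for yourself -- a minimal sketch, assuming the continue.py from the original post is on disk:

    # Disassemble continue.py and look at the inner `if`'s POP_JUMP_IF_FALSE:
    # its target is the FOR_ITER at the top of the loop, not the bytecode
    # compiled for the `continue` line.
    import dis

    with open('continue.py') as f:
        dis.dis(compile(f.read(), 'continue.py', 'exec'))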
Later in your post, you say:
It's true: the byte code for that statement [the continue] is not executed, because the peephole optimizer has removed the jump to the jump.
But that cannot be true, because if it were, the assertion would fail. Here's your code again:
To demonstrate the problem, here is continue.py:
    a = b = c = 0
    for n in range(100):
        if n % 2:
            if n % 4:
                a += 1
            continue
        else:
            b += 1
        c += 1
    assert a == 50 and b == 50 and c == 50
If the continue were not executed, c would equal 100 and the assertion would fail. Have I misunderstood something?
(By the way, as given, your indents are inconsistent: some are 4 spaces and some are 5.)

On 5/21/14 8:21 AM, Jonas Wielicki wrote:
On 21.05.2014 14:13, Steven D'Aprano wrote:
On Wed, May 21, 2014 at 07:05:49AM -0400, Ned Batchelder wrote:
** The problem
A long-standing problem with CPython is that the peephole optimizer cannot be completely disabled. Normally, peephole optimization is a good thing: it improves execution speed. But in some situations, like coverage testing, it's more important to be able to reason about the code's execution. I propose that we add a way to completely disable the optimizer.
I'm not sure whether this is an argument for or against your proposal, but the continue statement shown below is *not* dead code and should not be optimized out. The assert fails if you remove the continue statement.
I don't have 3.4 on this machine to test with, but using 3.3, I can see no evidence that `continue` is optimized away.
The logical continue is still there -- what happens is that the optimizer rewrites the `else` jump at the preceding `if` condition, which would normally point at the `continue` statement, to the beginning of the loop, because it would be a jump (to the continue) to a jump (to the for loop header).
Thus, the actual continue statement is not reached, but logically the code does the same, because the only way continue would have been reached was transformed to a continue itself.
To make the details more explicit, here is the source again, and the disassembled code, with the original source interspersed:

    a = b = c = 0
    for n in range(100):
        if n % 2:
            if n % 4:
                a += 1
            continue
        else:
            b += 1
        c += 1
    assert a == 50 and b == 50 and c == 50

Disassembled (Python 3.4, but the same effect is visible in 2.7, 3.3, etc):

a = b = c = 0
  1           0 LOAD_CONST               0 (0)
              3 DUP_TOP
              4 STORE_NAME               0 (a)
              7 DUP_TOP
              8 STORE_NAME               1 (b)
             11 STORE_NAME               2 (c)

for n in range(100):
  2          14 SETUP_LOOP              79 (to 96)
             17 LOAD_NAME                3 (range)
             20 LOAD_CONST               1 (100)
             23 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             26 GET_ITER
        >>   27 FOR_ITER                65 (to 95)
             30 STORE_NAME               4 (n)

    if n % 2:
  3          33 LOAD_NAME                4 (n)
             36 LOAD_CONST               2 (2)
             39 BINARY_MODULO
             40 POP_JUMP_IF_FALSE       72

        if n % 4:
  4          43 LOAD_NAME                4 (n)
             46 LOAD_CONST               3 (4)
             49 BINARY_MODULO
             50 POP_JUMP_IF_FALSE       27

            a += 1
  5          53 LOAD_NAME                0 (a)
             56 LOAD_CONST               4 (1)
             59 INPLACE_ADD
             60 STORE_NAME               0 (a)
             63 JUMP_ABSOLUTE           27

        continue
  6          66 JUMP_ABSOLUTE           27
             69 JUMP_FORWARD            10 (to 82)

        b += 1
  8     >>   72 LOAD_NAME                1 (b)
             75 LOAD_CONST               4 (1)
             78 INPLACE_ADD
             79 STORE_NAME               1 (b)

    c += 1
  9     >>   82 LOAD_NAME                2 (c)
             85 LOAD_CONST               4 (1)
             88 INPLACE_ADD
             89 STORE_NAME               2 (c)
             92 JUMP_ABSOLUTE           27
        >>   95 POP_BLOCK

assert a == 50 and b == 50 and c == 50
 10     >>   96 LOAD_NAME                0 (a)
             99 LOAD_CONST               5 (50)
            102 COMPARE_OP               2 (==)
            105 POP_JUMP_IF_FALSE      132
            108 LOAD_NAME                1 (b)
            111 LOAD_CONST               5 (50)
            114 COMPARE_OP               2 (==)
            117 POP_JUMP_IF_FALSE      132
            120 LOAD_NAME                2 (c)
            123 LOAD_CONST               5 (50)
            126 COMPARE_OP               2 (==)
            129 POP_JUMP_IF_TRUE       138
        >>  132 LOAD_GLOBAL              5 (AssertionError)
            135 RAISE_VARARGS            1
        >>  138 LOAD_CONST               6 (None)
            141 RETURN_VALUE

Notice that line 6 (the continue) is unreachable, because the else-jump from line 4 has been turned into a jump to bytecode offset 27 (the for loop), and the end of line 5 has also been turned into a jump to 27, rather than letting it flow to line 6. So line 6 still exists in the bytecode, but is never executed, leading tracing tools to indicate that line 6 is never executed.

--Ned.

On 22 May 2014 00:24, "Ned Batchelder" <ned@nedbatchelder.com> wrote:
On 5/21/14 8:21 AM, Jonas Wielicki wrote:
On 21.05.2014 14:13, Steven D'Aprano wrote:
On Wed, May 21, 2014 at 07:05:49AM -0400, Ned Batchelder wrote:
** The problem
A long-standing problem with CPython is that the peephole optimizer cannot be completely disabled. Normally, peephole optimization is a good thing: it improves execution speed. But in some situations, like coverage testing, it's more important to be able to reason about the code's execution. I propose that we add a way to completely disable the optimizer.
I'm not sure whether this is an argument for or against your proposal, but the continue statement shown below is *not* dead code and should not be optimized out. The assert fails if you remove the continue statement.
I don't have 3.4 on this machine to test with, but using 3.3, I can see no evidence that `continue` is optimized away.
The logical continue is still there -- what happens is that the optimizer rewrites the `else` jump at the preceding `if` condition, which would normally point at the `continue` statement, to the beginning of the loop, because it would be a jump (to the continue) to a jump (to the for loop header).
Thus, the actual continue statement is not reached, but logically the code does the same, because the only way continue would have been reached was transformed to a continue itself.
To make the details more explicit, here is the source again, and the disassembled code, with the original source interspersed:
[source and disassembly snipped; see the previous message]
Notice that line 6 (the continue) is unreachable, because the else-jump from line 4 has been turned into a jump to bytecode offset 27 (the for loop), and the end of line 5 has also been turned into a jump to 27, rather than letting it flow to line 6. So line 6 still exists in the bytecode, but is never executed, leading tracing tools to indicate that line 6 is never executed.

So isn't this just a bug in the dead code elimination? Fixing that (so there's no bytecode behind that line and coverage tools can know it has been optimised out) sounds better than adding an obscure config option. Potentially less risky would be to provide a utility in the dis module to flag such lines after the fact.

Cheers, Nick.
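(Such a utility might look something like this rough sketch -- not an existing dis API; it only follows ordinary jump opcodes and glosses over loop/exception setup:)

    # Follow jumps from the top of a code object and report source lines
    # none of whose bytecode is reachable.
    import dis

    NO_FALL_THROUGH = {'JUMP_ABSOLUTE', 'JUMP_FORWARD', 'RETURN_VALUE',
                       'RAISE_VARARGS', 'BREAK_LOOP', 'CONTINUE_LOOP'}

    def optimized_away_lines(code):
        instrs = list(dis.get_instructions(code))
        index = {ins.offset: n for n, ins in enumerate(instrs)}
        reachable = set()
        todo = [0]
        while todo:
            offset = todo.pop()
            if offset in reachable or offset not in index:
                continue
            reachable.add(offset)
            ins = instrs[index[offset]]
            if ins.opcode in dis.hasjabs or ins.opcode in dis.hasjrel:
                todo.append(ins.argval)      # jump target is reachable
            if ins.opname not in NO_FALL_THROUGH and index[offset] + 1 < len(instrs):
                todo.append(instrs[index[offset] + 1].offset)  # fall through
        all_lines, live_lines, line = set(), set(), None
        for ins in instrs:
            if ins.starts_line:
                line = ins.starts_line
            all_lines.add(line)
            if ins.offset in reachable:
                live_lines.add(line)
        return sorted(all_lines - live_lines)

    with open('continue.py') as f:
        print(optimized_away_lines(compile(f.read(), 'continue.py', 'exec')))
    # -> [6]: the continue line from the example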

On 5/21/14 8:07 PM, Nick Coghlan wrote:
Notice that line 6 (the continue) is unreachable, because the else-jump from line 4 has been turned into a jump to bytecode offset 27 (the for loop), and the end of line 5 has also been turned into a jump to 27, rather than letting it flow to line 6. So line 6 still exists in the bytecode, but is never executed, leading tracing tools to indicate that line 6 is never executed.
So isn't this just a bug in the dead code elimination? Fixing that (so there's no bytecode behind that line and coverage tools can know it has been optimised out) sounds better than adding an obscure config option.
Perhaps; I don't know how much dead code elimination was intended. Assuming we can get to the point that the statement has been completely removed, you'll still have the confusing state that a perfectly good statement is marked as not executable (because it has no corresponding bytecode). And getting to that point means adding more complexity to the bytecode optimizer.
Potentially less risky would be to provide a utility in the dis module to flag such lines after the fact.
I don't see how the dis module would know which lines these are? I'm surprised at the amount of invention and mystery code people will propose to avoid having an off-switch for the code we already have.

On 22 May 2014 12:00, "Ned Batchelder" <ned@nedbatchelder.com> wrote:
I'm surprised at the amount of invention and mystery code people will propose to avoid having an off-switch for the code we already have.

It's not the off switch per se, it's the documentation and testing consequences. Better to figure out a way to let the code generator and analysis tools collaborate more effectively than to complicate the execution model further.

Cheers, Nick.

On 5/22/14 4:25 AM, Nick Coghlan wrote:
On 22 May 2014 12:00, "Ned Batchelder" <ned@nedbatchelder.com> wrote:
I'm surprised at the amount of invention and mystery code people will propose to avoid having an off-switch for the code we already have.
It's not the off switch per se, it's the documentation and testing consequences. Better to figure out a way to let the code generator and analysis tools collaborate more effectively than to complicate the execution model further.
The problem with "letting them collaborate more effectively" is that we don't know how to do that. If we can come up with a way to do it, it will involve much more complex code than I am proposing. As far as documentation, we have three possibilities for optimization level now. This will add a fourth. I don't see that as a burden. On the testing front, if I were the developer of an optimizer, I would welcome a switch to disable it, as a way to test that optimizations don't change semantics. I understand that this is a different mode of execution. I guess we have different opinions about the tradeoff of risk and benefit of that new mode.

Guido van Rossum wrote:
FWIW, I am strictly with Ned here.
As someone who maintains/develops a debugger for Python, I’m with Ned as well (and also Raymond, since I really don’t want to have to worry about one-more-mode that Python might be running in). Why not move the existing optimisation into -O mode and put future optimisations in there too? It may just start having enough value that people switch to using it.

On 05/22/2014 08:41 AM, Steve Dower wrote:
Guido van Rossum wrote:
FWIW, I am strictly with Ned here.
As someone who maintains/develops a debugger for Python, I’m with Ned as well (and also Raymond, since I really don’t want to have to worry about one-more-mode that Python might be running in).
Why not move the existing optimisation into -O mode and put future optimisations in there too? It may just start having enough value that people switch to using it.
I will admit to being very surprised the day I realized that the normal run mode for python is debugging mode! For anyone who hasn't yet realized this, without -O, __debug__ is True, but with any -O __debug__ is False.

Given that, it does seem kind of odd to have source altering optimizations active when __debug__ is True. Of course, we can't change that mid-3.x stream. However, we could turn off optimizations by default, and then have -O remove assertions /and/ turn on optimizations. Which would still work nicely with .pyc and .pyo files as ... wait, let me make a table:

     flag   | optimizations      | saved files
    --------+--------------------+--------------
     none   | none               | none
    --------+--------------------+--------------
     -O     | asserts removed    | .pyc
            | peephole, etc.     |
    --------+--------------------+--------------
     -OO    | -O plus            |
            | docstrings removed | .pyo

That would certainly make the -O flags make more sense than they do now. It would also emphasize the fact that assert is not for user data verification. ;)

--
~Ethan~
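(For the record, the behavior Ethan describes is easy to verify; running this one-liner plain, with -O, and with -OO shows it:)

    # Run as: python3.4 check.py / python3.4 -O check.py / python3.4 -OO check.py
    # Prints "True 0", "False 1", and "False 2" respectively.
    import sys
    print(__debug__, sys.flags.optimize)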

On Thu, May 22, 2014 at 10:29:48AM -0700, Ethan Furman wrote:
However, we could turn off optimizations by default, and then have -O remove assertions /and/ turn on optimizations.
Which would still work nicely with .pyc and .pyo files as ... wait, let me make a table:
     flag   | optimizations      | saved files
    --------+--------------------+--------------
     none   | none               | none
    --------+--------------------+--------------
     -O     | asserts removed    | .pyc
            | peephole, etc.     |
    --------+--------------------+--------------
     -OO    | -O plus            |
            | docstrings removed | .pyo
I think we still want to cache byte code in .pyc files by default. Technically, yes, it's an optimization, but it's not the sort of optimization that makes a difference to debugging[1]. As I understand it, generating the parse tree is *extremely* expensive. Run python -v to see just how many modules would have to be parsed and compiled every single time without the cached .pyc files.
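(A quick way to count them: python3.4 -v -c "pass" 2>&1 | grep -c "^import " counts the modules imported at interpreter startup.)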
That would certainly make the -O flags make more sense than they do now. It would also emphasize the fact that assert is not for user data verification. ;)
:-)

[1] Except perhaps under very rare and unusual circumstances, but there are already mechanisms in place to disable the generation of .pyc files.

-- Steven

On Thu, May 22, 2014 at 03:41:31PM +0000, Steve Dower wrote:
Why not move the existing optimisation into -O mode and put future optimisations in there too? It may just start having enough value that people switch to using it.
I just had the same idea, you beat me to it.

There's a steady but small stream of people asking "why do we have -O, it does so little we might as well get rid of it". If I remember correctly (and apologies if I do not), Guido has even suggested getting rid of simple constant folding. So let's make -O more attractive, while simplifying the default behaviour:

* By default, no optimizations operate at all.

* With -O, you get assert disabling, the tricky string concatenation optimization, constant folding, and whatever else the peepholer does.

* The double -OO switch should be deprecated, for eventual removal in the very distant future. (4.0? 5.0?)

* Instead, a separate switch for removing docstrings can be added, to support implementations in low-memory devices or other constrained situations.

This will make Python's compilation model a little more familiar to people coming from other languages. It will make -O more attractive, instead of being viewed by some as a waste of effort, and ensure that by default there are no tricks played with byte-code. A big advantage: we already have separate .pyo and .pyc files, so no risk of confusion.

Downsides of this suggestion:

- To the extent that constant folding and other optimizations actually lead to a speed-up, turning them off by default will be a performance regression.

- Experienced programmers ought to know not to rely on the string concatenation optimization, as it is non-portable and prone to surprising failures even in CPython. The optimization really only exists for naive programmers, but they are unlikely to know about, or bother using, -O to get that optimization.

- Surely I don't expect PyPy to perform no optimizations at all unless the -O switch is given? I'd have to be mad to suggest that.

-- Steven

On Thu, May 22, 2014 at 11:59 AM, Steven D'Aprano <steve@pearwood.info> wrote:
On Thu, May 22, 2014 at 03:41:31PM +0000, Steve Dower wrote:
Why not move the existing optimisation into -O mode and put future optimisations in there too? It may just start having enough value that people switch to using it.
I just had the same idea, you beat me to it.
Same here. More concretely:

  -O0 -- no optimizations at all
  -O1 -- level 1 optimizations (current peephole optimizations), asserts disabled
  -O2 -- level 2 optimizations (currently nothing extra)
  -O3 -- ...
  -ONs or -X nodocstrings or -X compact or --compact or --nodocstrings
      -- remove docstrings (for space savings)
  --debug or -X debug -- sets __debug__ to True (also implies -O0)

Compatibility (keeping the current behavior):

  Default: -O<max> + __debug__ = True (deprecate setting __debug__ to True?)
  -O -- same as -O<max>
  -OO -- same as -O<max>s (deprecate)

Having the current optimizations correspond to -O1 makes sense in that we don't have anything more granular. However, if more optimizations were added I'd expect them to fall under a higher optimization level. Adding a new option just for docstrings/compact seems like a waste, so I like Stefan's idea of optionally appending "s" (for space) onto the -O option.

As Barry noted, we would also build on PEPs 3147/3149 to add a tag for the optimization level, etc. The default mode would keep the current cache tag and -O/-OO would likewise stay the same (with the .pyo suffix).
* The double -OO switch should be deprecated, for eventual removal in the very distant future. (4.0? 5.0?)
Good idea.
* Instead, a separate switch for removing docstrings can be added, to support implementations in low-memory devices or other constrained situations.
Also a good idea.
This will make Python's compilation model a little more familiar to people coming from other languages. It will make -O more attractive, instead of being viewed by some as a waste of effort, and ensure that by default there are no tricks played with byte-code.
+1 -eric

On Thu, May 22, 2014 at 12:49 PM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
Same here. More concretely: ...
Having said that, revamping those options and our current optimization mechanism is a far cry from just adding -X nopeephole as Ned has implied. While the former may make sense on its own, those broader changes may languish as nice-to-haves. It may be better to go with the latter in the short-term while the broader changes swirl in the maelstrom of discussion indefinitely. -eric

On 5/22/14 2:57 PM, Eric Snow wrote:
On Thu, May 22, 2014 at 12:49 PM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
Same here. More concretely: ...
Having said that, revamping those options and our current optimization mechanism is a far cry from just adding -X nopeephole as Ned has implied. While the former may make sense on its own, those broader changes may languish as nice-to-haves. It may be better to go with the latter in the short-term while the broader changes swirl in the maelstrom of discussion indefinitely.
I get distracted (by work...) for the afternoon, and things take an unexpected turn! I definitely did not mean to throw open the floodgates to reconsider the entire -O switch. I agree that the -O switch seems like too much UI for too little change in results, and I think a different set of settings and defaults makes more sense. But I do not suppose that we have much appetite to take on that large a change. For my purposes, an environment variable and no change or addition to the switches would be fine. --Ned

On 23 May 2014 06:27, "Ned Batchelder" <ned@nedbatchelder.com> wrote:
On 5/22/14 2:57 PM, Eric Snow wrote:
On Thu, May 22, 2014 at 12:49 PM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
Same here. More concretely: ...
Having said that, revamping those options and our current optimization mechanism is a far cry from just adding -X nopeephole as Ned has implied. While the former may make sense on its own, those broader changes may languish as nice-to-haves. It may be better to go with the latter in the short-term while the broader changes swirl in the maelstrom of discussion indefinitely.
I get distracted (by work...) for the afternoon, and things take an unexpected turn!
I definitely did not mean to throw open the floodgates to reconsider the entire -O switch. I agree that the -O switch seems like too much UI for too little change in results, and I think a different set of settings and defaults makes more sense. But I do not suppose that we have much appetite to take on that large a change.
For my purposes, an environment variable and no change or addition to the switches would be fine.
Given how far away 3.5 is, I'd actually be interested in seeing a full write-up of Eric's proposal, comparing it to the "let's just add some more technical debt to the pile" -X option based approach. I don't think *anyone* really likes the current state of the optimisation flags, so if this proposal tips us over the edge into finally fixing them properly, huzzah! Cheers, Nick.

2014-05-23 11:11 GMT+02:00 Nick Coghlan <ncoghlan@gmail.com>:
Given how far away 3.5 is, I'd actually be interested in seeing a full write-up of Eric's proposal, comparing it to the "let's just add some more technical debt to the pile" -X option based approach.
The discussion is now split across 4 places: 3 threads on this mailing list, 1 issue in the bug tracker. And there are some old discussions on python-dev.

Maybe it's time to use the power of the PEP process to summarize this in a clear document? (Write a PEP.)

Victor

On 23 May 2014 19:30, Victor Stinner <victor.stinner@gmail.com> wrote:
2014-05-23 11:11 GMT+02:00 Nick Coghlan <ncoghlan@gmail.com>:
Given how far away 3.5 is, I'd actually be interested in seeing a full write-up of Eric's proposal, comparing it to the "let's just add some more technical debt to the pile" -X option based approach.
The discussion is now split across 4 places: 3 threads on this mailing list, 1 issue in the bug tracker. And there are some old discussions on python-dev.
Maybe it's time to use the power of the PEP process to summarize this in a clear document? (Write a PEP.)
Yes, I think so. One key thing this discussion made me realise is that we haven't taken a serious look at the compilation behaviour since PEP 3147 was implemented. The introduction of the cache tag and the source<->cache conversion functions provides an opportunity to actually clean up the handling of the different optimisation levels, and potentially make docstring stripping an independent setting.

It may be that the end result of that process is to declare "-X nopeephole" a good enough solution and proceed with implementing that. I just think it's worth exploring what would be involved in fixing things properly before making a decision.

Cheers, Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
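(The machinery in question is visible from Python; a quick illustration against the 3.4 API:)

    # PEP 3147 in action: cache file names encode the interpreter tag, and
    # debug_override switches between the .pyc and .pyo variants.
    from importlib.util import cache_from_source

    print(cache_from_source('spam.py'))
    # __pycache__/spam.cpython-34.pyc
    print(cache_from_source('spam.py', debug_override=False))
    # __pycache__/spam.cpython-34.pyo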

I'm not happy with the direction this is taking. I would prefer an approach that *first* implements the minimal thing (an internal flag, set by an environment variable, to disable the peephole optimizer) and *then* perhaps revisits the greater UI for specifying optimization levels and the consequences this has for pyc/pyo files.

I would also like to remind people the reason why there are separate pyc and pyo files: they are separate to support precompilation of the standard library and installed 3rd party packages for different optimization levels. While it may be okay for a developer that their pyc files all get invalidated when they change the optimization level, the stdlib and site-packages may require root access to write, so if your optimization level means you have to ignore the precompiled stdlib or site packages, that would be a major drag on your startup time (and memory usage will also spike at import time, since the AST is rather large).

Looking at my own (frequent) use of coverage.py, I would be totally fine if disabling peephole optimization only affected my app's code, and kept using the precompiled stdlib. (How exactly this would work is left as an exercise for the reader.)

On Fri, May 23, 2014 at 9:33 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 23 May 2014 19:30, Victor Stinner <victor.stinner@gmail.com> wrote:
2014-05-23 11:11 GMT+02:00 Nick Coghlan <ncoghlan@gmail.com>:
Given how far away 3.5 is, I'd actually be interested in seeing a full write-up of Eric's proposal, comparing it to the "let's just add some more technical debt to the pile" -X option based approach.
The discussion is now split across 4 places: 3 threads on this mailing list, 1 issue in the bug tracker. And there are some old discussions on python-dev.
Maybe it's time to use the power of the PEP process to summarize this in a clear document? (Write a PEP.)
Yes, I think so. One key thing this discussion made me realise is that we haven't taken a serious look at the compilation behaviour since PEP 3147 was implemented. The introduction of the cache tag and the source<->cache conversion functions provides an opportunity to actually clean up the handling of the different optimisation levels, and potentially make docstring stripping an independent setting.
It may be that the end result of that process is to declare "-X nopeephole" a good enough solution and proceed with implementing that. I just think it's worth exploring what would be involved in fixing things properly before making a decision.
Cheers, Nick.
--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
-- --Guido van Rossum (python.org/~guido)

On May 23, 2014, at 12:49 PM, Guido van Rossum <guido@python.org> wrote:
I'm not happy with the direction this is taking. I would prefer an approach that *first* implements the minimal thing (an internal flag, set by an environment variable, to disable the peephole optimizer) and *then* perhaps revisits the greater UI for specifying optimization levels and the consequences this has for pyc/pyo files.
I agree with this I think.
I would also like to remind people the reason why there are separate pyc and pyo files: they are separate to support precompilation of the standard library and installed 3rd party packages for different optimization levels.
Sadly enough it doesn’t go far enough since you can’t have (as far as I know) a .pyo for both -O and -OO. Perhaps the PEP isn’t the worst idea in order to make all of that work with the __pycache__ directories and the pyc tagging.
While it may be okay for a developer that their pyc files all get invalidated when they change the optimization level, the stdlib and site-packages may require root access to write, so if your optimization level means you have to ignore the precompiled stdlib or site packages, that would be a major drag on your startup time (and memory usage will also spike at import time, since the AST is rather large).
Looking at my own (frequent) use of coverage.py, I would be totally fine if disabling peephole optimization only affected my app's code, and kept using the precompiled stdlib. (How exactly this would work is left as an exercise for the reader.)
On Fri, May 23, 2014 at 9:33 AM, Nick Coghlan <ncoghlan@gmail.com> wrote: On 23 May 2014 19:30, Victor Stinner <victor.stinner@gmail.com> wrote:
2014-05-23 11:11 GMT+02:00 Nick Coghlan <ncoghlan@gmail.com>:
Given how far away 3.5 is, I'd actually be interested in seeing a full write-up of Eric's proposal, comparing it to the "let's just add some more technical debt to the pile" -X option based approach.
The discussion is now split across 4 places: 3 threads on this mailing list, 1 issue in the bug tracker. And there are some old discussions on python-dev.
Maybe it's time to use the power of the PEP process to summarize this in a clear document? (Write a PEP.)
Yes, I think so. One key thing this discussion made me realise is that we haven't taken a serious look at the compilation behaviour since PEP 3147 was implemented. The introduction of the cache tag and the source<->cache conversion functions provides an opportunity to actually clean up the handling of the different optimisation levels, and potentially make docstring stripping an independent setting.
It may be that the end result of that process is to declare "-X nopeephole" a good enough solution and proceed with implementing that. I just think it's worth exploring what would be involved in fixing things properly before making a decision.
Cheers, Nick.
----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA

On Fri, May 23, 2014 at 10:08 AM, Donald Stufft <donald@stufft.io> wrote:
On May 23, 2014, at 12:49 PM, Guido van Rossum <guido@python.org> wrote:
I'm not happy with the direction this is taking. I would prefer an approach that *first* implements the minimal thing (an internal flag, set by an environment variable, to disable the peephole optimizer) and *then* perhaps revisits the greater UI for specifying optimization levels and the consequences this has for pyc/pyo files.
I agree with this I think.
I would also like to remind people the reason why there are separate pyc and pyo files: they are separate to support precompilation of the standard library and installed 3rd party packages for different optimization levels.
Sadly enough it doesn’t go far enough since you can’t have (as far as I know) a .pyo for both -O and -OO. Perhaps the PEP isn’t the worst idea in order to make all of that work with the __pycache__ directories and the pyc tagging.
Agreed (though I think that -OO is a very niche feature), and I think deciding what to do about this (if anything) should not hold the peephole disabling feature hostage. (The latter of course has to decide what to do about pyc files, but there should be a suitable answer that doesn't require solving the general problem nor prevent the general problem from being solved.)

--
--Guido van Rossum (python.org/~guido)

On Fri, May 23, 2014 at 10:49 AM, Guido van Rossum <guido@python.org> wrote:
I'm not happy with the direction this is taking. I would prefer an approach that *first* implements the minimal thing (an internal flag, set by an environment variable, to disable the peephole optimizer) and *then* perhaps revisits the greater UI for specifying optimization levels and the consequences this has for pyc/pyo files.
Yeah, that's exactly what I was trying to convey in the followup to my longer message about revamping the optimization levels.
Looking at my own (frequent) use of coverage.py, I would be totally fine if disabling peephole optimization only affected my app's code, and kept using the precompiled stdlib. (How exactly this would work is left as an exercise for the reader.)
Would it be a problem if .pyc files weren't generated or used (a la -B or PYTHONDONTWRITEBYTECODE) when you ran coverage? -eric

On Fri, May 23, 2014 at 10:17 AM, Eric Snow <ericsnowcurrently@gmail.com>wrote:
On Fri, May 23, 2014 at 10:49 AM, Guido van Rossum <guido@python.org> wrote:
Looking at my own (frequent) use of coverage.py, I would be totally fine if disabling peephole optimization only affected my app's code, and kept using the precompiled stdlib. (How exactly this would work is left as an exercise for the reader.)
Would it be a problem if .pyc files weren't generated or used (a la -B or PYTHONDONTWRITEBYTECODE) when you ran coverage?
In first approximation that would probably be okay, although it would make coverage even slower. I was envisioning something where it would still use, but not write, pyc files for the stdlib or site-packages, because the code in whose coverage I am interested is puny compared to the stdlib code it imports. -- --Guido van Rossum (python.org/~guido)
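(The "use but don't write" half already exists per process, for what it's worth: sys.dont_write_bytecode -- the programmatic equivalent of -B / PYTHONDONTWRITEBYTECODE -- stops new .pyc files from being written while existing ones are still read on import:)

    import sys
    sys.dont_write_bytecode = True   # same effect as running with -B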

On 5/23/14 1:22 PM, Guido van Rossum wrote:
On Fri, May 23, 2014 at 10:17 AM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
On Fri, May 23, 2014 at 10:49 AM, Guido van Rossum <guido@python.org> wrote:
Looking at my own (frequent) use of coverage.py, I would be totally fine if disabling peephole optimization only affected my app's code, and kept using the precompiled stdlib. (How exactly this would work is left as an exercise for the reader.)
Would it be a problem if .pyc files weren't generated or used (a la -B or PYTHONDONTWRITEBYTECODE) when you ran coverage?
In first approximation that would probably be okay, although it would make coverage even slower. I was envisioning something where it would still use, but not write, pyc files for the stdlib or site-packages, because the code in whose coverage I am interested is puny compared to the stdlib code it imports.
I was concerned about losing any time in test suites that are already considered too slow. But I tried to do some controlled measurements of these scenarios, and found the worst case (no .pyc available, and none written) was only 2.8% slower than full .pyc files available. When I tried to measure stdlib .pyc's available, and no .pyc's for my code, the results were actually very slightly faster than the typical case. I think this points to the difficulty of controlling all the variables!

In any case, it seems that the penalty for avoiding the .pyc files is not burdensome.
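(A rough sketch of the kind of measurement described -- 'test_suite' is a stand-in module name, and the find invocation assumes a Unix shell:)

    # Compare a warm run (bytecode cached) against one with no .pyc files
    # available and none written (-B, after clearing __pycache__ directories).
    import subprocess
    import sys
    import time

    def timed(args):
        start = time.time()
        subprocess.check_call(args)
        return time.time() - start

    warm = timed([sys.executable, '-m', 'test_suite'])        # .pyc available
    subprocess.check_call('find . -name __pycache__ -exec rm -rf {} +',
                          shell=True)
    cold = timed([sys.executable, '-B', '-m', 'test_suite'])  # no .pyc at all
    print('penalty: {:.1%}'.format(cold / warm - 1))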

On 27 May 2014 10:28, "Ned Batchelder" <ned@nedbatchelder.com> wrote:
On 5/23/14 1:22 PM, Guido van Rossum wrote:
On Fri, May 23, 2014 at 10:17 AM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
On Fri, May 23, 2014 at 10:49 AM, Guido van Rossum <guido@python.org> wrote:
Would it be a problem if .pyc files weren't generated or used (a la -B or PYTHONDONTWRITEBYTECODE) when you ran coverage?
In first approximation that would probably be okay, although it would make coverage even slower. I was envisioning something where it would still use, but not write, pyc files for the stdlib or site-packages, because the code in whose coverage I am interested is puny compared to the stdlib code it imports.
I was concerned about losing any time in test suites that are already considered too slow. But I tried to do some controlled measurements of these scenarios, and found the worst case (no .pyc available, and none written) was only 2.8% slower than full .pyc files available. When I tried to measure stdlib .pyc's available, and no .pyc's for my code, the results were actually very slightly faster than the typical case. I think this points to the difficulty of controlling all the variables!
In any case, it seems that the penalty for avoiding the .pyc files is not burdensome.
Along these lines, how about making the environment variable something like "PYTHONANALYSINGSOURCE" with the effects:

- bytecode files are neither read nor written
- all bytecode and AST optimisations are disabled

A use case oriented flag like that lets us tweak the definition as needed in the future, unlike an option that is specific to turning off the CPython peephole optimiser (e.g. we don't have an AST optimiser yet, but turning it off would still be covered by an "analysing source" flag).

Cheers, Nick.

On Mon, May 26, 2014 at 7:40 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
- bytecode files are neither read nor written
Yay! That would be amazing...

On 5/26/14 10:40 PM, Nick Coghlan wrote:
On 27 May 2014 10:28, "Ned Batchelder" <ned@nedbatchelder.com> wrote:
On 5/23/14 1:22 PM, Guido van Rossum wrote:
On Fri, May 23, 2014 at 10:17 AM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
On Fri, May 23, 2014 at 10:49 AM, Guido van Rossum <guido@python.org> wrote:
Would it be a problem if .pyc files weren't generated or used (a la -B or PYTHONDONTWRITEBYTECODE) when you ran coverage?
In first approximation that would probably be okay, although it would make coverage even slower. I was envisioning something where it would still use, but not write, pyc files for the stdlib or site-packages, because the code in whose coverage I am interested is puny compared to the stdlib code it imports.
I was concerned about losing any time in test suites that are already considered too slow. But I tried to do some controlled measurements of these scenarios, and found the worst case (no .pyc available, and none written) was only 2.8% slower than full .pyc files available. When I tried to measure stdlib .pyc's available, and no .pyc's for my code, the results were actually very slightly faster than the typical case. I think this points to the difficulty of controlling all the variables!
In any case, it seems that the penalty for avoiding the .pyc files is not burdensome.
Along these lines, how about making the environment variable something like "PYTHONANALYSINGSOURCE" with the effects:
- bytecode files are neither read nor written - all bytecode and AST optimisations are disabled
A use case oriented flag like that lets us tweak the definition as needed in the future, unlike an option that is specific to turning off the CPython peephole optimiser (e.g. we don't have an AST optimiser yet, but turning it off would still be covered by an "analysing source" flag).
My inclination would still be to provide separate controls like "DISABLE_OPTIMIZATIONS" and "DISABLE_BYTECODE"; these are power tools in any case.

What is the process from this point forward? A patch? A PEP?

--Ned.

On 8 Jun 2014 21:45, "Ned Batchelder" <ned@nedbatchelder.com> wrote:
On 5/26/14 10:40 PM, Nick Coghlan wrote:
On 27 May 2014 10:28, "Ned Batchelder" <ned@nedbatchelder.com> wrote:
[earlier quoted discussion snipped]
My inclination would still be to provide separate controls like "DISABLE_OPTIMIZATIONS" and "DISABLE_BYTECODE"; these are power tools in any case. What is the process from this point forward? A patch? A PEP?
A PEP would help ensure the use cases are clearly documented and properly covered by the chosen solution. It will also help cover all the incidental details (like the impact on cache tags). But either a patch or a PEP would get it moving - the main risk in going direct to a patch is the potential for needing to rework the design. Cheers, Nick.

On 24 May 2014 02:49, "Guido van Rossum" <guido@python.org> wrote:
I'm not happy with the direction this is taking. I would prefer an approach that *first* implements the minimal thing (an internal flag, set by an environment variable, to disable the peephole optimizer) and *then* perhaps revisits the greater UI for specifying optimization levels and the consequences this has for pyc/pyo files.

Sure, that sounds like a reasonable approach, too. My perspective is mainly coloured by the fact that we're still in the "eh, feature freeze is still more than a year away" low urgency period for 3.5 :)

Cheers, Nick.

On Fri, May 23, 2014 at 10:22 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 24 May 2014 02:49, "Guido van Rossum" <guido@python.org> wrote:
I'm not happy with the direction this is taking. I would prefer an approach that *first* implements the minimal thing (an internal flag, set by an environment variable, to disable the peephole optimizer) and *then* perhaps revisits the greater UI for specifying optimization levels and the consequences this has for pyc/pyo files.
Sure, that sounds like a reasonable approach, too. My perspective is mainly coloured by the fact that we're still in the "eh, feature freeze is still more than a year away" low urgency period for 3.5 :)
Yeah, and I'm countering that not every project needs to land a week before the feature freeze. :-) -- --Guido van Rossum (python.org/~guido)

On 24 May 2014 03:24, "Guido van Rossum" <guido@python.org> wrote:
On Fri, May 23, 2014 at 10:22 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 24 May 2014 02:49, "Guido van Rossum" <guido@python.org> wrote:
I'm not happy with the direction this is taking. I would prefer an approach that *first* implements the minimal thing (an internal flag, set by an environment variable, to disable the peephole optimizer) and *then* perhaps revisits the greater UI for specifying optimization levels and the consequences this has for pyc/pyo files.
Sure, that sounds like a reasonable approach, too. My perspective is mainly coloured by the fact that we're still in the "eh, feature freeze is still more than a year away" low urgency period for 3.5 :)
Yeah, and I'm countering that not every project needs to land a week
before the feature freeze. :-) But that approach makes Larry's life far more exciting! :) Happily-on-the-other-side-of-the-Pacific-from-Larry-while-saying-that'ly yours, Nick.
-- --Guido van Rossum (python.org/~guido)

On May 23, 2014, at 09:49 AM, Guido van Rossum wrote:
I would also like to remind people the reason why there are separate pyc and pyo files: they are separate to support precompilation of the standard library and installed 3rd party packages for different optimization levels.
In fact, Debian (and I'm sure other OSes with package managers) precompiles source files at installation time. We have a couple of bugs languishing to provide -OO optimization levels as an option when doing this precompilation. I haven't pushed this forward because I got side-tracked by the overloading of .pyo files for -O and -OO optimization levels. I agree that the flags, mechanisms, and semantics should be worked out first, but I also think that PEP 3147 tagging will provide a nice ui for the file system representation of the optimization levels. death-to-pyo-files-ly y'rs, -Barry
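For reference, the PEP 3147 tags Barry mentions are already queryable from Python, and the second call below shows the overloading he describes: -O and -OO both collapse into the same .pyo name. importlib.util.cache_from_source is a real 3.4 API; the file names are illustrative:

    >>> import importlib.util
    >>> importlib.util.cache_from_source('continue.py')
    '__pycache__/continue.cpython-34.pyc'
    >>> importlib.util.cache_from_source('continue.py', debug_override=False)
    '__pycache__/continue.cpython-34.pyo'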

: On Thu, May 22, 2014 at 12:49:32PM -0600, Eric Snow wrote:
-O0 -- no optimizations at all [...] -OO -- same as -O<max> (deprecate)
Making no optimization so easily visually confused with maximum optimization isn't terribly good UI ... -[]z. -- Zero Piraeus: inter caetera http://etiol.net/pubkey.asc

On 21 May 2014 12:05, Ned Batchelder <ned@nedbatchelder.com> wrote:
Unfortunately, the -O command-line switch does not lend itself to a new value that means, "less optimization than the default." I propose a new switch -P, to control the peephole optimizer, with a value of -P0 meaning no optimization at all. The PYTHONPEEPHOLE environment variable would also control the option.
The idea sounds reasonable (pretty specialised, but that's OK). But one pitfall is that unless you encode the PYTHONPEEPHOLE setting in the bytecode filename then people will have to remember to delete all bytecode files before using the flag, or the interpreter will pick up an optimised pyc file. Or maybe pyc/pyo files should be ignored if PYTHONPEEPHOLE is set? That's probably simpler. Paul
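Until something like that exists, clearing the stale bytecode by hand is straightforward; a small sketch using only the standard library:

    import pathlib
    import shutil

    # Remove all cached bytecode so an optimised .pyc can't be picked up
    # on the next run with the peephole optimizer disabled.
    for cache_dir in pathlib.Path('.').rglob('__pycache__'):
        shutil.rmtree(cache_dir)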

On Wed May 21 2014 at 9:05:48 AM, Paul Moore <p.f.moore@gmail.com> wrote:
On 21 May 2014 12:05, Ned Batchelder <ned@nedbatchelder.com> wrote:
Unfortunately, the -O command-line switch does not lend itself to a new value that means, "less optimization than the default." I propose a new switch -P, to control the peephole optimizer, with a value of -P0 meaning no optimization at all. The PYTHONPEEPHOLE environment variable would also control the option.
The idea sounds reasonable (pretty specialised, but that's OK). But one pitfall is that unless you encode the PYTHONPEEPHOLE setting in the bytecode filename then people will have to remember to delete all bytecode files before using the flag, or the interpreter will pick up an optimised pyc file. Or maybe pyc/pyo files should be ignored if PYTHONPEEPHOLE is set? That's probably simpler.
There are constant rumblings about trying to make .pyc/.pyo aware of what optimizations were applied so that this kind of thing wouldn't occur. It would require tweaking how optimizations are expressed/added so that they are more easily controlled and can somehow contribute to the labeling of what optimizations were applied. All totally doable but will require thinking about the proper API and such (reading .pyc/.pyo files would also break but that's happened before when we added file size to the header and .pyc/.pyo files are viewed as internal optimizations anyway).
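The header Brett refers to is small and easy to inspect: in Python 3.4 it is a 4-byte magic number, a 4-byte source mtime, and the 4-byte source size he mentions, followed by the marshalled code object. An "optimizations applied" label would be one more field here (the file name below is illustrative):

    import struct

    # Python 3.4 .pyc layout: magic (4 bytes), mtime (4), source size (4),
    # then the marshalled code object.
    with open('__pycache__/continue.cpython-34.pyc', 'rb') as f:
        magic = f.read(4)
        mtime, source_size = struct.unpack('<II', f.read(8))
    print(magic, mtime, source_size)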

Brett Cannon, 21.05.2014 15:51:
There are constant rumblings about trying to make .pyc/.pyo aware of what optimizations were applied so that this kind of thing wouldn't occur. It would require tweaking how optimizations are expressed/added so that they are more easily controlled and can somehow contribute to the labeling of what optimizations were applied. All totally doable but will require thinking about the proper API and such (reading .pyc/.pyo files would also break but that's happened before when we added file size to the header and .pyc/.pyo files are viewed as internal optimizations anyway).
It might be possible to move the peephole optimiser run into the code loader, i.e. the .pyc files could be written out *before* it runs, as plain unoptimised byte code. There might be a tiny performance impact on load, but I doubt that it would be serious. Stefan
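A rough sketch of that idea, with the caveat that CPython does not expose the peephole pass to Python code, so _peephole_optimize below is purely hypothetical:

    import importlib.machinery

    class PeepholeAtLoadTime(importlib.machinery.SourceFileLoader):
        # Sketch: the .pyc holds plain, unoptimised bytecode; the
        # optimiser runs in memory when the code is loaded.
        def get_code(self, fullname):
            code = super().get_code(fullname)  # unoptimised, from the .pyc
            return _peephole_optimize(code)    # hypothetical C-level pass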

On 5/21/14 9:05 AM, Paul Moore wrote:
On 21 May 2014 12:05, Ned Batchelder <ned@nedbatchelder.com> wrote:
Unfortunately, the -O command-line switch does not lend itself to a new value that means, "less optimization than the default." I propose a new switch -P, to control the peephole optimizer, with a value of -P0 meaning no optimization at all. The PYTHONPEEPHOLE environment variable would also control the option.
The idea sounds reasonable (pretty specialised, but that's OK). But one pitfall is that unless you encode the PYTHONPEEPHOLE setting in the bytecode filename then people will have to remember to delete all bytecode files before using the flag, or the interpreter will pick up an optimised pyc file. Or maybe pyc/pyo files should be ignored if PYTHONPEEPHOLE is set? That's probably simpler. Paul
For my use case, it would be enough to use whatever .pyc files the interpreter finds. For a testing scenario, it is fine to delete all the .pyc files, set PYTHONPEEPHOLE, and then run the test suite to be sure to avoid optimized pyc files.

Hi, 2014-05-21 13:05 GMT+02:00 Ned Batchelder <ned@nedbatchelder.com>:
A long-standing problem with CPython is that the peephole optimizer cannot be completely disabled.
I had a similar concern when I worked on my astoptimizer project. I wanted to reimplement the peephole optimizer using the AST instead of the bytecode. Since the peephole optimizer is always on, I was not able to compare the bytecode generated by my AST optimizer alone to the bytecode generated with the peephole optimizer. I would also be curious to see the code before the peephole optimizer modifies it.
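The effect is easy to see with dis: the peephole pass folds constant expressions at compile time, and there is currently no way to get the unfolded bytecode out of compile() (real 3.4 behaviour):

    import dis

    # With the peephole optimizer, 1 + 2 is folded into a single
    # LOAD_CONST 3; with the pass disabled you would see the
    # two LOAD_CONSTs and the BINARY_ADD spelled out.
    dis.dis(compile("x = 1 + 2", "<demo>", "exec"))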
If you execute "python3.4 -m trace -c -m continue.py", it produces this continue.cover file:
    1: a = b = c = 0
  101: for n in range(100):
  100:     if n % 2:
   50:         if n % 4:
   50:             a += 1
>>>>>>             continue
               else:
   50:             b += 1
   50:     c += 1
    1: assert a == 50 and b == 50 and c == 50
This indicates that the continue line is not executed.
I have spent long hours in gdb, and this is a common issue with compiler optimizations. In gdb, the program sometimes appears to go backward, or to execute the same instruction twice. I hate losing my time on that; I prefer to recompile the whole (C) application with gcc -O0 -ggdb.
** User Interface
Unfortunately, the -O command-line switch does not lend itself to a new value that means, "less optimization than the default." I propose a new switch -P, to control the peephole optimizer, with a value of -P0 meaning no optimization at all. The PYTHONPEEPHOLE environment variable would also control the option.
I propose "python -X nopeephole" , "python -X peephole=0" or "python -X optim=0" I don't like "python -X peephole=0" because "python -X peephole" should active the optimizer, which is already the default. For "optim" proposition, should we keep it synchronous with -O and -OO? (no -O alternative) <=> -X optim=0 (default) => -X optim=1 -O <=> -X optim=2 -OO <=> -X optim=3 I never understand -O and -OO. What is optimized exactly. To me, striping docstrings is not really an "optimization", it should be a different option. Because of this confusion, the peephole optimizer option should maybe be disconnected to -O and -OO. So take "python -X nopeephole". IMO you should not write .pyc or .pyo files if the peephole optimizer is actived. It avoids the question of "was this .pyc generated with or without peephole optimizer?". Usually, when you disable optimizations, you don't care of performances (.pyc are created to speedup Python startup time). I also suggest to add a new flag to the builtin compile() function: PyCF_NO_PEEPHOLE.
There are about a dozen places internal to CPython where optimization level is indicated with an integer, for example, in Py_CompileStringObject. Those uses also don't allow for new values indicating less optimization than the default: 0 and -1 already have meanings. Unless we want to start using -2 for less that the default. I'm not sure we need to provide for those values, or if the PYTHONPEEPHOLE environment variable provides enough control.
Add a new flag to sys.flags: "peephole" or "peephole_optimizer" (boolean, True by default). Victor
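Both suggestions follow existing patterns. In the sketch below, "nopeephole" and PyCF_NO_PEEPHOLE are only proposals, not real names; sys._xoptions and ast.PyCF_ONLY_AST do exist today:

    import ast
    import sys

    # -X options already land in sys._xoptions at startup:
    no_peephole = 'nopeephole' in sys._xoptions  # proposed name only

    # compile() already accepts PyCF_* flags; a PyCF_NO_PEEPHOLE flag
    # would follow the same pattern as the existing PyCF_ONLY_AST:
    tree = compile("a += 1", "<demo>", "exec", ast.PyCF_ONLY_AST)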
participants (17)
- Barry Warsaw
- Brett Cannon
- Donald Stufft
- Eric Snow
- Ethan Furman
- Guido van Rossum
- Haoyi Li
- Jonas Wielicki
- Ned Batchelder
- Nick Coghlan
- Paul Moore
- Stefan Behnel
- Stefan Krah
- Steve Dower
- Steven D'Aprano
- Victor Stinner
- Zero Piraeus