Re: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30

On May 21, 2014, at 1:21 PM, python-ideas-request@python.org wrote:
I propose that we add a way to completely disable the optimizer.
I think this opens a can of worms that is better left closed.

* We will have to start running tests both with and without the switch turned on, for example (because you're exposing yet another way to run Python with different code).

* Over time, I expect that some of the functionality of the peepholer is going to be moved upstream into AST transformations, and then you will have even less ability to switch something on and off.

* The peepholer has been in the code for over a decade and the tracker item has languished for years. That provides some evidence that the "need" here is very small.

* I sympathize with "there is an irritating dimple in coverage.py", but that hasn't actually impaired its usability beyond creating a curiosity. Using that as a reason to add a new CPython-only command-line switch seems like having the tail wag the dog.

* As the other implementations of Python continue to develop, I don't think we should tie their hands with respect to code generation.

* Ideally, the peepholer should be thought of as part of code generation. As compilation improves over time, it should start to generate the same code as we're getting now. It probably isn't wise to expose the implementation detail that the constant folding and jump tweaks are done in a separate second pass.

* Mostly, I don't want to open a new crack in the Python veneer where people are switching on and off two different streams of code generation (currently, there is only one way to do it). I can't fully articulate my instincts here, but I think we'll regret opening this door when we didn't have to.

That being said, I know how the politics of python-ideas works, and I expect that my thoughts on the subject will quickly get buried by a discussion of which letter code should be used for the command-line switch. Hopefully, some readers will focus on the question of whether it is worth it. Others might look at ways to improve the existing code (without an off-switch) so that the continue-statement jump-to-jump shows up in your coverage tool.

IMO, adding a new command-line switch is a big deal (we should do it very infrequently, limit it to things with a big payoff, and think about whether there are any downsides). Personally, I don't see any big wins here, and I have a sense that there are downsides that would make us regret exposing alternate code generation.

Raymond

On Wed, May 21, 2014 at 8:44 PM, Raymond Hettinger <raymond.hettinger@gmail.com> wrote:
I've certainly been frustrated by this wart in coverage.py's output -- if one uses a dev cycle where you constantly review every uncovered line to make sure that tests are doing what you want, then even a small number of spurious uncovered lines that appear and disappear based on the optimizer's whim can result in a lot of wasted time. (Not to mention the hours wasted the first time I ran into this, trying to figure out why my tests weren't working and writing new ones specifically to target the optimized-out line...)

That said, I'm also sympathetic to your point. Isn't the real problem here that the peephole optimizer violates the first rule of optimization ("don't change semantics") by breaking sys.settrace? Couldn't we fix this directly?

One approach might be to enhance co_lnotab (if anyone dares touch it) so that it can record that a peepholed jump instruction logically belongs to multiple *different* lines, and when we encounter such an instruction we call the trace function multiple times. Then the peephole optimizer just has to propagate line number information whenever it short-circuits a jump.

Or perhaps it would be enough to add a dead-code optimization pass after the peephole optimizer, so that coverage.py can at least see that things like Ned's "continue" didn't actually generate any code. (This is suboptimal as well, since it will still cause coverage.py to produce somewhat confusing output, as if the "continue" line had a comment instead of real code -- but it'd still be better than the status quo.)

-n

--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
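To make the wart concrete, here is roughly how one can watch the tracer at work -- a minimal sketch, not Ned's exact test case, and whether the "continue" line shows up in the output depends on how the peepholer has rewritten the jumps on a given interpreter version:

    import sys

    def tracer(frame, event, arg):
        # print every line event the interpreter reports
        if event == "line":
            print("line", frame.f_lineno)
        return tracer

    def f():
        for n in range(4):
            if n % 2:
                continue   # the line that can vanish from the trace
            n += 1

    sys.settrace(tracer)
    f()
    sys.settrace(None)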

(First, shouldn't this be in the "disable all peephole optimizations" thread? Raymond seems to have replied to the digest..!) On Wed, May 21, 2014 at 2:14 PM, Nathaniel Smith <njs@pobox.com> wrote:
I agree with this. Adding a command line flag to tinker with code generation may well be opening a can of worms, but "the peephole optimizer shouldn't change semantics" is a more compelling argument, although fixing it from that angle is obviously more involved.

One problem is that functions like settrace() expose low-level details to the higher-level semantics. It's a fair question whether it should be considered kosher to expose implementation details like the peephole optimizer through such interfaces. I could get behind an implementation that hides the erasure of lines that are still (semantically) being executed, without disabling the peephole optimizer.

- Trip

On 21 May 2014 23:30, Trip Volpe <trip@flowroute.com> wrote:
While I'm happy to be proved wrong with code, my instinct is that "making sys.settrace work" would likely be too complex to be practical. In any case, as you say, it exposes low-level details, and I would personally consider "glitches" like this as implementation details. To put it another way, I don't consider the exact lines traced by sys.settrace to be part of the semantics of a program, any more than I consider the output of dis.dis to be. So in my view it is acceptable for the optimiser to change the lines that get traced in the way that coverage experienced. Paul.

On 5/21/14 5:37 PM, Ethan Furman wrote:
I'm not sure what can of worms you are imagining. Let's look to our experience with C compilers. They have a switch to disable optimization. What trouble has that brought? When I think of problems with optimizers in C compilers, I think of incorrect or buggy optimizations. I can't think of something that has gone wrong because there was a switch to turn it off. People in this thread have contrasted this proposal with an apparent desire to expand the set of optimizations performed. It seems to me that the complexity and danger lie in expanded optimizations, not disabled ones. --Ned.

On Wed, May 21, 2014 at 07:04:58PM -0400, Ned Batchelder wrote:
I can't think of something that has gone wrong because there was a switch to turn it off.
Are you serious? Somehow I'm reminded of the funroll-loops.info Gentoo parody site. As others mention, there is a difficult-to-quantify but very real non-zero cost in introducing new major execution modes.
When I think of problems with optimizers in C compilers, I think of incorrect or buggy optimizations.
Sure, if it were still the early 90s. Most optimization bugs come from inexperienced developers relying on undefined behaviour of one form or another, and Python doesn't suffer from UB quite the way C does.
Agreed, and so I'd suggest a better fix would be removing the peephole optimizer, for the little benefit that it offers, if it could be shown that it really truly does hinder people's comprehension of Python. It seems the proposed feature is all about avoiding saying "oh, don't worry about that for the moment" while teaching, assuming the question comes up at all.

Adding another special case to disable a minor performance improvement seems pointless when the implementation is slow regardless, kind of along the same lines as adding another -O or -OO flag, and we all know how useful those ended up being. If there really were a problem here, it seems preferable to just remove the optimizer entirely and find more general ways to fix performance without creating a mess.

David

On 5/21/14 8:10 PM, dw+python-ideas@hmmz.org wrote:
The point is not about teaching Python. It's about getting useful information from code analysis tools. When you run coverage tools or debuggers, you are hoping to learn something about your code. It is bad when those tools give you incorrect or misleading information. Being able to disable the optimizer will prevent certain kinds of incorrect information. --Ned.

On Wed, 21 May 2014 19:04:58 -0400 Ned Batchelder <ned@nedbatchelder.com> wrote:
Python's usage model does not contain the notion of compiler optimizations. Hardly anybody uses the misnamed -O flags. There is a single compilation mode, which everyone is content with. It is part of the simplicity of the language (or, at least, of CPython); by adding some flags that can affect the level of "optimization", you make the model more complicated to understand for users, and to support for us.

(Having used coverage several times, I haven't found those missed lines really annoying, by the way; not to the point that I would have wanted a specific command-line flag to disable optimizations.)

The use case for disabling optimizations in C is to make programs actually debuggable. Python doesn't have that problem.

Regards

Antoine.

On 22 May 2014 09:52, Antoine Pitrou <solipsis@pitrou.net> wrote:
As a concrete example, note my earlier comment about pyc files. Switching off optimisation results in unoptimised bytecode being written to pyc files, which could then be read in a subsequent (supposedly) optimised run. And vice versa. This may not be a huge problem for the coverage use case, but it does add an extra level of complexity into the model of caching bytecode. Handwaving it away as "not a big deal - just delete the bytecode files before and after the coverage run" doesn't alter the fact that the bytecode caching model isn't handling the new mode properly. Paul

On May 22, 2014, at 10:02 AM, Paul Moore wrote:
Seems to me that PEP 3147 tagging could be extended to describe various optimization levels. It might even be nice to get rid of the overloaded .pyo files. The use of .pyo for both -O and -OO optimization levels causes some issues. -Barry
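For illustration, the PEP 3147 layout is already queryable from Python -- a quick sketch (the exact tag varies with the interpreter; the debug_override behavior shown is what I understand 3.4 to do):

    >>> import importlib.util
    >>> importlib.util.cache_from_source('spam.py')
    '__pycache__/spam.cpython-34.pyc'
    >>> importlib.util.cache_from_source('spam.py', debug_override=False)
    '__pycache__/spam.cpython-34.pyo'   # the overloaded .pyo in question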

Am 22.05.2014 10:52, schrieb Antoine Pitrou:
The use case for disabling optimizations in C is to make programs actually debuggable. Python doesn't have that problem.
Well, setting a breakpoint on the 'continue' line in Ned's test code and running it with pdb does NOT trigger the breakpoint. So 'Python doesn't have this problem' is not really true. Thomas

On Thu, May 22, 2014 at 10:50 PM, Thomas Heller <theller@ctypes.org> wrote:
Correct me if I'm wrong, but as I understand it, the problem is that the peephole optimizer eliminated an entire line of code. Would it be possible to have it notice when it merges two pieces from different lines, and somehow mark that the resulting bytecode comes from both lines? That would solve the breakpoint and coverage problems simultaneously. ChrisA
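One way to see which lines the merged bytecode is charged to is the line-number column in dis output -- a sketch (the exact instructions and whether the merge happens depend on the interpreter version):

    import dis

    def f():
        for n in range(100):
            if n % 2:
                if n % 4:
                    n += 1
                continue
            n -= 1

    # the left-hand column of the disassembly gives the source line for
    # each instruction; after the peephole pass, the "continue" line may
    # have no instruction left attributed to it
    dis.dis(f)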

On Thu, May 22, 2014 at 8:05 AM, Chris Angelico <rosuav@gmail.com> wrote:
It seems to me that Ned has revealed a bug in the peephole optimizer. It zapped an entire source line's worth of bytecode, but failed to delete the relevant entry in the line number table of the resulting code object. If I had my druthers, that would be the change I'd prefer. That said, I think Ned's proposal is fairly simple. As for the increased testing load, I think the extra cost would be the duplication of the buildbots (or the adjustment of their setup to test with -O and -O0 flags). Is it still the case that -O effectively does nothing (maybe only eliding __debug__ checks)? Skip
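As far as I know that is still the case -- a small sketch of what -O actually changes (run once normally and once with "python -O"):

    def check(x):
        if __debug__:                  # False under -O; the block is compiled away
            print("debug checks enabled")
        assert x >= 0, "negative!"     # assert statements are stripped under -O
        return x

    print(check(1))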

On 5/22/14 9:49 AM, Skip Montanaro wrote:
I think it is the nature of optimization that it will destroy useful information. I don't think it will always be possible to retain enough back-mapping that the optimized code can be understood as if it had not been optimized. For example, the debug issue would still be present: if you run pdb and set a breakpoint on the "continue" line, it will never be hit. Even if the optimizer cleaned up after itself perfectly (in fact, especially so), that breakpoint will still not be hit. You simply cannot reason about optimized code without having to mentally understand the transformations that have been applied. The whole point of this proposal is to recognize that there are times (debugging, coverage measurement) when optimizations are harmful, and to avoid them.

On 22.05.2014 17:32, Ned Batchelder wrote:
The whole point of this proposal is to recognize that there are times (debugging, coverage measurement) when optimizations are harmful, and to avoid them.
+1

It's regular practice in other languages to disable optimizations when debugging code. I don't see why Python should be different in this respect. Debuggers, testing, coverage and other such tools should be able to invoke a Python runtime mode that lets the compiler work strictly by the book, without applying any kind of optimization.

This used to be the default in Python, but over the years we gradually moved away from this as the default, with no options to get the old non-optimizing behavior back. I think it's fine to make safe optimizations the default in Python, but there's definitely a need for being able to run Python in a debugger without having it skip perfectly valid code lines (even if they are no-ops).

--
Marc-Andre Lemburg
eGenix.com

On 5/22/2014 11:40 AM, M.-A. Lemburg wrote:
I believe that Python has always had an 'as if' rule that allows more or less 'hidden' optimizations, as long as the net effect of a statement is as defined.

1. By the book, "a, b = b, a" means create a tuple from b, a, unpack the contents to a and b, and delete the reference to the tuple. An obvious optimization is to not create the tuple. As I remember, this was once tried out before tuple unpacking was generalized to iterable unpacking. I don't know if CPython was ever released with that optimization, or if other implementations have or do use it. By the 'as if' rule, it does not matter, even though an allocation tracer (such as the one added to 3.4?) might detect the non-allocation.

2. The manual says:
'''
@f1(arg)
@f2
def func(): pass

is equivalent to

def func(): pass
func = f1(arg)(f2(func))
'''
The equivalence is 'as if', in net effect, not in the detailed process. CPython actually executes (or at least did at one time)

def <internal reference>(): pass
func = f1(arg)(f2(<internal reference>))

Ignore f1. The difference can be detected when f2 is called, by examining the appropriate namespace within f2. When someone filed an issue about the 'bug' of 'func' never being bound to the unwrapped function object, Guido said that he wanted to change neither the doc nor the implementation. (Sorry, I cannot find the issue.)

3. "a + b" is *usually* equivalent to "a.__class__.__add__(b)" or possibly "b.__class__.__radd__(a)". However, my understanding is that if a and b are ints, a 'fast path' optimization is applied that bypasses the int.__add__ slot wrapper. If so, a call tracer could notice the difference and, if unaware of such optimizations, falsely report a problem.

4. Some Python implementations delay object destruction. I suspect that some (many?) do not really destroy objects (zero out the memory block).

This is a different issue from 'disable the peephole optimizer'.

--
Terry Jan Reedy

On 23.05.2014 04:07, Terry Reedy wrote:
I was referring to the times before the peephole optimizer was introduced (Python 2.3 and earlier). What's important here is to look at the difference between what the compiler generates by simply following its rule book and the version of the byte code which is the result of running an optimizer on the byte code, or even on the AST before running the transform to byte code. Note that I'm not talking about optimizations applied at the VM level in the implementation of the bytecodes, and I think neither was Ned.
This is an implementation detail of the VM. The code generated by the compiler is byte code saying rotate the top two arguments on the stack (ROT_TWO).
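That is easy to confirm with dis -- a sketch; on the CPythons of this era the two-element swap compiles to ROT_TWO rather than building a real tuple:

    import dis

    def swap(a, b):
        a, b = b, a
        return a, b

    # the disassembly typically shows LOAD/LOAD/ROT_TWO/STORE/STORE,
    # with no BUILD_TUPLE / UNPACK_SEQUENCE pair for the swap itself
    dis.dis(swap)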
I'd put that under documentation bug, if at all :-) Note that the function func does get the name "func". It's just not bound to that name in the intermediate step, since the function object serves as a parameter to the function f2.

Again, this is an optimization in the implementation of the byte code, not one applied by the compiler. There are quite a few more such optimizations going on in the VM.
4. Some Python implementations delay object destruction. I suspect that some (many?) do not really destroy objects (zero out the memory block).
I don't see what this has to do with the compiler. Isn't that just an implementation detail of how GC works on a particular Python platform?
For me, a key argument for having a runtime mode without compiler optimizations is that the compiler gains more freedom in applying more aggressive optimizations. Tools will no longer have to adapt to whatever optimizations are added with each new Python release, since there will be a defined non-optimized runtime mode they can use as a basis for their work. The net result would be faster Pythons and better working debugging tools (well, at least that's the hope ;-).

--
Marc-Andre Lemburg
eGenix.com

On 5/23/2014 4:25 AM, M.-A. Lemburg wrote:
I have tried to say that the 'rule book' at a particular stage is not a fixed thing. There are several transformations from source to CPython bytecode. The order and grouping is somewhat a matter of convenience. However, leave that aside. What Ned wants, and what Guido has supported, is that there be an option to get bytecode that is friendly to execution analysis. They can decide what constraints that places on the end product and therefore on the multiple transformation processes.
Stability is certainly a useful constraint.
The net result would be faster Pythons and better working debugging tools (well, at least that's the hope ;-).
Good point. It appears that rethinking the current -O, -OO will help. -- Terry Jan Reedy

On 05/22/2014 08:32 AM, Ned Batchelder wrote:
Having read through the issue on the tracker, I find myself swayed towards Ned's point of view. However, I do still agree with Raymond that a full-fledged command-line switch is overkill, especially since the unoptimized runs are very special cases (useful for debugging, coverage, curiosity, learning about optimizing, etc.).

If we had a sys flag that could be set before a module was loaded, then coverage, pdb, etc., could use that to recompile the source, not save a .pyc file, and move forward. For debugging purposes perhaps a `__no_optimize__ = True` or `from __future__ import no_optimize` would help in those cases where you're dropping into the debugger.

The dead-code elimination still has a bug to be fixed, though, because if a line has been optimized away, trying to set a break-point at it should fail.

--
~Ethan~

On 5/22/14 11:43 AM, Ethan Furman wrote:
I'm perfectly happy to drop the idea of the command-line switch. An environment variable would be a fine way to control this behavior.
I don't understand these ideas, but having to add an import to the top of the file seems like a non-starter to me.
If we get a way to disable optimization, we don't need to fix that bug. Everyone knows that optimized code acts oddly in debuggers. :)
-- ~Ethan~

On 22May2014 08:43, Ethan Furman <ethan@stoneleaf.us> wrote:
I've been with Ned from the first post, but have been playing (slow) catchup on the discussion. I'd personally be fine with a -O0 command line switch in keeping with a somewhat common C-compiler convention, or with an environment variable. If all the optimizations in the compiler/interpreter are a distinct step, then having a switch that just says "skip this step, we do not want the naive code transformed at all" seems both desirable and easy. And finally, the sig quote below really did come up at random for this message. Cheers, Cameron Simpson <cs@zip.com.au> We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. - Donald Knuth

On Thu, May 22, 2014 at 4:32 PM, Ned Batchelder <ned@nedbatchelder.com> wrote:
In this particular case, the back-mapping problem is pretty minor. IIUC the optimization is that if we have (abusing BASIC notation)

    10 GOTO 20
    20 GOTO 30
    30 ...

then in fact the operations at lines 10 and 20 are, from the point of view of the rest of the program, indivisible -- every time you execute 10 you also execute 20, there is no way to tell from outside whether we paused in between executing 10 and 20, etc. Effectively we just have a single uber-instruction that does both:

    (10, 20) GOTO 30
    30 ...

So from the coverage point of view, just marking line 20 as covered every time line 10 is executed is the Right Thing To Do. From the debugging point of view, a breakpoint set at line 20 should just trip whenever line 10 is executed -- it's not like there's any way to tell whether we're "half way through" the jump sequence or not. It's a pretty solid abstraction.

-n

--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org

On 5/22/14 1:16 PM, Nathaniel Smith wrote:
You've used the word "just" three times, glossing over the fact that we have no facility for marking statements as an uber instruction, and you've made no proposal for how it might work. Even if we build (and test!) a way to do that, it only covers this particular kind of oddity with optimized code. --Ned.

On Thu, May 22, 2014 at 9:17 PM, Ned Batchelder <ned@nedbatchelder.com> wrote:
What we have right now is co_lnotab. It encodes a many-to-one mapping from bytecode locations to line numbers:

    # bytecode offset -> line no
    lnotab = {
        0: 10,
        1: 10,
        2: 10,
        3: 11,
        4: 12,
        ...
    }

AFAIK, the main operations it supports are (a) given a bytecode location, return the relevant line (for backtraces etc.), and (b) when executing bytecode, detect transitions from an instruction associated with one line to an instruction associated with another line (for sys.settrace, used by coverage and pdb):

    def backtrace_lineno(offset):
        return lnotab[offset]

    def do_trace(offset1, offset2):
        if lnotab[offset1] != lnotab[offset2]:
            call_trace_fn(lnotab[offset2])

My proposal is to make this a many-to-many mapping:

    lnotab = {
        0: {10},
        1: {10},
        2: {10, 11},  # optimized jump
        3: {12},
        ...
    }

    def backtrace_lineno(offset):
        # if there are multiple linenos, then it's indistinguishable which one
        # the exception occurred on, so just pick one to display
        return min(lnotab[offset])

    def do_trace(offset1, offset2):
        for lineno in sorted(lnotab[offset2].difference(lnotab[offset1])):
            call_trace_fn(lineno)

Yes, there is some complexity in practice, because currently co_lnotab is a ridiculously optimized data structure for encoding the many-to-one mapping, and so some work needs to be done to come up with a similarly optimized way of encoding a many-to-many mapping. But this is all fundamentally trivial. "Compactly encoding a dict of sets of ints" is not the sort of challenge that we should find daunting and impossible.
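For what it's worth, the current one-way mapping is already easy to inspect from Python -- a sketch using dis.findlinestarts (offsets and line numbers vary by version):

    import dis

    def f():
        x = 1
        y = 2
        return x + y

    # yields (bytecode offset, line number) pairs decoded from co_lnotab
    print(list(dis.findlinestarts(f.__code__)))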
Even if we build (and test!) a way to do that, it only covers this particular kind of oddity with optimized code.
Well, this is the only oddity that is causing problems. And future optimizations might well be covered by my proposed mechanism. Any optimization that works by taking in a set of line-number-tagged objects (ast nodes, bytecode instructions, whatever) and spits out a set of new objects could potentially make use of this -- just set the lineno annotation on the output objects to be the union of the lineno annotations on the input objects. Will that actually be enough in practice? Who knows, we'll have to wait until we get there. Trying to handle hypothetical future optimizations now is just borrowing trouble.

And even if we do add a minimal-optimization mode, that shouldn't be taken as a blank check to stop worrying about the debuggability of the default-optimization mode, so we'll still need something like this sooner or later. gdb actually works extremely well on optimized C/C++ code -- sure, sometimes it's a bit confusing and you have to recompile with -O0 to wrap your head around what's happening, but gdb keeps working regardless and I almost never bother. And this is because the C/C++ crowd has spent a lot of time on coming up with solid systems for describing really really complicated relationships between compiler output and the original source code -- much worse than the ones we have to deal with. Just throwing up our hands and giving up seems like a rather cowardly solution.

-n

--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org

On 21 May 2014 22:24, Antoine Pitrou <solipsis@pitrou.net> wrote:
I tend to agree as well. It's a pretty specialised case, and presumably tools similar to coverage for languages like C manage to deal with the issue. Like Raymond, I can't quite explain my reservations, but it feels like this proposal leans towards overspecifying implementation details, in a way that will limit future development of the optimiser. Paul

On 5/21/14 6:17 PM, Paul Moore wrote:
BTW: As C programmers know, if you want to debug your program, you use the -O0 switch. Debugging is about reasoning about the code rather than executing it. Trying to debug optimized C code is very difficult, because nothing matches your expectations. If, as others in this thread have said, we expect the set of optimizations to grow, the need for an off switch will become greater, even to debug the code.
If by implementation details, you mean the word "peephole", then let's remove it, and simply have a switch that disables all optimization. Rather than limiting the future of the optimizer, it will provide an escape hatch for people who would rather not have the optimizer's effects. --Ned.

On 5/21/2014 6:59 PM, Ned Batchelder wrote:
The presumption of this idea is that there is a proper, canonical unoptimized version of 'compiled Python'. For Python there obviously is not. For CPython, there is not either. What Raymond has been saying is that the output of the CPython compiler is the output of the CPython compiler.

Sys.settrace is not intended to mandate anything. It reports on the operations of a particular version of CPython as well as it can with the line number table it gets. The existence of the table is not mandated by the language definition, but is provided on a best-effort basis. Another issue on the tracker points out that if an ast is constructed directly, and then compiled, 'source line numbers' have no meaning.

When I used coverage (last summer) with tested Idle modules, I could not get a reported 100% coverage because coverage counts the body of a final "if __name__ == '__main__':" statement. So I had to visually check that those were the only 'uncovered' lines. I do not see doing the same for 'uncovered' continue as much different. In either case, coverage could leave such lines out of the denominator.

--
Terry Jan Reedy

On 5/22/2014 4:43 AM, Antoine Pitrou wrote:
Not directly, but yes, indirectly via --rcfile=FILE, where FILE defaults to .coveragerc and the configuration file has

    [report]
    exclude_lines =
        if __name__ == .__main__.:

I believe Ned pointed that out to me when I reported the 'problem' to him. If 'continue' were added under 'exclude_lines', the 'can't get 100% coverage' continue issue should go away also. (Yes, I know it is not quite that simple, as there will be times when continue is skipped that should be reported. But I suspect that there will nearly always be some other line skipped and reported, so that a false 100% will be rare.)

--
Terry Jan Reedy

On 5/22/14 2:44 AM, Terry Reedy wrote:
I'd like to understand why we think the Python compiler is different in this regard than a C compiler. We all use C compilers that have a -O0 switch. It's there to disable optimizations so that programs can be debugged. The C compiler also has no "canonical unoptimized compiled output". But the switch is there to make it possible to debug (reason about) the compiled code.

I don't care if we have a command line switch or some other mechanism to disable optimizations. I just think it's useful to be able to do it somehow.

When this came up 18 months ago on Python-Dev, it was part of a thread about adding more optimizations to CPython. Guido said "+1" to the idea of being able to disable the optimizers (https://mail.python.org/pipermail/python-dev/2012-December/123099.html).

Our need is not as great as C's, the unrecognizability of the compiled code is much less, but current optimizations are already interfering with the ability to debug and analyze code, and new optimizations will only broaden the possibility of interference.

--Ned.

On 5/22/14 10:29 AM, Paul Moore wrote:
I put this idea here because the discussion on issue2506 got involved enough that someone suggested this was the right place for it. I linked to Guido's sentiment in my initial post here, and had hoped that he would chime in. --Ned.

On 22 May 2014 16:29, Ned Batchelder <ned@nedbatchelder.com> wrote:
OK, thanks for the summary. Personally, I still think the biggest issue is around pyc files. I think any proposal needs an answer to that (even if it's just that no-optimisation mode never reads or writes bytecode files). Expecting users to manually manage pyc files is a bad idea. Well, that and any implementation complexity, which I'll leave to others to consider. Paul

On 22.05.2014 17:39, Paul Moore wrote:
Why not simply have the new option disable writing PYC files?

--
Marc-Andre Lemburg
eGenix.com

On 22 May 2014 16:41, M.-A. Lemburg <mal@egenix.com> wrote:
Why not simply have the new option disable writing PYC files ?
That's what I said. But you also need to not read them as well, because otherwise you could read an optimised file if the source hasn't changed. Paul

On 22.05.2014 17:46, Paul Moore wrote:
Good point :-)

--
Marc-Andre Lemburg
eGenix.com

On 5/22/14 11:49 AM, M.-A. Lemburg wrote:
For the use-case I am considering, it would be best to write .pyc files as usual. These are large test suites that already have detailed choreography, usually involving new working trees for each run, or explicitly deleted pyc files. Avoiding pyc's altogether will slow things down, and test suites are universally considered to take too long as it is. --Ned.

On May 22, 2014 9:40 AM, "Paul Moore" <p.f.moore@gmail.com> wrote:
So the flag for that would be set implicitly? That sounds reasonable (and easy).
As a fallback, Victor already pointed out that changing sys.implementation.cache_tag would be easy too. -eric
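A sketch of what that fallback looks like in practice (per PEP 421, a cache_tag of None tells the import system not to cache bytecode at all):

    import sys

    print(sys.implementation.cache_tag)    # e.g. 'cpython-34'
    # a hypothetical tool could do this before importing the code under test:
    # sys.implementation.cache_tag = None  # disables reading/writing pyc files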

On 5/22/2014 9:24 AM, Ned Batchelder wrote:
I'd like to understand why we think the Python compiler is different in this regard than a C compiler.
Python is a different language. But let us not get sidetracked on that.
I read that, and it is not clear to me exactly what his quick, top-posted '+1' really means. I claimed in response to Marc-Andre that CPython has always had an as-if rule and numerous optimizations, some of which cannot, realistically, be disabled. Nor would we really want to disable 'all optimization' (as you requested in your post).

My objection to 'disable the peephole optimizer' is that it likely disables too much, and perhaps too little (as more is done with asts). Also, it seems it may add a continuing burden to a relatively small core developer team, which also has a stdlib to maintain.

I think we should initially focus on the ghosting of 'continue'. While the coverage problem can be partly solved by adding 'continue' to 'exclude_lines', that will not solve the problem of a debugger checkpoint not working. I think you could argue (very Pythonically ;-) that the total machine-time saving of ghosting 'continue' is not worth the extra time wasted by humans. I would be happier removing that particular optimization than adding machinery to make it optional.

If, as has been proposed, some or all of the peephole (code) optimizations were moved to the ast stage, where continue jumps are still distinguished by Continue nodes, it might be easier to selectively avoid undesirable ghosting of continue statements.

--
Terry Jan Reedy

On Thu, 22 May 2014 22:53:28 -0400 Terry Reedy <tjreedy@udel.edu> wrote:
The number one difference is that people don't compile code explicitly when writing Python code (well, except packagers who call compileall(), and a few advanced uses). So "choosing compilation options" is really not part of the standard workflow for developing in Python. Regards Antoine.

On 5/23/14 5:53 AM, Antoine Pitrou wrote:
That seems an odd distinction to make, given that we already do have ways to control how the compilation step happens, and we are having no trouble imagining other ways to control it. Whether you like those options or not, you have to admit that we do have ways to tell Python how we want compilation to happen.

On Fri, 23 May 2014 06:39:54 -0400 Ned Batchelder <ned@nedbatchelder.com> wrote:
My point is that almost nobody ever cares about them. The standard model for executing Python code is "python mycode.py" or "python -m mymodule". Compilation is invisible to the average user. Regards Antoine.

On 5/23/2014 6:39 AM, Ned Batchelder wrote:
They are not used much, and I doubt that anyone is joyous at the status quo. Which is why your proposal looks more inviting (to me, and I think to some others) as part of a reworking of the clumsy status quo than as a clumsy add-on. -- Terry Jan Reedy

On Wed, May 21, 2014 at 11:50 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
AFAICT the only ways to make coverage.py "smart enough" would be:

1) Teach coverage.py to perform a full (sound) reachability analysis on bytecode.

2) Teach coverage.py to notice when a jump instruction doesn't go where you might expect it to based on a naive reading of the source code, and then reverse-engineer from this what sequence of jump instructions must have been merged to produce the one we observe. I guess in practice this probably would require carrying around a patched copy of the full compiler code from every Python release.

The problem here is that the Python compiler is throwing away information that only it has. Asking coverage.py to reconstruct that without help from the compiler isn't reasonable IMO.

-n

--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org

On 5/21/14 3:44 PM, Raymond Hettinger wrote:
Yes, this could mean an increased testing burden. But that scales horizontally, and will not require a large amount of engineering work. Besides, what better way to test the optimizer?
I'm perfectly happy to remove the word "peephole" from the feature. If we expect the set of optimizations to grow in the future, then we can expect that more cases of code analysis will be misled by optimizations. All the more reason to establish a way now that will disable all optimizations.
I don't think you should dismiss real users' concerns as a curiosity. We already have -X as a way to provide implementation-specific switches, so I'm not sure why the CPython-only nature of this is an issue.
This proposal only applies to CPython.
I'm happy to remove the word "peephole". I think a way to disable optimization is useful. I've heard the concern from a number of coverage.py users. If, as we all think, optimizations will expand in CPython, then the number of mis-diagnosed code problems will grow. --Ned.

On May 21, 2014, at 6:51 PM, Ned Batchelder <ned@nedbatchelder.com> wrote:
I think it has impacted its usability. I've certainly burned some amount of time trying to figure out why an optimized line was showing up as uncovered.

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA

On Wed, May 21, 2014 at 4:51 PM, Ned Batchelder <ned@nedbatchelder.com> wrote:
I buy that to an extent. It would definitely be helpful when adding or changing optimizations, particularly to identify the impact of changes both in semantics and performance. However, work on optimizations isn't too common. Aside from direct work on optimizations, optimization-free testing could be useful for identifying optimizer-related bugs (which I expect are quite rare). However, that doesn't add a lot of benefit over a normal buildbot run, considering that each run has few changes it is testing. Having said all that, I think it would still be worth testing with and without optimizations. Unless the optimizations are platform-specific, would we need more than one buildbot running with optimizations turned off?
While the use-case is very specific, I think it's a valid motivator for a means of disabling all optimizations, particularly if disabling optimizations is isolated to a very focused location as you've indicated. The big question then is the impact on implementing optimizations (in general) in the future. There has been talk of AST-based optimizations. Raymond indicates that this makes it harder to conditionally optimize. So how much harder would it make this future optimization work? Is that a premature optimization? <wink>
If optimizations can break coverage tools when run on other Python implementations, does that make a case for a more general command-line option? Or is it just a matter of CPython's optimizations behaving badly by breaking some perceived invariants that coverage tools rely on, while other implementations behave correctly? If it's the latter, then perhaps Python needs a few tests added to the test suite that verify the optimizer doesn't break the invariants. Such tests would benefit all implementations. However, even if it's the right approach, if the burden of fixing things is so much more than the burden of adding a no-optimizations option, it may make more sense to just add the option and move on. It's all about who has the time to do something about it. (And of course "Now is better than never. Although never is often better than *right* now.")

Of course, if the coverage tools rely on CPython implementation details then an implementation-specific -X option makes even more sense. FWIW, regardless of the scenario, a -X option makes practical sense in that it would relatively immediately relieve the (infrequent? but onerous) pain point encountered in coverage tools. However, keep in mind that such an option would not be backported and would not be released until 3.5 (in late 2015). So I suppose it would be more about relieving future pain than helping current coverage tool users.
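For context, -X options already surface at runtime through sys._xoptions -- a sketch, where "nooptimize" is purely hypothetical and not an existing flag:

    # run as: python -X nooptimize script.py   (hypothetical option)
    import sys

    opts = sys._xoptions           # dict populated from -X name[=value] arguments
    if opts.get("nooptimize"):
        print("would compile without optimizations here")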
The comparison made elsewhere with the -O0 option in other compilers is also appropriate here. -eric

On 5/21/14 8:17 PM, Eric Snow wrote:
I don't understand the claim that AST transformations will have less ability to switch something on-and-off. The very term "AST transformations" outlines the implementation: step 1, construct an AST; step 2, transform the AST; step 3, generate code from the AST. My proposal is that a switch would let you skip step 2. This is analogous to the current optimizer, which generates bytecode, then as a separate (and skippable!) step, performs peephole optimizations on that bytecode. --Ned.
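A minimal sketch of those three steps, with a hypothetical no-op MyOptimizer standing in for whatever transformations CPython might grow:

    import ast

    class MyOptimizer(ast.NodeTransformer):
        pass   # placeholder: a real optimizer would rewrite nodes here

    def compile_source(src, filename="<src>", optimize=True):
        tree = ast.parse(src, filename)             # step 1: construct the AST
        if optimize:                                # step 2: transform (the skippable step)
            tree = MyOptimizer().visit(tree)
            ast.fix_missing_locations(tree)
        return compile(tree, filename, "exec")      # step 3: generate code

    exec(compile_source("print(1 + 2)", optimize=False))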

On 05/21/2014 03:51 PM, Ned Batchelder wrote:
I think the big part of the problem is that there are more than just peephole optimizations. For example, what about all the fast-path optimizations? Do we want to be able to turn those off? How about the heapq optimizations that Raymond put in a few months ago? As Nick suggested, I think it would be better to fix whichever part is broken that allows dead code to stay in the bytecode.

--
~Ethan~

On Wed, May 21, 2014 at 8:44 PM, Raymond Hettinger <raymond.hettinger@gmail.com> wrote:
I've certainly been frustrated by this wart in coverage.py's output -- if one uses a dev cycle where you constantly review every uncovered line to make sure that tests are doing what you want, then even a small number of spurious uncovered lines that appear and disappear based on the optimizer's whim can result in a lot of wasted time. (Not to mention the hours wasted the first time I ran into this, trying to figure out why my tests weren't working and writing new ones specifically to target the optimized-out line...) That said, I'm also sympathetic to your point. Isn't the real problem here that the peephole optimizer violates the first rule of optimization ("don't change semantics") by breaking sys.settrace? Couldn't we fix this directly? One approach might be to enhance co_lnotab (if anyone dares touch it) so that it can record that a peepholed jump instruction logically belongs to multiple *different* lines, and when we encounter such an instruction we call the trace function multiple times. Then the peephole optimizer just has to propagate line number information whenever it short-circuits a jump. Or perhaps it would be enough to add a dead-code optimization pass after the peephole optimizer, so that coverage.py can at least see that things like Ned's "continue" didn't actually generate any code. (This is suboptimal as well, since it will still cause coverage.py to produce somewhat confusing output, as if the "continue" line had a comment instead of real code -- but it'd still be better than the status quo.) -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org

(First, shouldn't this be in the "disable all peephole optimizations" thread? Raymond seems to have replied to the digest..!) On Wed, May 21, 2014 at 2:14 PM, Nathaniel Smith <njs@pobox.com> wrote:
I agree with this. Adding a command line flag to tinker with code generation may well be opening a can of worms, but "the peephole optimizer shouldn't change semantics" is a more compelling argument, although fixing it from that angle is obviously more involved. One problem is that functions like settrace() expose low-level details to the higher-level semantics. It's a fair question as to whether it should be considered kosher to expose implementation details like the peephole optimizer through such interfaces. I could get behind an implementation that hides the erasure of lines that are still (semantically) being executed, without disabling the peephole optimizer. - Trip On Wed, May 21, 2014 at 2:14 PM, Nathaniel Smith <njs@pobox.com> wrote:

On 21 May 2014 23:30, Trip Volpe <trip@flowroute.com> wrote:
While I'm happy to be proved wrong with code, my instinct is that "making sys.settrace work" would likely be too complex to be practical. In any case, as you say, it exposes low-level details, and I would personally consider "glitches" like this as implementation details. To put it another way, I don't consider the exact lines traced by sys.settrace to be part of the semantics of a program, any more than I consider the output of dis.dis to be. So in my view it is acceptable for the optimiser to change the lines that get traced in the way that coverage experienced. Paul.

On 5/21/14 5:37 PM, Ethan Furman wrote:
I'm not sure what can of worms you are imagining. Let's look to our experience with C compilers. They have a switch to disable optimization. What trouble has that brought? When I think of problems with optimizers in C compilers, I think of incorrect or buggy optimizations. I can't think of something that has gone wrong because there was a switch to turn it off. People in this thread have contrasted this proposal with an apparent desire to expand the set of optimizations performed. It seems to me that the complexity and danger lie in expanded optimizations, not disabled ones. --Ned.

On Wed, May 21, 2014 at 07:04:58PM -0400, Ned Batchelder wrote:
I can't think of something that has gone wrong because there was a switch to turn it off.
Are you serious? Somehow I'm reminded of the funroll-loops.info Gentoo parody site. As others mention, there is a difficult to quantify, but very real non-zero cost in introducing new major execution modes.
When I think of problems with optimizers in C compilers, I think of incorrect or buggy optimizations.
Sure, it if were still the early 90s. Most optimization bugs come from inexperienced developers relying on undefined behaviour of one form or another, and Python doesn't suffer from UB quite the way C does.
Agreed, and so I'd suggest a better fix would be removing the peephole optimizer, for the little benefit that it offers, if it could be shown that it really truly does hinder peoples' comprehension of Python. It seems the proposed feature is all about avoiding saying "oh, don't worry about that for the moment" while teaching, assuming the question comes up at all. Adding another special case to disable a minor performance improvement seems pointless when the implementation is slow regardless, kind of along the same lines as adding another -O or -OO flag, and we all know how useful they ended up being. If there really was a problem here, it seems preferable to just remove the optimizer entirely and find more general ways to fix performance without creating a mess. David

On 5/21/14 8:10 PM, dw+python-ideas@hmmz.org wrote:
The point is not about teaching Python. It's about getting useful information from code analysis tools. When you run coverage tools or debuggers, you are hoping to learn something about your code. It is bad when those tools give you incorrect or misleading information. Being able to disable the optimizer will prevent certain kinds of incorrect information. --Ned.

On Wed, 21 May 2014 19:04:58 -0400 Ned Batchelder <ned@nedbatchelder.com> wrote:
Python's usage model does not contain the notion of compiler optimizations. Hardly anybody uses the misnamed -O flags. There is a single compilation mode, which everyone is content with. It is part of the simplicity of the language (or, at least, of CPython); by adding some flags than can affect the level of "optimization" you make the model more complicated to understand for users, and to support for us. (having used coverage several times, I haven't found those missed lines really annoying, by the way; not to the point that I would have wanted a specific command-line flag to disable optimizations) The use case for disabling optimizations in C is to make programs actually debuggable. Python doesn't have that problem. Regards Antoine.

On 22 May 2014 09:52, Antoine Pitrou <solipsis@pitrou.net> wrote:
As a concrete example, note my earlier comment about pyc files. Switching off optimisation results in unoptimised bytecode being written to pyc files, which could then be read in a subsequent (supposedly) optimised run. And vice versa. This may not be a huge problem for the coverage use case, but it does add an extra level of complexity into the model of caching bytecode. Handwaving it away as "not a big deal - just delete the bytecode files before and after the coverage run" doesn't alter the fact that the bytecode caching model isn't handling the new mode properly. Paul

On May 22, 2014, at 10:02 AM, Paul Moore wrote:
Seems to me that PEP 3147 tagging could be extended to describe various optimization levels. It might even be nice to get rid of the overloaded .pyo files. The use of .pyo for both -O and -OO optimization levels causes some issues. -Barry

Am 22.05.2014 10:52, schrieb Antoine Pitrou:
The use case for disabling optimizations in C is to make programs actually debuggable. Python doesn't have that problem.
Well, setting a breakpoint to the 'continue' line in Ned's test code and running it with pdb does NOT trigger the breakpoint. So 'Python doesn't have this problem' is not really true. Thomas

On Thu, May 22, 2014 at 10:50 PM, Thomas Heller <theller@ctypes.org> wrote:
Correct me if I'm wrong, but as I understand it, the problem is that the peephole optimizer eliminated an entire line of code. Would it be possible to have it notice when it merges two pieces from different lines, and somehow mark that the resulting bytecode comes from both lines? That would solve the breakpoint and coverage problems simultaneously. ChrisA

On Thu, May 22, 2014 at 8:05 AM, Chris Angelico <rosuav@gmail.com> wrote:
It seems to me that Ned has revealed a bug in the peephole optimizer. It zapped an entire source line's worth of bytecode, but failed to delete the relevant entry in the line number table of the resulting code object. If I had my druthers, that would be the change I'd prefer. That said, I think Ned's proposal is fairly simple. As for the increased testing load, I think the extra cost would be the duplication of the buildbots (or the adjustment of their setup to test with -O and -O0 flags). Is it still the case that -O effectively does nothing (maybe only eliding __debug__ checks)? Skip

On 5/22/14 9:49 AM, Skip Montanaro wrote:
I think it is the nature of optimization that it will destroy useful information. I don't think it will always be possible to retain enough back-mapping that the optimized code can be understood as if it had not been optimized. For example, the debug issue would still be present: if you run pdb and set a breakpoint on the "continue" line, it will never be hit. Even if the optimizer cleaned up after itself perfectly (in fact, especially so), that breakpoint will still not be hit. You simply cannot reason about optimized code without having to mentally understand the transformations that have been applied. The whole point of this proposal is to recognize that there are times (debugging, coverage measurement) when optimizations are harmful, and to avoid them.

On 22.05.2014 17:32, Ned Batchelder wrote:
The whole point of this proposal is to recognize that there are times (debugging, coverage measurement) when optimizations are harmful, and to avoid them.
+1 It's regular practice in other languages to disable optimizations when debugging code. I don't see why Python should be different in this respect. Debuggers, testing, coverage and other such tools should be able to invoke a Python runtime mode that let's the compiler work strictly by the book, without applying any kind of optimization. This used to be the default in Python, but over the years, we gradually moved away from this as default, with no options to get the old non-optimizing behavior back. I think it's fine to make safe optimizations default in Python, but there's definitely a need for being able to run Python in a debugger without having it perfectly valid skip code lines (even if they are no ops). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 22 2014)
::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/

On 5/22/2014 11:40 AM, M.-A. Lemburg wrote:
I believe that Python has always had an 'as if' rule that allows more or less 'hidden' optimizations, as long as the net effect of a statement is as defined. 1. By the book, "a,b = b,a" means create a tuple from b,a, unpack the contents to a and b, and delete the reference to the tuple. An obvious optimization is to not create the tuple. As I remember, this was once tried out before tuple unpacking was generalized to iterable unpacking. I don't know if CPython was ever released with that optimization, or if other implementations have or do use it. By the 'as if' rule, it does not matter, even though an allocation tracer (such as the one added to 3.4?) might detect the non-allocation. 2. The manual says ''' @f1(arg) @f2 def func(): pass is equivalent to def func(): pass func = f1(arg)(f2(func)) ''' The equivalent is 'as if', in net effect, not in the detailed process. CPython actually executes (or at least did at one time) def <internal rereference>(): pass func = f1(arg)(f2(<internal reference>)) Ignore f1. The difference can be detected when f2 is called by examining the approriate namespace within f2. When someone filed an issue about the 'bug' of 'func' never being bound to the unwrapped function object, Guido said that he neither wanted to change the doc or the implementation. (Sorry, I cannot find the issue.) 3. "a + b" is *usually* equivalent to "a.__class__.__add__(b)" or possibly "b.__class__.__radd__(a)". However, my understanding is that if a and b are ints, a 'fast path' optimization is applied that bypasses the int.__add slot wrapper. Is so, a call tracer could notice the difference and if unaware of such optimizations, falsely report a problem. 4. Some Python implementations delay object destruction. I suspect that some (many?) do not really destroy objects (zero out the memory block).
This is a different issue from 'disable the peephole optimizer'. -- Terry Jan Reedy

On 23.05.2014 04:07, Terry Reedy wrote:
I was referring to the times before the peephole optimizer was introduced (Python 2.3 and earlier). What's important here is to look at the difference between what the compiler generates by simply following its rule book and the version of the byte code which is the result of running an optimizer on the byte code or even on the AST before running the transform to byte code. Note that I'm not talking about optimizations applied at the VM level implementations of bytecodes and I think neither was Ned.
This is an implementation detail of the VM. The code generated by the compiler is byte code saying rotate the top two arguments on the stack (ROT_TWO).
I'd put that under documentation bug, if at all :-) Note that the function func does get the name "func". It's just not bound to the name in the intermediate step, since the function object serves as parameter to the function f2.
Again, this is an optimization in the implementation of the byte code, not one applied by the compiler. There are quite a few more such optimizations going in the VM.
4. Some Python implementations delay object destruction. I suspect that some (many?) do not really destroy objects (zero out the memory block).
I don't see what this has to do with the compiler. Isn't that just a implementation detail of how GC works on a particular Python platform ?
For me, a key argument for having a runtime mode without compiler optimizations is that the compiler gains more freedom in applying more aggressive optimizations. Tools will no longer have to adapt to whatever optimizations are added with each new Python release, since there will be a defined non-optimized runtime mode they can use as basis for their work. The net result would be faster Pythons and better working debugging tools (well, at least that's the hope ;-). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 23 2014)
::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/

On 5/23/2014 4:25 AM, M.-A. Lemburg wrote:
I have tried to say that the 'rule book' at a particular stage is not a fixed thing. There are several tranformations from source to CPython bytecode. The order and grouping is somewhat a matter of convenience. However, leave that aside. What Ned wants and what Guido has supported is that there be an option to get bytecode that is friendly to execution analysis. They can decide what constraints that places on the end product and therefore on the multiple transformation processes.
Stability is certainly a useful constraint.
The net result would be faster Pythons and better working debugging tools (well, at least that's the hope ;-).
Good point. It appears that rethinking the current -O, -OO will help. -- Terry Jan Reedy

On 05/22/2014 08:32 AM, Ned Batchelder wrote:
Having read through the issue on the tracker, I find myself swayed towards Ned's point of view. However, I do still agree with Raymond that a full-fledged command-line switch is overkill, especially since the unoptimized runs are very special-cased (meaning useful for debugging, coverage, curiosity, learning about optimizing, etc.). If we had a sys flag that could be set before a module was loaded, then coverage, pdb, etc., could use that to recompile the source, not save a .pyc file, and move forward. For debugging purposes perhaps a `__no_optimize__ = True` or `from __future__ import no_optimize` would help in those cases where you're dropping into the debugger. The dead-code elimination still has a bug to be fixed, though: if a line has been optimized away, trying to set a break-point at it should fail. -- ~Ethan~

On 5/22/14 11:43 AM, Ethan Furman wrote:
I'm perfectly happy to drop the idea of the command-line switch. An environment variable would be a fine way to control this behavior.
I don't understand these ideas, but having to add an import to the top of the file seems like a non-starter to me.
If we get a way to disable optimization, we don't need to fix that bug. Everyone knows that optimized code acts oddly in debuggers. :)
-- ~Ethan~

On 22May2014 08:43, Ethan Furman <ethan@stoneleaf.us> wrote:
I've been with Ned from the first post, but have been playing (slow) catchup on the discussion. I'd personally be fine with a -O0 command line switch in keeping with a somewhat common C-compiler convention, or with an environment variable. If all the optimizations in the compiler/interpreter are a distinct step, then having a switch that just says "skip this step, we do not want the naive code transformed at all" seems both desirable and easy. And finally, the sig quote below really did come up at random for this message. Cheers, Cameron Simpson <cs@zip.com.au> We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. - Donald Knuth

On Thu, May 22, 2014 at 4:32 PM, Ned Batchelder <ned@nedbatchelder.com> wrote:
In this particular case, the back-mapping problem is pretty minor. IIUC the optimization is that if we have (abusing BASIC notation)

    10 GOTO 20
    20 GOTO 30
    30 ...

then in fact the operations at lines 10 and 20 are, from the point of view of the rest of the program, indivisible -- every time you execute 10 you also execute 20, there is no way to tell from outside whether we paused in between executing 10 and 20, etc. Effectively we just have a single uber-instruction that does both:

    (10, 20) GOTO 30
    30 ...

So from the coverage point of view, just marking line 20 as covered every time line 10 is executed is the Right Thing To Do. From the debugging point of view, a breakpoint set at line 20 should just trip whenever line 10 is executed -- it's not like there's any way to tell whether we're "half way through" the jump sequence or not. It's a pretty solid abstraction. -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org
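(A rough Python rendering of the same shape, untested and version-dependent -- the exact bytecode the peepholer produces varies across CPython releases:)

    import dis

    def f(values):
        for v in values:
            if v % 2:
                if v > 10:
                    print(v)
                continue  # target of a jump-to-jump; the peepholer can
                          # retarget the inner test straight to the loop
                          # head, so this line sometimes never 'executes'
            print(-v)

    dis.dis(f)  # inspect which source lines the surviving jumps carry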

On 5/22/14 1:16 PM, Nathaniel Smith wrote:
You've used the word "just" three times, glossing over the fact that we have no facility for marking statements as an uber instruction, and you've made no proposal for how it might work. Even if we build (and test!) a way to do that, it only covers this particular kind of oddity with optimized code. --Ned.

On Thu, May 22, 2014 at 9:17 PM, Ned Batchelder <ned@nedbatchelder.com> wrote:
What we have right now is co_lnotab. It encodes a many-to-one mapping from bytecode locations to line number:

    # bytecode offset -> line no
    lnotab = {
        0: 10,
        1: 10,
        2: 10,
        3: 11,
        4: 12,
        ...
    }

AFAIK, the main operations it supports are (a) given a bytecode location, return the relevant line (for backtraces etc.), (b) when executing bytecode, detect transitions from an instruction associated with one line to an instruction associated with another line (for sys.settrace, used by coverage and pdb).

    def backtrace_lineno(offset):
        return lnotab[offset]

    def do_trace(offset1, offset2):
        if lnotab[offset1] != lnotab[offset2]:
            call_trace_fn(lnotab[offset2])

My proposal is to make this a many-to-many mapping:

    lnotab = {
        0: {10},
        1: {10},
        2: {10, 11},  # optimized jump
        3: {12},
        ...
    }

    def backtrace_lineno(offset):
        # if there are multiple linenos, then it's indistinguishable which
        # one the exception occurred on, so just pick one to display
        return min(lnotab[offset])

    def do_trace(offset1, offset2):
        for lineno in sorted(lnotab[offset2].difference(lnotab[offset1])):
            call_trace_fn(lineno)

Yes, there is some complexity in practice because currently co_lnotab is a ridiculously optimized data structure for encoding the many-to-one mapping, and so some work needs to be done to come up with a similarly optimized way of encoding a many-to-many mapping. But this is all fundamentally trivial. "Compactly encoding a dict of sets of ints" is not the sort of challenge that we should find daunting and impossible.
Even if we build (and test!) a way to do that, it only covers this particular kind of oddity with optimized code.
Well, this is the only oddity that is causing problems. And future optimizations might well be covered by my proposed mechanism. Any optimization that takes in a set of line-number-tagged objects (ast nodes, bytecode instructions, whatever) and spits out a set of new objects could potentially make use of this -- just set the lineno annotation on the output objects to be the union of the lineno annotations on the input objects. Will that actually be enough in practice? Who knows, we'll have to wait until we get there. Trying to handle hypothetical future optimizations now is just borrowing trouble. And even if we do add a minimal-optimization mode, that shouldn't be taken as a blank check to stop worrying about the debuggability of the default-optimization mode, so we'll still need something like this sooner or later. gdb actually works extremely well on optimized C/C++ code -- sure, sometimes it's a bit confusing and you have to recompile with -O0 to wrap your head around what's happening, but gdb keeps working regardless and I almost never bother. And this is because the C/C++ crowd has spent a lot of time coming up with solid systems for describing really really complicated relationships between compiler output and the original source code -- much worse than the ones we have to deal with. Just throwing up our hands and giving up seems like a rather cowardly solution. -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org
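(To make the union idea concrete, a hypothetical sketch -- the Instruction record is invented for illustration; the real peepholer manipulates raw bytecode, not objects like this:)

    from collections import namedtuple

    # Invented record type, for illustration only.
    Instruction = namedtuple('Instruction', 'opcode target linenos')

    def collapse_jump_to_jump(jump_a, jump_b):
        # jump_a's target is jump_b: emit one jump straight to jump_b's
        # target, tagged with the union of both line-number sets so that
        # tracing and coverage still see every logical source line.
        return Instruction(jump_a.opcode, jump_b.target,
                           jump_a.linenos | jump_b.linenos)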

On 21 May 2014 22:24, Antoine Pitrou <solipsis@pitrou.net> wrote:
I tend to agree as well. It's a pretty specialised case, and presumably tools similar to coverage for languages like C manage to deal with the issue. Like Raymond, I can't quite explain my reservations, but it feels like this proposal leans towards overspecifying implementation details, in a way that will limit future development of the optimiser. Paul

On 5/21/14 6:17 PM, Paul Moore wrote:
BTW: As C programmers know, if you want to debug your program, you use the -O0 switch. Debugging is about reasoning about the code rather than executing it. Trying to debug optimized C code is very difficult, because nothing matches your expectations. If, as others in this thread have said, we expect the set of optimizations to grow, the need for an off switch will become greater, even to debug the code.
If by implementation details, you mean the word "peephole", then let's remove it, and simply have a switch that disables all optimization. Rather than limiting the future of the optimizer, it will provide an escape hatch for people who would rather not have the optimizer's effects. --Ned.

On 5/21/2014 6:59 PM, Ned Batchelder wrote:
The presumption of this idea is that there is a proper, canonical unoptimized version of 'compiled Python'. For Python there obviously is not. For CPython, there is not either. What Raymond has been saying is that the output of the CPython compiler is the output of the CPython compiler. sys.settrace is not intended to mandate anything; it reports on the operations of a particular version of CPython as well as it can with the line number table it gets. The existence of the table is not mandated by the language definition, but is provided on a best-effort basis. Another issue on the tracker points out that if an ast is constructed directly and then compiled, 'source line numbers' has no meaning. When I used coverage (last summer) with tested Idle modules, I could not get a reported 100% coverage because coverage counts the body of a final "if __name__ == '__main__':" statement. So I had to visually check that those were the only 'uncovered' lines. I do not see doing the same for 'uncovered' continue as much different. In either case, coverage could leave such lines out of the denominator. -- Terry Jan Reedy

On 5/22/2014 4:43 AM, Antoine Pitrou wrote:
Not directly, but yes, indirectly via --rcfile=FILE, where FILE defaults to .coveragerc and the configuration file has

    [report]
    exclude_lines =
        if __name__ == .__main__.:

I believe Ned pointed that out to me when I reported the 'problem' to him. If 'continue' were added under 'exclude_lines' (a concrete snippet follows below), the 'can't get 100% coverage' continue issue should go away also. (Yes, I know it is not quite that simple, as there will be times when continue is skipped that should be reported. But I suspect that there will nearly always be some other line skipped and reported, so that a false 100% will be rare.) -- Terry Jan Reedy
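(For the record, exclude_lines entries are regular expressions matched against each source line, so the addition Terry describes might look like the following -- untested, and note that listing patterns explicitly replaces coverage.py's defaults:)

    [report]
    exclude_lines =
        if __name__ == .__main__.:
        ^\s*continue\b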

On 5/22/14 2:44 AM, Terry Reedy wrote:
I'd like to understand why we think the Python compiler is different in this regard than a C compiler. We all use C compilers that have a -O0 switch. It's there to disable optimizations so that programs can be debugged. The C compiler also has no "canonical unoptimized compiled output". But the switch is there to make it possible to debug (reason about) the compiled code. I don't care if we have a command line switch or some other mechanism to disable optimizations. I just think it's useful to be able to do it somehow. When this came up 18 months ago on Python-Dev, it was part of a thread about adding more optimizations to CPython. Guido said "+1" to the idea of being able to disable the optimizers (https://mail.python.org/pipermail/python-dev/2012-December/123099.html). Our need is not as great as C's, the unrecognizability of the compiled code is much less, but current optimizations are already interfering with the ability to debug and analyze code, and new optimizations will only broaden the possibility of interference. --Ned.

On 5/22/14 10:29 AM, Paul Moore wrote:
I put this idea here because the discussion on issue2506 got involved enough that someone suggested this was the right place for it. I linked to Guido's sentiment in my initial post here, and had hoped that he would chime in. --Ned.

On 22 May 2014 16:29, Ned Batchelder <ned@nedbatchelder.com> wrote:
OK, thanks for the summary. Personally, I still think the biggest issue is around pyc files. I think any proposal needs an answer to that (even if it's just that no-optimisation mode never reads or writes bytecode files). Expecting users to manually manage pyc files is a bad idea. Well, that and any implementation complexity, which I'll leave to others to consider. Paul

On 22.05.2014 17:39, Paul Moore wrote:
Why not simply have the new option disable writing PYC files? -- Marc-Andre Lemburg

On 22 May 2014 16:41, M.-A. Lemburg <mal@egenix.com> wrote:
Why not simply have the new option disable writing PYC files?
That's what I said. But you also need to not read them as well, because otherwise you could read an optimised file if the source hasn't changed. Paul
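(For the writing half there is an existing knob; the reading half is what would need new machinery. A sketch of what exists today:)

    import sys

    # Existing switch: stops *writing* .pyc files for the rest of the
    # process. It does nothing about *reading* a stale optimized .pyc
    # already on disk, which is Paul's point.
    sys.dont_write_bytecode = True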

On 22.05.2014 17:46, Paul Moore wrote:
Good point :-) -- Marc-Andre Lemburg

On 5/22/14 11:49 AM, M.-A. Lemburg wrote:
For the use-case I am considering, it would be best to write .pyc files as usual. These are large test suites that already have detailed choreography, usually involving new working trees for each run, or explicitly deleted pyc files. Avoiding pyc's altogether will slow things down, and test suites are universally considered to take too long as it is. --Ned.

On May 22, 2014 9:40 AM, "Paul Moore" <p.f.moore@gmail.com> wrote:
So the flag for that would be set implicitly? That sounds reasonable (and easy).
As a fallback, Victor already pointed out that changing sys.implementation.cache_tag would be easy too. -eric
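(A sketch of the cache_tag fallback: importlib consults sys.implementation.cache_tag when naming .pyc files, so giving unoptimized runs a distinct tag keeps them from sharing caches with normal runs. The tag value below is made up for illustration:)

    import sys

    # Hypothetical tag for an unoptimized run; not a real CPython value.
    sys.implementation.cache_tag = 'cpython-34-noopt'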

On 5/22/2014 9:24 AM, Ned Batchelder wrote:
I'd like to understand why we think the Python compiler is different in this regard than a C compiler.
Python is a different language. But let us not get sidetracked on that.
I read that, and it is not clear to me exactly what his quick, top-posted '+1' really means. I claimed in response to Marc-Andre that CPython has always had an as-if rule and numerous optimizations, some of which cannot, realistically, be disabled. Nor would we really want to disable 'all optimization' (as you requested in your post). My objection to 'disable the peephole optimizer' is that it likely disables too much, and perhaps too little (as more is done with asts). Also, it seems it may add a continuing burden to a relatively small core developer team, which also has a stdlib to maintain. I think we should initially focus on the ghosting of 'continue'. While the coverage problem can be partly solved by adding 'continue' to 'exclude_lines', that will not solve the problem of a debugger checkpoint not working. I think you could argue (very Pythonically ;-) that the total machine-time saving of ghosting 'continue' is not worth the extra time it wastes for humans. I would be happier removing that particular optimization than adding machinery to make it optional. If, as has been proposed, some or all of the peephole (code) optimizations were moved to the ast stage, where continue jumps are still distinguished by Continue nodes, it might be easier to selectively avoid undesirable ghosting of continue statements. -- Terry Jan Reedy

On 05/22/2014 07:53 PM, Terry Reedy wrote:
In the interest of not debating what Guido meant way back when, he has posted (today?) that "I am strictly with Ned here." I think we can count that as a +1 for Ned's request. -- ~Ethan~

On Thu, 22 May 2014 22:53:28 -0400 Terry Reedy <tjreedy@udel.edu> wrote:
The number one difference is that people don't compile code explicitly when writing Python code (well, except packagers who call compileall(), and a few advanced uses). So "choosing compilation options" is really not part of the standard workflow for developing in Python. Regards Antoine.

On 5/23/14 5:53 AM, Antoine Pitrou wrote:
That seems an odd distinction to make, given that we already do have ways to control how the compilation step happens, and we are having no trouble imagining other ways to control it. Whether you like those options or not, you have to admit that we do have ways to tell Python how we want compilation to happen.

On Fri, 23 May 2014 06:39:54 -0400 Ned Batchelder <ned@nedbatchelder.com> wrote:
My point is that almost nobody ever cares about them. The standard model for executing Python code is "python mycode.py" or "python -m mymodule". Compilation is invisible to the average user. Regards Antoine.

On 5/23/2014 6:39 AM, Ned Batchelder wrote:
They are not used much, and I doubt that anyone is joyous at the status quo. Which is why your proposal looks more inviting (to me, and I think to some others) as part of a reworking of the clumsy status quo than as a clumsy add-on. -- Terry Jan Reedy
participants (18)
- Antoine Pitrou
- Barry Warsaw
- Cameron Simpson
- Chris Angelico
- Donald Stufft
- dw+python-ideas@hmmz.org
- Eric Snow
- Ethan Furman
- Greg Ewing
- M.-A. Lemburg
- Nathaniel Smith
- Ned Batchelder
- Paul Moore
- Raymond Hettinger
- Skip Montanaro
- Terry Reedy
- Thomas Heller
- Trip Volpe