
2011/1/1 Steven D'Aprano <steve@pearwood.info>
I wonder whether we need to make that guarantee? Perhaps we should distinguish between "safe" optimizations, like constant folding which can't change behaviour, and "unsafe" optimizations which can go wrong under (presumably) rare circumstances. The compiler can continue to apply whatever safe optimizations it likes, but unsafe optimizations must be explicitly asked for by the user. If subtle or not subtle bugs occur, well, Python does allow people to shoot themselves in the foot.
Do we consider removing local variables (as a result of internal optimizations) a safe or an unsafe operation? Do we consider local variable values "untouchable"? Think about a locals() call that returns a list as the value of a variable; lists are mutable objects, so the caller can change them, but the internally generated bytecode may work on a "private" (on-stack) copy which doesn't "see" the changes made through the locals() call. There's also tracing to consider. When tracing is enabled, the "handler" cannot find some variables because of some optimizations. Another funny thing that can happen: if I "group together" several assignment operations into a single "multi-assignment" one (another optimization I have been thinking about for a long time) and you are tracing it, only one tracing event will be generated instead of n. Are such optimizations "legal" / "safe"? For me the answer is yes, because I think they must be implementation-specific.
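To make the two concerns above concrete, here is a minimal sketch of what an optimizer would have to preserve (or be allowed to break). It assumes CPython; the function names are invented for illustration and nothing here reflects an actual optimizer.

```python
import sys


def show_locals_aliasing():
    items = [1, 2, 3]
    # locals() hands out a reference to the same list object the local
    # name is bound to, so a mutation made through it is visible here.
    locals()["items"].append(4)
    return items


def three_assignments():
    a = 1
    b = 2
    c = 3
    return a + b + c


def line_events(func):
    """Return the line offsets (relative to the def line) traced in func."""
    offsets = []

    def tracer(frame, event, arg):
        if frame.f_code is func.__code__ and event == "line":
            offsets.append(frame.f_lineno - func.__code__.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func()
    finally:
        sys.settrace(previous)
    return offsets


aliased = show_locals_aliasing()
offsets = line_events(three_assignments)
print(aliased)
print(offsets)
```

An optimizer that kept "items" in a private on-stack copy could break the first function, and one that merged the three assignments into a single operation would report fewer line events to the tracer.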
Now, *in practice* such manipulations are rare (with the possible exception of people replacing open() with something providing hooks for e.g. a virtual filesystem) and there is probably some benefit to be had. (I expect that the biggest benefit might well be from replacing len() with an opcode.) I have in the past proposed to change the official semantics of the language subtly to allow such optimizations (i.e. recognizing builtins and replacing them with dedicated opcodes). There should also be a simple way to disable them, e.g. by setting "len = len" at the top of a module, one would be signalling that len() is not to be replaced by an opcode. But it remains messy and nobody has really gotten very far with implementing this. It is certainly not "low-hanging fruit" to do it properly.
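As a small illustration of why replacing len() with an opcode would change observable semantics, the sketch below rebinds the name in a module's globals after the function has been defined. The names size and fake_len are made up for this example.

```python
def size(seq):
    # "len" is resolved at call time: module globals first, then builtins.
    return len(seq)


def fake_len(seq):
    # Stand-in replacement; the constant is arbitrary.
    return 42


before = size([1, 2, 3])

# Rebind "len" in the globals that size() actually uses.
size.__globals__["len"] = fake_len
after = size([1, 2, 3])

# Remove the shadowing name so the builtin is found again.
del size.__globals__["len"]
restored = size([1, 2, 3])

print(before, after, restored)
```

With current semantics the second call sees the replacement. A compiler that had turned len() into a dedicated opcode would presumably keep returning 3, which is the subtle change in language semantics discussed above.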
Here's another thought... suppose (say) "builtin" became a reserved word. builtin.range (for example) would always refer to the built-in range, and could be optimized by the compiler. It wouldn't do much for the general case of wanting to optimize non-built-in globals, but this could be optimized safely:
    def f():
        for i in builtin.range(10):
            builtin.print(i)
while this would keep the current semantics:
    def f():
        for i in range(10):
            print(i)
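For comparison, the closest approximation available without a reserved word is probably the real builtins module in Python 3. The sketch below is only an analogy: builtins.range is an ordinary attribute lookup that the compiler does not optimize, and the function names are invented for illustration.

```python
import builtins


def f_explicit():
    # Explicitly asks for the built-in range, whatever the globals say.
    out = []
    for i in builtins.range(3):
        out.append(i)
    return out


def f_plain():
    # Uses the normal lookup: module globals first, then builtins.
    out = []
    for i in range(3):
        out.append(i)
    return out


def shadow_range(n):
    # Hypothetical replacement that a module might install.
    return ["shadowed"] * n


# Shadow "range" in the module globals used by both functions.
f_plain.__globals__["range"] = shadow_range
plain_result = f_plain()
explicit_result = f_explicit()
del f_plain.__globals__["range"]

print(plain_result)
print(explicit_result)
```

Only the plain spelling is affected by the shadowing. Even so, builtins.range can itself be rebound at run time, which is why the proposal needs a reserved word rather than a module attribute for the compiler to rely on it.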
-- Steven
I think that it's not needed. Optimizations must stay behind the scenes. We can speed up code that makes use of builtins without resorting to language changes. JITs already do this, but some approaches are possible even on non-JITed VMs. However, they require a longer parse / compile time, which can be undesirable.

Cesare