On 05.02.16 21:58, Andrew Barnert via Python-ideas wrote:
The biggest pain with dealing with the peephole optimizer is that it happens after all the complicated flattening and fixup[^1] the compiler does, which means you have to hack up the jump targets as you go along. The future bytecode optimizers that PEP 511 enables will have the same headache.
But this isn't actually necessary. The optimizer could work on a flat array of instructions[^2] instead of an array of bytes, with relocatable jump targets instead of fixed byte offsets, and then the compiler could do the fixup _after_ the optimization.[^3]
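A minimal sketch of that alternative, using hypothetical `Instr` and `Label` types rather than CPython's real compiler structures: jumps reference label objects instead of byte offsets, so an optimizer can insert or delete instructions freely, and offsets are computed only once, at the final assembly step.

```python
from dataclasses import dataclass

class Label:
    """A relocatable jump target; carries no offset of its own."""

@dataclass
class Instr:
    op: str
    arg: object = None  # a Label for jumps, an index otherwise

def assemble(instrs):
    """Fix up jump targets *after* optimization: assign byte offsets
    (2 bytes per instruction, as with wordcode), then emit."""
    offsets, pos = {}, 0
    for ins in instrs:
        if ins.op == "LABEL":
            offsets[ins.arg] = pos  # label marks the current offset
        else:
            pos += 2
    out = []
    for ins in instrs:
        if ins.op == "LABEL":
            continue  # pseudo-instruction, emits nothing
        arg = offsets[ins.arg] if isinstance(ins.arg, Label) else (ins.arg or 0)
        out += [ins.op, arg]
    return out

loop = Label()
code = [Instr("LABEL", loop), Instr("LOAD_FAST", 0),
        Instr("JUMP_ABSOLUTE", loop)]
# An optimizer can add or remove Instr objects here without touching
# any offsets; assemble() recomputes them all at the end.
```

Here `assemble(code)` yields `["LOAD_FAST", 0, "JUMP_ABSOLUTE", 0]`; deleting or inserting instructions before the label would change only what `assemble` computes, not the optimizer's job.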
It would break the optimizer APIs, but `PyCode_Optimize` isn't public, and while the API proposed by PEP 511 is public, that PEP hasn't been finalized, much less accepted.
I don't think we need to expose the intermediate representation any further along than the `PyCode_Optimize` step.[^4] Just moving the optimizer one step earlier in the chain solves more than enough problems to be worth it.
LGTM. I had the same idea when I added the specialized 8-bit opcodes. Some optimizations (like constant folding) are worth moving yet another step earlier, to the AST.

Another idea: instead of EXTENDED_ARG, have two sets of instructions: short 16-bit instructions with an 8-bit arg, and long 32-bit instructions with a 24-bit arg. For simplicity, initially only long instructions are emitted (where that makes sense). After optimization, the code is packed using short instructions where possible; at that point we have all the information we need and can do it in one pass. NOPs are removed at the same time. For later stages we can easily unpack the code back to the long form.
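The packing pass could look something like this sketch. The opcode values and the long-to-short mapping are invented for illustration (not CPython's), and jump retargeting is omitted for brevity:

```python
# Two-width encoding sketch: the compiler emits only "long" 32-bit
# instructions (8-bit opcode + 24-bit little-endian arg); one pass
# packs each into a "short" 16-bit form when its arg fits in 8 bits,
# dropping NOPs along the way. All values here are hypothetical.
NOP = 9
LONG_TO_SHORT = {100: 0}  # hypothetical long-opcode -> short-opcode map

def pack(long_instrs):
    """Pack (long_opcode, arg) pairs into the mixed-width byte form."""
    out = bytearray()
    for op, arg in long_instrs:
        if op == NOP:
            continue  # NOPs are removed in the same pass
        if arg < 256 and op in LONG_TO_SHORT:
            out += bytes([LONG_TO_SHORT[op], arg])          # 16-bit form
        else:
            out += bytes([op, arg & 0xFF, (arg >> 8) & 0xFF,
                          (arg >> 16) & 0xFF])              # 32-bit form
    return bytes(out)
```

Because every input instruction has a fixed width, the pass sees the whole picture up front, which is what makes a single pass sufficient; unpacking back to the long form is the same walk in reverse.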