PEP 511 and bytecode optimizers

Hi,

I played with bytecode optimizers. I still have the same opinion: for PEP 511, code_transformer() should take a code object as input and return a new code object as output. I don't want to put one specific API in PEP 511, since it would mean that we have to maintain this API.

If you want a different data structure, you need extra decode/encode steps in your code :-/ Existing projects take a whole Python code object as input.

In my experience, there is no "grand unified API" to manipulate bytecode. The byteplay and codetransformers projects (and now my bytecode project) each use a different and somewhat incompatible API. You should pick the one which best fits your use case (and your coding style).

I wrote a new bytecode project, my own "high-level" API to modify bytecode. I also reimplemented the peephole optimizer of CPython 3.6 in pure Python. More details below.

--

Recently, Andrew Barnert suggested modifying the code_transformer() method of PEP 511:

https://www.python.org/dev/peps/pep-0511/#code-transformer-method

Currently, input and output are Python code objects:

    def code_transformer(self, code, context):
        ...
        return new_code

Andrew proposed using a higher-level API (which doesn't exist yet; he wants to put something into the dis module).

I rewrote the peephole optimizer (currently implemented in C) in pure Python:

https://hg.python.org/sandbox/fatpython/file/6b01409f2e10/Lib/peephole_opt.p...

The Python code is still low-level. It is based on the C code, which directly modifies bytes (a bytearray object in my code). I added an "Instr" (bytecode instruction) class, but it's a minor abstraction. The C peephole optimizer has many "is_basic_block(offset, size)" checks to ensure that we respect the control flow (don't modify two instructions of two different code paths).

I wrote my own bytecode API using blocks: a block is a list of instructions. Jumps use labels; each block has a unique label. The line number is stored directly in an instruction.
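
As a rough illustration of the block model described above, here is a minimal sketch. It is not the actual API of the bytecode project: the Instr and Block classes, the fold_unary_negative() pass and the use of plain strings as labels and variable names are all hypothetical, chosen only to show why working on one block at a time keeps an optimization from mixing two code paths.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class Instr:
    """Hypothetical instruction: the line number is stored on it directly."""
    name: str
    arg: Any = None
    lineno: Optional[int] = None


@dataclass
class Block:
    """Hypothetical block: a list of instructions with a unique label."""
    label: str
    instrs: List[Instr] = field(default_factory=list)


def fold_unary_negative(block: Block) -> None:
    """Toy peephole pass: LOAD_CONST c; UNARY_NEGATIVE -> LOAD_CONST -c.

    Only neighbours inside the same block are inspected, so the pass
    cannot merge instructions that belong to two different code paths.
    """
    out: List[Instr] = []
    for instr in block.instrs:
        prev = out[-1] if out else None
        if (instr.name == "UNARY_NEGATIVE"
                and prev is not None
                and prev.name == "LOAD_CONST"
                and isinstance(prev.arg, (int, float))):
            # Replace the pair with a single folded constant.
            out[-1] = Instr("LOAD_CONST", -prev.arg, prev.lineno)
        else:
            out.append(instr)
    block.instrs = out


def optimize(blocks: List[Block]) -> None:
    """Optimize every block independently of the others."""
    for block in blocks:
        fold_unary_negative(block)


# Jump arguments are labels (here: plain strings), not byte offsets.
start = Block("start", [
    Instr("LOAD_CONST", 3, lineno=1),
    Instr("UNARY_NEGATIVE", lineno=1),
    Instr("STORE_FAST", "a", lineno=1),
    Instr("LOAD_FAST", "flag", lineno=2),
    Instr("POP_JUMP_IF_FALSE", "alt", lineno=2),
])
then = Block("then", [
    Instr("LOAD_CONST", 5, lineno=3),
    Instr("JUMP_FORWARD", "join", lineno=3),
])
alt = Block("alt", [
    Instr("LOAD_CONST", 7, lineno=5),  # falls through into "join"
])
join = Block("join", [
    Instr("UNARY_NEGATIVE", lineno=6),
    Instr("RETURN_VALUE", lineno=6),
])

optimize([start, then, alt, join])
```

In this sketch the pair at the top of "start" is folded, while the LOAD_CONST that ends "alt" and the UNARY_NEGATIVE that begins "join" are left alone: "join" can also be reached from "then", so folding across the block boundary would be wrong.
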
This API fits well with the peephole optimizer. Any instruction can be removed. Respecting the control flow is straightforward: restrict optimizations to one block, and optimize all blocks independently.

I released bytecode 0.0, which is mostly a proof of concept:

https://github.com/haypo/bytecode

I adapted my Python peephole optimizer on top of my bytecode project:

https://github.com/haypo/bytecode/blob/a17baedf5ce89622e16dcf7bf1de8094525f4...

bytecode should be enhanced. For example, it is unable to compute the stack level. The API may move to something closer to byteplay: directly manipulate variable names for LOAD_FAST, pass the constant value directly to LOAD_CONST, ... rather than having to use an integer index into a separate list.

Victor
participants (1)
-
Victor Stinner