[Python-ideas] Standard (portable) bytecode "assembly" format

Fri Feb 26 05:27:00 EST 2016

2016-02-26 6:27 GMT+01:00 Andrew Barnert via Python-ideas
<python-ideas at python.org>:
> Of course we already have such a format today: dis.Bytecode. But it doesn't quite solve the problem, for three reasons:
>
>  * Not accessible from C.

I don't think that it's a real issue. The current trend is more to
rewrite pieces of CPython in Python. importlib is a great example of
that.

importlib is also an interested case because it is a Python module to
improts modules, but we need importlib to import importlib. Hum. Brett
Canon solved this issue by compiling the Python code to a frozen
module.

It means that we can do something similar if we want to rewrite the
peephole optimizer in Python.

>  * Not mutable, and no assembler.

I looked at Bytecode & Instruction objects of dis. They look nice to
"read" bytecode, but not to modify bytecode.

dis.Instruction is not mutable and informations are duplicated. For
example, the operator is stored as name (LOAD_CONST) and code (100).
Argument is stored as int (1), value ("hello") and representation
('"hello"'). It has no methods but attributes like is_jump_target.

dis.Instruction doesn't seem extensible to add new features. Adding
more items to such namedtuple doesn't seem like a nice API to me.

>  * A few things (mainly jump arguments) are still in terms of bytecode bytes.

Hum, this is a problem. The dis is already in the stdlib, you cannot
modify its API in a backward incompatible way.

IMHO it's safer and simpler to add something new (maybe in a new
module), not modify the existing Bytecode & Instruction classes. To
modify bytecode, you need a different structure. For example, jump
targets must be abstracted with labels (or something else).

In my bytecode module, I provide 3 different ways to expose bytecode:

* ConcreteBytecode: instructions close to raw bytes structure,
arguments must be integers
* Bytecode: list of abstract instructions using labels
* BytecodeBlocks: use blocks, a block is a list of instructions with a
label, jumps point to blocks

An instruction is an object which contains a line number, has methods
like is_jump().

Abstract instructions can be modified (lineno, name, op, arg), they
have no size. Concrete instructions have size, attributes cannot be
modified.

Concrete bytecode & instructions is closer to what we already have in
the dis module. I'm not sure that it's useful, maybe I should keep it
private. It's an intermediate format to disassemble and assemble code
objects.

Instr('LOAD_CONST') argument is directly the constant value, so
Bytecode has no "consts" attribute. ConcreteInstr('LOAD_CONST')
agument is an integer: the index to the consts list of the
ConcreteBytecode.

BytecodeBlocks is a "flat" control flow graph (CFG). It is required by
the peephole optimizer to not modify two instructions which are part
of two code paths (two blocks). Side-effect, with blocks, it's trivial
to detect dead code. As you wrote, it's also possible to reorder
blocks to try to avoid jumps.

Note: the current peephole optimizer miss a lot of optimizations on
jumps :-/ Python 3.5 is a little bit better.

> And, fix it well enough, and it also solves the problem I brought up a few weeks ago (http://article.gmane.org/gmane.comp.python.ideas/38431): if PEP 511 is going to provide a builtin API for registering bytecode processors, we should make it feasible to write them.

Again, you don't need to add anything to the stdlib to write a
bytecode optimizer. byteplay, codetransformer, bytecode, etc. projects
are already available.

By the way, I wrote PEP 511 for AST optimizers, not for bytecode
optimizers. Since we can modify AST, bytecode is less interesting.
Writing an optimizer on bytecode depends too much on the
implementation. It may break if we add new bytecode instructions or
modify the format of instructions (as you said).

It's a deliberate choice to leave optimizers out of the stdlib. I
expect that it will take months or years to stabilize the API of an
optimizer, test it with various kinds of applications, etc.

>  * Iterable of (opcode, argval [, line [, ...]]) tuples. The argval is the actual global name, constant value, etc., not the encoded index, etc. For jumps, the argval is just the target instruction itself. The existing dis.Bytecode (with a few minor changes) already fits this type--but so does, say, a list of 3-tuples, which we can much more easily build in C.

You should not use "be usable in C" constraint, or I expect a bad API
:-/ When you modify bytecode, you need many functions which are
revelant to be put in instruction objects. See existing
codetransformer & bytecode projects for examples of methods.

Note: my bytecode.Instr has a simple constructor, it takes 2 or 3
parameters: Instr(lineno, name, arg=UNSET). I don't think that it's
hard to write an helper function in C to emit Instr object if you
*really* want to write C code.

> And PyCode_Assemble is the only new C API function needed.

I don't understand why do you care so much of having a C API. What do
you want to do?

The only need for CPython is to have the most simple peephole
optimizer, basically only optimize jumps. An AST optimizer can do
everything else. I would like to experiment such peephole optimizer
implemented in pure Python.

I'm not sure that writing it in pure Python will kill performances.
The cost of import matters, but only in few use cases. In general,
applications run longer than 1 second and so the cost of import is
negligible. Moreover, .py are only compiled once to .pyc. If .pyc are
precompiled, the speed of the optimizer doesn't matter :-)

>  * Assuming the assembler drops NOPs, we can use NOPs as pseudo-instructions for when you want byteplay-like Label and SetLineNo. The disassembler can optionally even generate them. So, we don't need explicit pseudo-instructions.

For pattern matching, inline Label or SetLineno instrutions are
annoying. For example, if you use the pattern "LOAD_CONST <value>;
UNARY_NOT", "SetLineno 3; LOAD_CONST <value>; SetLineno 3; UNARY_NOT"
will not match. You *can* modify the algorithm to match patterns, but
putting line numbers in instructions avoid this issue. Using multiple
blocks rather than a single list of instructions avoid the need of
inline labels.

In my bytecode project, I tried to support both API: inline labels in
Bytecode, labels in blocks in BytecodeBlocks. I may add support for
Setlineno later. I'm still working on the API.

>  * Any higher-level representations, like a graph of blocks with edges for the jumps between them, are easy enough to build on top of the dis representation (and to flatten back into that representation), so we don't need anything more complicated in the stdlib.

Yeah, you should start with something simple but extensible. An API
generic enough to be usable as a low-level API by existing byteplay,
codetransformer, bytecode projects, and then build an higher-level API
on top of that. Or maybe I'm right and it's a bad idea :-)

codetransformer is more than just an API to modify bytecode. It has an
API to match instructions using patterns. Such stuff should be kept in
codetransformer.

Victor