
Hi Haael,

Here is a high-level overview again.

Although we use the term "backend" for both, there are two completely unrelated components: the JIT backends and the translation backends.

The translation backends are part of the static translation of PyPy (with or without the JIT) to C code. They turn control flow graphs into, say, C source code representing them. These control flow graphs are roughly at the same level as Java VM opcodes, except that, depending on the backend, they may either still contain GC operations (e.g. when translating to Java or CLI) or no longer contain them (e.g. when translating to C). We have a control flow graph for each RPython function in the source code of PyPy; together they describe an interpreter for Python.

Now the JIT is an optional part of that. It is written as more RPython code, and it gets statically translated into more control flow graphs, but these describe only the JIT itself, not any JITted code. JITted code (in the form of machine code) is produced at runtime, obviously, but using different techniques.

It is the job of the JIT backends to produce this machine code in memory. This is unrelated to the translation backends: a JIT backend takes as input something that is not a control flow graph (but a linear "trace" of operations), works at runtime (so is itself written in RPython), and outputs machine code in memory (rather than writing C sources into a file).

The input for the JIT backend comes from a front-end component: the tracing JIT "metacompiler". It works by following what the interpreter would do for some specific input (i.e. the precise Python code we see at runtime). This means that the JIT front-end starts with the control flow graphs of the interpreter and produces a linear trace out of them, which is fed to the JIT backend.

The control flow graphs in question must be available at runtime, so we need to serialize them. The precise format in which the flow graphs are serialized is called "JitCodes".
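
To make the front-end step a bit more concrete, here is a toy sketch in plain Python of how following a control flow graph for one specific input yields a linear trace. It is not PyPy code: the graph encoding, the `GRAPH` table, and `trace_one_iteration` are all made up for illustration, and the operation names are only loosely modelled on the kind of operations a trace contains. Branches in the graph turn into guards in the trace, recording which way this particular run went.

```python
import operator

# Hypothetical operation table; not PyPy's real operation set.
OPS = {"int_add": operator.add, "int_sub": operator.sub, "int_gt": operator.gt}

# Toy control flow graph for:  while n > 0: acc += n; n -= 1
# Each block is (operations, exit).
#   operation: (result_var, opname, arg1, arg2), args are variable names or ints
#   exit: ("jump", target) | ("branch", cond_var, if_true, if_false) | ("return", var)
GRAPH = {
    "loop_header": ([("cond", "int_gt", "n", 0)],
                    ("branch", "cond", "body", "done")),
    "body": ([("acc", "int_add", "acc", "n"),
              ("n", "int_sub", "n", 1)],
             ("jump", "loop_header")),
    "done": ([], ("return", "acc")),
}


def value_of(arg, env):
    """Look up a variable name in env; integer constants stand for themselves."""
    return env[arg] if isinstance(arg, str) else arg


def trace_one_iteration(graph, start, env):
    """Execute the graph on concrete values and record a linear trace.

    Tracing stops when execution comes back to `start` (the loop is closed)
    or when the graph returns.
    """
    env = dict(env)          # do not mutate the caller's values
    trace = []
    block = start
    while True:
        ops, exit_ = graph[block]
        for result, opname, a, b in ops:
            env[result] = OPS[opname](value_of(a, env), value_of(b, env))
            trace.append((result, opname, a, b))
        kind = exit_[0]
        if kind == "return":
            trace.append(("finish", exit_[1]))
            return trace
        if kind == "branch":
            _, cond, if_true, if_false = exit_
            taken = bool(env[cond])
            # The branch disappears; a guard records the direction we saw.
            trace.append(("guard_true" if taken else "guard_false", cond))
            block = if_true if taken else if_false
        else:  # "jump"
            block = exit_[1]
        if block == start:
            trace.append(("jump", start))
            return trace


if __name__ == "__main__":
    for op in trace_one_iteration(GRAPH, "loop_header", {"n": 3, "acc": 0}):
        print(op)
```

The result is a flat list of operations with no branches left in it, which is the shape of input a JIT backend expects. A real tracer does much more (it traces through calls, optimizes the trace, and handles guard failures), so take this only as a picture of the idea.
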
JitCodes are very similar to the flow graphs, but everything the JIT does not need has been removed, most importantly the details of the type information. For example, integer variables of every size and signedness are represented as a single "int" type, because the JIT has no use for anything finer; similarly, a GC pointer to any object is represented as a single "GC pointer" type.

I hope this helps :-)

A bientôt,

Armin.