[pypy-dev] Flow graphs, backends and JIT

Tue Sep 18 09:35:07 CEST 2012

>>>> 3. Which component actually does the JIT? Is it just a tweak on the code
>>>> generator or are the flow graphs generated differently?
>>>
>>>
>>> The flow graphs are taken from the translator and modified by the JIT
>>> generator.
>>
>>
>> My question is:
>>
>> Does the JIT involve another "transformation" of the flow graphs? In normal
>> (non-JIT) code generation some flow graphs are fed to the backend generator.
>> Which step is different in the JIT case? Does the backend generator get
>> different flow graphs or are the same flow graphs compiled differently by a
>> tweaked code generator?
>
> They get the same flowgraphs.

So, if I understand correctly, there is no common JIT code shared among the
different backends? Is the JIT we have specific to the C backend? Would a
different backend need a new JIT approach?

>>>> 4. Is there some documentation on how to write a backend (code generator)?
>>>> The source code is poorly documented and the topic is not mentioned on the
>>>> web page. What exactly do I need to implement to have a backend?
>>>
>>>
>>> You mean a JIT backend or an RPython backend?
>>
>>
>>
>> An RPython backend first. Is there any documentation, tutorial, simple toy
>> backend or anything I could start with?
>
> No. In fact, the only RPython backend that is well-maintained is the C one.

OK, so where could I start? Is there, for example, a list of the flow graph
opcodes?
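
To make the question concrete, this is roughly the kind of structure I imagine
a backend would consume. It is only a toy sketch in plain Python: the classes
Operation and Block, the opnames such as "int_add" and the function emit_c are
all made up by me for illustration and are not taken from the PyPy sources.

```python
# Toy model only: every name here is hypothetical, not PyPy's real API.
from dataclasses import dataclass, field


@dataclass
class Operation:
    opname: str   # invented opcode name, e.g. "int_add"
    args: list    # names of the input variables
    result: str   # name of the variable that receives the result


@dataclass
class Block:
    inputargs: list
    operations: list = field(default_factory=list)
    returnvar: str = ""


def emit_c(name, block):
    """A toy 'backend': walk the operations and produce C-like source."""
    binops = {"int_add": "+", "int_sub": "-", "int_mul": "*"}
    params = ", ".join("long " + arg for arg in block.inputargs)
    lines = ["long %s(%s) {" % (name, params)]
    for op in block.operations:
        if op.opname not in binops:
            # A real backend would need a case for every opcode.
            raise NotImplementedError(op.opname)
        left, right = op.args
        lines.append("    long %s = %s %s %s;"
                     % (op.result, left, binops[op.opname], right))
    lines.append("    return %s;" % block.returnvar)
    lines.append("}")
    return "\n".join(lines)


graph = Block(
    inputargs=["x", "y"],
    operations=[
        Operation("int_add", ["x", "y"], "v0"),
        Operation("int_mul", ["v0", "x"], "v1"),
    ],
    returnvar="v1",
)
c_source = emit_c("f", graph)
print(c_source)
```

If the real flow graphs look anything like this, then a list of the opcodes
that can appear where I wrote "int_add" is exactly what I am asking for.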

>>> You might find this useful: http://www.aosabook.org/en/pypy.html
>>>
>>
>> OK, that was useful. It seems that the JIT generator is some assembler
>> embedded into the final binary. Does the JIT generator share some code with
>> the backend generator?
>
> No.
>
>>
>> Would it be possible to get rid of the normal code generator (leaving only
>> some glue code) and rely only on the JIT generator, which would produce
>> the whole code?
>
> No. The JIT generator is specialized for dynamic languages not ones
> like RPython, which can be translated to C.
>
>>
>> This would reduce the size of the binary and would not hit performance much,
>> since loops would be generated as usual, only the non-looping execution
>> would be different.
>
> Why would it reduce the size of the binary?

That is just my limited understanding; I might be wrong.

In the current approach, a binary contains compiled machine code, the flow
graph representation and the JIT compiler. I think we could get rid of (most
of) the compiled machine code, leaving only some startup code to spawn the JIT
compiler. Each code path would then be compiled by the JIT and executed. Loops
would run as fast as usual. Non-loop code would run slower, but I think this
would be a minor slowdown. Most importantly, as I understand it, the binary
contains many versions of the same code paths, specialized for different
types. If we threw them out, the binary would be smaller.
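
A toy sketch of what I mean by "compile each code path when it is first
needed". Python's built-in compile() stands in for the JIT here, so it produces
bytecode and not machine code. The names SOURCES, compiled, compile_log and
call are invented for this example and have nothing to do with how PyPy works
internally.

```python
# Toy illustration only; nothing here is PyPy's real machinery.
SOURCES = {
    "square": "def square(x):\n    return x * x\n",
    "double": "def double(x):\n    return x + x\n",
}

compiled = {}      # cache: name -> function object
compile_log = []   # records which code paths were compiled, in order


def call(name, *args):
    """Compile the named code path on first use, then reuse the result."""
    if name not in compiled:
        namespace = {}
        code = compile(SOURCES[name], "<lazy:%s>" % name, "exec")
        exec(code, namespace)
        compiled[name] = namespace[name]
        compile_log.append(name)
    return compiled[name](*args)


results = [call("square", 3), call("square", 4), call("double", 5)]
print(results, compile_log)
```

Only the small call() stub ships up front; "square" is compiled once and reused
on the second call. Whether the real JIT could be used this way is exactly what
I do not know.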

This is not a proposal. It is just an attempt to understand things.

haael