How to implement a JIT compiler for Django?

Hi, is it possible to use the PyPy JIT compiler embedded into a CPython (Django) application? I would like to translate a Django app into C code, then compile the binary with clang to optimize the code with a JIT engine. What do you think? Etienne -- Etienne Robillard tkadm30@yandex.com https://www.isotopesoftware.ca/

On Sun, 28 Jan 2018, Etienne Robillard wrote:
I think that you might be confused about the fundamentals of the technologies involved here. Once you translate a Django app into C code (let's assume this is actually possible for the sake of the argument) and then compile it into machine code using clang, there is nothing left for a JIT to operate upon, because machine code is executed directly by the CPU. Tracing JIT engines like PyPy translate bytecode into machine code on the fly, taking into account invariants discovered at runtime, which theoretically enables them to outperform machine code generated without knowing the data it processes and, in any case, to run a lot faster than interpreted bytecode. A possible source of confusion is that people often speak of speeding things up with an LLVM (or nowadays even GCC) JIT; in most cases this amounts to so-called method-level JITs, where specific isolated functions are compiled on the fly into machine code by the corresponding JIT backend and then called from the bytecode interpreter instead of interpreting the original bytecode for the method. Anyways, having that said, I can't even infer what your original line of thinking was to embed what into what to speed up what exactly... -- Sincerely yours, Yury V. Zaytsev
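
As a rough illustration of the point about tracing JITs, here is a minimal sketch of a pure-Python hot loop that can be run unchanged on CPython and on PyPy to compare timings. The function names `hot_loop` and `time_call` are made up for this example, and the timings will vary by machine and interpreter version, so no expected numbers are given.

```python
import platform
import time


def hot_loop(n):
    """Pure-Python arithmetic loop: the kind of code a tracing JIT targets."""
    total = 0
    for i in range(n):
        # The operand types never change in this loop. That is the sort of
        # runtime invariant a tracing JIT can exploit once the loop is hot.
        total += i * i % 7
    return total


def time_call(func, *args):
    """Return (result, elapsed_seconds) for a single call of func(*args)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    result, elapsed = time_call(hot_loop, 5000000)
    # Prints the interpreter name ("CPython" or "PyPy") next to the timing.
    print(platform.python_implementation(), result, "%.3fs" % elapsed)
```

Running the same file with `python` and with `pypy3` gives a first impression of what the JIT does for this kind of loop; it says nothing about a real Django workload.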

On 2018-01-28 at 05:11, Yury V. Zaytsev wrote:
I think that you might be confused about the fundamentals of the technologies involved here.
Yes, I admit, I'm really just starting to understand PyPy fundamentals and LLVM.
I'm really sure it's possible to generate a C or C++ file from human-generated Python code. So far, I want to use the LLVM backend (PyPy) to translate CPython classes into a tracing JIT compiler...
I'm positive you can use PyPy in embedded C/C++ applications to enable trace compilation of Python objects.
JIT is cool because it can theoretically make Django and Python web apps outperform C applications.
Anyways, having that said, I can't even infer what your original line of thinking was to embed what into what to speed up what exactly...
I'm looking to use JIT as a replacement for Cython in an upcoming Django-hotsauce release. :) Cheers, Etienne -- Etienne Robillard tkadm30@yandex.com https://www.isotopesoftware.ca/

On Sun, 28 Jan 2018, Etienne Robillard wrote:
If this is indeed your goal, then I do have some good news for you: you don't have to do anything at all other than scrapping Cython in order to achieve it. Just run your code on PyPy instead of CPython and you will benefit directly from PyPy's tracing JIT. To answer, before it comes up, the question of why you have to scrap Cython: you *can* run Cython-compiled modules on top of PyPy thanks to its Python C API emulation layer (cpyext). However, this will be inefficient because (a) crossing the PyPy/C boundary is slow, although not as slow as it used to be, and (b) PyPy's JIT can't see into machine code created from Cython-generated C/C++ source code. Therefore, to benefit most from PyPy you'd better feed it pure Python code.
In that case, you have to educate yourself, and specifically try to understand how Cython really works. Hint: generating a C/C++ file from Python code != translating Python code into C/C++ code that doesn't require the CPython runtime.
So far, I want to use the LLVM backend (PyPy)
PyPy doesn't have anything to do with LLVM at this point. There have been multiple attempts in the past to use LLVM as a backend for PyPy, but so far none of them have really been successful, where success is defined as making it the default backend.
to translate CPython classes into a tracing JIT compiler...
This statement is devoid of meaning.
These statements are both correct, but see the beginning of this email. -- Sincerely yours, Yury V. Zaytsev

Great! Thanks very much for this post, Yury. I'll do just what you suggested. :) Cheers, Etienne On 2018-01-28 at 07:35, Yury V. Zaytsev wrote:

Hi Pim, On 2018-01-28 at 07:58, Pim van der Eijk (Lists) wrote:
I'm planning to keep Cython support in the stable branch and experiment with PyPy and clang in the development branch. I believe using a tracing JIT compiler should be a radical improvement over Cython, thanks to PyPy. Cheers, Etienne

On Sun, 28 Jan 2018, Pim van der Eijk (Lists) wrote:
It really depends on the workload. From what I remember, the stumbling block for getting substantial speedups for typical Django applications was the sad ORM story, also in part due to database drivers. My understanding is that Etienne might not be using the ORM at all, so it is possible that he'd directly get some decent performance improvements... -- Sincerely yours, Yury V. Zaytsev
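
One rough way to check the workload question raised above is to measure how much of a request is spent inside the database driver versus in pure-Python code, since only the latter is something a JIT can speed up. The sketch below is a toy under stated assumptions: the standard-library `sqlite3` module stands in for a real database driver, and the `item` table and the `make_db` and `handle_request` helpers are invented for illustration.

```python
import sqlite3
import time


def make_db(rows=1000):
    """Create an in-memory database with a small, made-up `item` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO item (name) VALUES (?)",
        (("item%d" % i,) for i in range(rows)),
    )
    return conn


def handle_request(conn, rows=1000):
    """Toy 'view': fetch rows, then render them in pure Python.

    Returns (seconds_in_driver, seconds_in_python, rows_rendered).
    """
    t0 = time.perf_counter()
    data = conn.execute(
        "SELECT id, name FROM item LIMIT ?", (rows,)
    ).fetchall()
    t1 = time.perf_counter()
    # Pure-Python work: this is the part a tracing JIT can optimize.
    rendered = ["<li>%d: %s</li>" % (pk, name.upper()) for pk, name in data]
    t2 = time.perf_counter()
    return t1 - t0, t2 - t1, len(rendered)


if __name__ == "__main__":
    db_time, py_time, count = handle_request(make_db())
    print("rows: %d  driver: %.6fs  python: %.6fs" % (count, db_time, py_time))
```

If most of the time lands in the driver column for a real application, switching interpreters is unlikely to change much; if it lands in the Python column, PyPy has more room to help.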

Participants (3): Etienne Robillard, Pim van der Eijk (Lists), Yury V. Zaytsev