[Python-Dev] A new JIT compiler for a faster CPython?

Dag Sverre Seljebotn d.s.seljebotn at astro.uio.no
Tue Jul 17 20:44:19 CEST 2012


I'll admit I didn't read through your email, but you should absolutely 
check out Numba which is ramping up just now to do this:

https://github.com/numba

(I'm CC-ing their mailing list, perhaps some of them will read this and 
respond.)

It is probably much less ambitious, but that hopefully shouldn't stop you
from cooperating.

It was started by Travis Oliphant (who started NumPy); here are his thoughts
on PyPy and NumPy, which provide some of the background for this project:

http://technicaldiscovery.blogspot.no/2011/10/thoughts-on-porting-numpy-to-pypy.html

Dag

On 07/17/2012 08:38 PM, Victor Stinner wrote:
> Hi,
>
> I would like to write yet another JIT compiler for CPython. Before
> writing anything, I would like your opinion because I don't know
> other Python compilers well. I also want to prepare for a possible
> integration into CPython from the beginning of the project, or at least
> stay very close to the CPython project (and CPython developers!). I did
> not understand exactly why the Unladen Swallow and psyco projects
> failed, so please tell me if you think that my project is going to fail
> too!
>
>
> == Why? ==
>
> CPython is still the reference implementation: new features are added
> to this implementation first (e.g. PyPy does not support Python 3 yet,
> but there is a project to support Python 3). Some projects still rely
> on low-level properties of CPython, especially its C API (e.g. numpy;
> PyPy has a cpyext module to emulate the CPython C API).
>
> A JIT is the most promising solution to speed up the main evaluation
> loop: using a JIT, it is possible to compile a function for a specific
> type on the fly and so enable deeper optimizations.
>
> psyco is no longer maintained. It had its own JIT, which is complex to
> maintain. For example, it is hard to port it to new hardware.
>
> LLVM is fast and the next version will be faster. LLVM has a
> community, documentation and a lot of tools, and is active.
>
> There are many Python compilers which are very fast, but most of them
> only support a subset of Python or require modifying the code (e.g.
> specifying the type of all parameters and variables). For example, you
> cannot run Django with Shedskin.
>
> IMO PyPy is complex and hard to maintain. PyPy has a design completely
> different from CPython's, is much faster and has a better memory
> footprint. I don't expect to be as fast as PyPy, just faster than
> CPython.
>
>
> == General idea ==
>
> I don't want to replace CPython. This is an important point. All
> other Python compilers try to write something completely new, which is
> a huge task and makes it hard to stay compatible with CPython. I would
> like to reuse as much CPython code as possible and not try to
> fight against the GIL or reference counting, but cooperate with them
> instead.
>
> I would like to use a JIT to generate specialized functions for a
> combination of argument types. Specialization enables more
> optimizations. I would like to use LLVM because LLVM is an active
> project, has many developers and users, is fast and the next version
> will be faster! LLVM already supports common optimizations like
> inlining.
>
> My idea is to emit the same code as ceval.c from the bytecode to be
> fully compatible with CPython, and then write a JIT to optimize
> functions for a specific type.
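
A minimal sketch of what "specialized functions for a combination of
argument types" could look like, written in plain Python rather than
LLVM. The names Specializing and build_specialized are hypothetical, and
build_specialized only stands in for a real code generator: it picks an
existing typed operation instead of emitting machine code.

```python
import operator


def build_specialized(arg_types):
    """Hypothetical stand-in for a compiler back end.

    Returns an implementation chosen for the given tuple of argument types.
    """
    if arg_types == (int, int):
        return int.__add__
    if arg_types == (str, str):
        return str.__add__
    return operator.add  # generic path for every other combination


class Specializing:
    """Callable that keeps one implementation per argument-type tuple."""

    def __init__(self):
        self.cache = {}

    def __call__(self, a, b):
        key = (type(a), type(b))
        impl = self.cache.get(key)
        if impl is None:
            # First call with these types: "compile" and remember the result.
            impl = self.cache[key] = build_specialized(key)
        return impl(a, b)


add = Specializing()
print(add(1, 2), add("a", "b"), add(1.5, 2))
```

The cache key is the exact type of each argument, so a subclass of int
gets its own entry instead of silently reusing the int version.
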
>
>
> == Roadmap ==
>
> -- Milestone 1: Proof of concept --
>
>   * Use the bytecode produced by CPython parser and compiler
>   * Only compile a single function
>   * Emit the same code as ceval.c using LLVM, but without tracing,
> exceptions or signal handling (they will be added later)
>   * Support compiling and calling the following function: "def func(a,
> b): return a+b"
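
For reference, the bytecode that such a compiler would consume for this
function can be inspected with the standard dis module. This is only a
sketch of the input side; the instruction names vary between CPython
versions, so none are hard-coded here.

```python
import dis


def func(a, b):
    return a + b


# Collect the instruction names CPython's own compiler produced for func.
# Older releases use BINARY_ADD for "+", newer ones a generic BINARY_OP.
opnames = [ins.opname for ins in dis.get_instructions(func)]
print(opnames)
```
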
>
> The pymothoa project can be used as a base to quickly implement such a
> proof of concept.
>
> -- Milestone 2: Specialized function for the int type --
>
>   * Use type annotations to generate specialized functions for the int type
>   * Use C int with a guard detecting integer overflow to fall back on Python int
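
The overflow guard can be modelled in plain Python. This is a sketch
under the assumption of a 64-bit signed C integer (the proposal does not
say which width is meant); add_c_int, add_generic and Overflow are
hypothetical names, and real generated code would test the CPU overflow
flag instead of comparing against bounds.

```python
INT_MIN = -(2 ** 63)    # assumed width: 64-bit signed
INT_MAX = 2 ** 63 - 1


class Overflow(Exception):
    """Raised by the fast path when the result does not fit a C integer."""


def fits(value):
    return INT_MIN <= value <= INT_MAX


def add_c_int(a, b):
    # Models a native add followed by an overflow check.
    result = a + b
    if not fits(result):
        raise Overflow
    return result


def add_generic(a, b):
    # Models the unspecialized path working on arbitrary Python objects.
    return a + b


def add(a, b):
    # Guard: both operands are exact ints that fit the assumed C width.
    if type(a) is int and type(b) is int and fits(a) and fits(b):
        try:
            return add_c_int(a, b)
        except Overflow:
            pass  # fall back on Python int arithmetic
    return add_generic(a, b)


print(add(1, 2), add(INT_MAX, 1))
```
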
>
> -- Milestone 3: JIT --
>
>   * Depending on the types seen at runtime, recompile the function to
> generate specialized functions
>   * Use a guard to fall back to a generic implementation if the type is
> not the expected type
>   * Maybe drop the code using function annotations
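
A rough sketch of this milestone: run the generic code while counting
the argument types seen, "recompile" once one combination becomes hot,
and protect the specialized version with a guard. JitFunction,
HOT_THRESHOLD and compile_for are hypothetical, and compile_for merely
selects an existing typed operation where a real JIT would generate code.

```python
import operator

HOT_THRESHOLD = 3  # hypothetical: calls with the same types before specializing


class JitFunction:
    def __init__(self, generic):
        self.generic = generic
        self.seen = {}            # argument-type tuple -> call count
        self.guard_types = None   # types the specialized version expects
        self.specialized = None

    def compile_for(self, types):
        # Stand-in for real code generation.
        if types == (int, int):
            return int.__add__
        return self.generic

    def __call__(self, *args):
        types = tuple(type(arg) for arg in args)
        if self.specialized is not None:
            if types == self.guard_types:    # guard passes: fast path
                return self.specialized(*args)
            return self.generic(*args)       # guard fails: fall back
        count = self.seen[types] = self.seen.get(types, 0) + 1
        if count >= HOT_THRESHOLD:
            self.guard_types = types
            self.specialized = self.compile_for(types)
        return self.generic(*args)


add = JitFunction(operator.add)
for _ in range(HOT_THRESHOLD):
    add(1, 2)
print(add.guard_types, add(4, 5), add("a", "b"))
```

This version keeps a single specialized variant; handling several hot
type combinations would need a table of guards instead of one.
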
>
> At this step, we can start to benchmark to check if the (JIT) compiler
> is faster than CPython.
>
> -- Later (unsorted ideas) --
>
>   * Support exceptions
>   * Full support of Python
>
>    - classes
>    - list comprehensions
>    - etc.
>
>   * Optimizations:
>
>     - avoid reference counting when possible
>     - avoid temporary objects when possible
>     - release the GIL when possible
>     - inlining: should be very interesting with list comprehension
>     - unroll loops?
>     - lazy creation of the frame?
>
>   * Use registers instead of a stack in the "evaluation loop"?
>   * Add code to allow tracing and profiling
>   * Add code to handle signals (pending calls)
>   * Write a compiler using the AST, with a fallback to the bytecode?
> (would it be faster? easier or more complex to maintain?)
>   * Test LLVM optimizers
>   * Compile a whole module or even a whole program
>   * Reduce memory footprint
>   * Type annotation to help the optimizer? (with guards?)
>   * "const" annotation to help the optimizer? (with guards?)
>   * Support any build option of Python:
>
>     - support Python 2 (2.5, 2.6, 2.7) and 3 (3.1, 3.2, 3.3, 3.4)
>     - support narrow and wide mode: flag at runtime?
>     - support debug and release mode: flag at runtime?
>     - support 32- and 64-bit modes on Windows?
>
>
> == Other Python VMs and compilers ==
>
> -- Fully Python compliant --
>
>   * `PyPy<http://pypy.org/>`_
>   * `Jython<http://www.jython.org/>`_ based on the JVM
>   * `IronPython<http://ironpython.net/>`_ based on the .NET VM
>   * `Unladen Swallow<http://code.google.com/p/unladen-swallow/>`_ fork
> of CPython 2.6 using LLVM
>
>     - `Unladen Swallow Retrospective
>       <http://qinsb.blogspot.com.au/2011/03/unladen-swallow-retrospective.html>`_
>     - `PEP 3146<http://python.org/dev/peps/pep-3146/>`_
>
>   * `psyco<http://psyco.sourceforge.net/>`_ (fully Python compliant?),
> no longer maintained
>
> -- Subset of Python to C++ --
>
>   * `Nuitka<http://www.nuitka.net/pages/overview.html>`_
>   * `Python2C<http://strout.net/info/coding/python/ai/python2c.py>`_
>   * `Shedskin<http://code.google.com/p/shedskin/>`_
>   * `pythran<https://github.com/serge-sans-paille/pythran>`_ (no
> class, set, dict, exception, file handling, ...)
>
> -- Subset of Python --
>
>   * `pymothoa<http://code.google.com/p/pymothoa/>`_: uses LLVM; doesn't
> support classes or exceptions.
>   * `unpython<http://code.google.com/p/unpython/>`_: Python to C
>   * `Perthon<http://perthon.sourceforge.net/>`_: Python to Perl
>   * `Copperhead<http://copperhead.github.com/>`_: Python to GPU (Nvidia)
>
> -- Language very close to Python --
>
>   * `Cython<http://www.cython.org/>`_: "Cython is a programming
> language based on Python, with extra syntax allowing for optional
> static type declarations." Based on `Pyrex
> <http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/>`_
>
>
> == See also ==
>
>   * `Volunteer developed free-threaded cross platform virtual machines?
>     <http://www.boredomandlaziness.org/2012/07/volunteer-supported-free-threaded-cross.html>`_
>
> Victor Stinner