[Python-Dev] compiler optimizations: collecting ideas

Stefan Behnel stefan_ml at behnel.de
Fri Nov 14 19:59:24 CET 2008

Daniel Furrer wrote:
> As part of an advanced compiler design course at our university (ETH
> Zurich), we have to implement an optimization (or various thereof).
> I've spoken with a couple of people who are, like me, very fascinated by
> Python.
> So I would just like to ask if somebody has any ideas on what we could do?

Plenty! :)

> Our constraints:
> - We are 4 persons, each willing to potentially spend around 4-6 full days
> of work.
> - We've been working on generating Control Flow Graphs, generating Static
> Single Assignment Form (inserting phi-nodes) and removing it again. We've
> also implemented techniques like constant folding, copy propagation, Common
> Subexpression Elimination etc.
> We're looking for:
> - Hints as to which python compiler would be best to work on. (The official
> Python compiler package seems very nice to work with actually...)

Cython is a Python-to-C compiler that is itself written in Python. It's good
enough to compile major parts of itself already, and the generated C code is
fast enough to challenge even experienced C programmers.


A really nice thing about Cython is that it targets efficient C code, not
optimised assembly. This allows Cython to focus its optimisations on things
that the C compiler cannot do, i.e. it tries to generate C code that makes it
clear to the C compiler what the intentions of the Python code were, and can
then leave the platform specific and low-level optimisations to the C compiler.

The optimisation infrastructure works in tree transformations through the
Visitor pattern, so it's actually quite simple to add new optimisations. A
couple of examples:


> - If there are any requests in the area of optimization that we could have a
> look at

Our Wiki has a some (more or less up-to-date) enhancement proposals (CEPs):


But a major thing that Cython currently lacks (and that would definitely allow
many new optimisations) is a real Control Flow Graph. It has some initial flow
control tracking, but that mainly targeted variable initialisation at the
time. Many other optimisations, especially type inference attempts, largely
depend on better flow control tracking.

There are some initial thoughts here:


It's not very complete yet, but this is definitely something that is worth
some more discussion on our mailing list.


More information about the Python-Dev mailing list