[Python-Dev] Optimizations on the AST representation

Rune Holm runehol at ping.uio.no
Wed Nov 2 20:18:52 CET 2005


Hi,

I'm a norwegian applied mathematics student with an interest in 
compilers, and I've been a long-time python user and a python-dev lurker 
for some time. I'm very happy that you've integrated the AST branch into 
mainline, but I noticed that the AST compiler does not perform much 
optimization yet, so I though I'd take a crack at it.

I just submitted the following patches:
http://www.python.org/sf/1346214
http://www.python.org/sf/1346238

which adds better dead code elimination and constant folding of the AST 
representation to Python.
The constant folding patch adds two new files, Include/optimize.h and 
Python/optimize.c, which includes a general visitor interface abstracted 
from exisiting visitor code for easy optimization pass creation. The 
code is in new files in order to make room for more AST optimization 
passes, and since Python/compile.c is already quite
crowded with byte code generation and bytecode optimization. If desired, 
this patch could changed to add code to Python/compile.c instead.


Further work:

A limited form of type interference (e.g. as a synthesized attribute) 
could be very useful for further optimizations. Since python allows 
operator overloading, it isn't safe to perform strength reductions on 
expressions with operands of unknown type, as there is no way to know if 
algebraic identities will hold. However, if we can infer from the 
context that expressions have the type of int, float or long, many 
optimizations become possible, for instance:

x**2 => x*x
x*2 => x+x
x*0 => 0
x*1 => x
4*x + 5*x => 9*x (this optimization actually requires common 
subexpression elimination for the general case, but simple cases can be 
performed without this)

and so forth.

Another interesting optimization that can potensially bring a lot of 
additional speed is hoisting of loop invariants, since calling python 
methods involves allocating and creating a method-wrapper object. An 
informal test shows that optimizing

lst = []
for i in range(10):
    lst.append(i+1)

into

lst = []
tmp = lst.append
for i in range(10):
    tmp(i+1)

will yield a 10% speed increase. This operation is of course not safe 
with arbitrary types, but with the above type interference, we could 
perform this operation if the object is of a type that disallows 
attribute assignment, for instance lists, tuples, strings and unicode 
strings.


Regards,

Rune Holm




More information about the Python-Dev mailing list