[Python-Dev] Optimizations on the AST representation
Rune Holm
runehol at ping.uio.no
Wed Nov 2 20:18:52 CET 2005
Hi,
I'm a norwegian applied mathematics student with an interest in
compilers, and I've been a long-time python user and a python-dev lurker
for some time. I'm very happy that you've integrated the AST branch into
mainline, but I noticed that the AST compiler does not perform much
optimization yet, so I though I'd take a crack at it.
I just submitted the following patches:
http://www.python.org/sf/1346214
http://www.python.org/sf/1346238
which adds better dead code elimination and constant folding of the AST
representation to Python.
The constant folding patch adds two new files, Include/optimize.h and
Python/optimize.c, which includes a general visitor interface abstracted
from exisiting visitor code for easy optimization pass creation. The
code is in new files in order to make room for more AST optimization
passes, and since Python/compile.c is already quite
crowded with byte code generation and bytecode optimization. If desired,
this patch could changed to add code to Python/compile.c instead.
Further work:
A limited form of type interference (e.g. as a synthesized attribute)
could be very useful for further optimizations. Since python allows
operator overloading, it isn't safe to perform strength reductions on
expressions with operands of unknown type, as there is no way to know if
algebraic identities will hold. However, if we can infer from the
context that expressions have the type of int, float or long, many
optimizations become possible, for instance:
x**2 => x*x
x*2 => x+x
x*0 => 0
x*1 => x
4*x + 5*x => 9*x (this optimization actually requires common
subexpression elimination for the general case, but simple cases can be
performed without this)
and so forth.
Another interesting optimization that can potensially bring a lot of
additional speed is hoisting of loop invariants, since calling python
methods involves allocating and creating a method-wrapper object. An
informal test shows that optimizing
lst = []
for i in range(10):
lst.append(i+1)
into
lst = []
tmp = lst.append
for i in range(10):
tmp(i+1)
will yield a 10% speed increase. This operation is of course not safe
with arbitrary types, but with the above type interference, we could
perform this operation if the object is of a type that disallows
attribute assignment, for instance lists, tuples, strings and unicode
strings.
Regards,
Rune Holm
More information about the Python-Dev
mailing list