[Python-Dev] Optimization of Python ASTs: How should we deal with constant values?

Thomas Lee tom at vector-seven.com
Wed Apr 30 12:15:46 CEST 2008

Hi all,

I've been working on optimization of the AST, including the porting of 
the old bytecode-level optimizations to the AST level. A few questions 
have come up in the process of doing this, all of which are probably 
appropriate for discussion on this list. The code I'm referring to here 
can be found in the tlee-ast-optimize branch. Most of the relevant code 
is in Python/optimize.c and Python/peephole.c.

Nearly all of the bytecode-level optimizations have been moved to the 
AST optimizer with a few exceptions. Most of those waiting to be ported 
are stuck in limbo due to the fact we can't yet inject arbitrary 
PyObject constants into the AST. Examples are tuples of constants and 
the optimization of "LOAD_GLOBAL/LOAD_NAME None" as "LOAD_CONST None".

This leaves us with a few options:

1. A new AST expr node for constant values for types other than Str/Num

I imagine this to be something like Const(PyObject* v), which is 
effectively translated to a "LOAD_CONST v" by the compiler. This trades 
the purity of the AST for a little practicality. A "Const" node has no 
real source representation, it would exist solely for the purpose of 
injecting PyObject constants into the AST.

2. Allow arbitrary annotations to be applied to the AST as compiler hints.

For example, each AST node might have an optional dict that contains a 
set of annotation values. Then when traversing the AST, the compiler 
might do something along the lines of:

if (expr->annotations) {
  PyObject* constvalue = PyDict_GetItemString(expr->annotations, 
  if (constvalue)
    ADDOP_O(c, LOAD_CONST, constvalue, consts)
    VISIT(c, expr, expr)

This is a more general solution if we want to keep other compiler hints 
down the track, but unless somebody can think of another use case this 
is probably overkill.

3. Keep these particular optimizations at the bytecode level.

It would be great to be able to perform the optimizations at a higher 
level, but this would require no effort at all. This would mean two 
passes over the same code at two different levels.

If anybody familiar with this stuff could weigh in on the matter, it 
would be much appreciated. I've got a list of other issues that I need 
to discuss here, but this would be a great start.


More information about the Python-Dev mailing list