[Python-Dev] Optimization of Python ASTs: How should we deal with constant values?

Jeremy Hylton jeremy at alum.mit.edu
Thu May 8 15:18:33 CEST 2008


On Thu, May 8, 2008 at 8:00 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Jeremy Hylton wrote:
>
> > On Wed, May 7, 2008 at 11:43 AM, Thomas Lee <tom at vector-seven.com> wrote:
> >
> > > Nick Coghlan wrote:
> > >
> > >
> > > > As Thomas mentions in a later message, making it possible to annotate
> > > >
> > > nodes would permit Functions to be annotated as being a generator at the
> AST
> > > stage (currently it is left to the bytecode compiler's symtable
> generation
> > > pass to make that determination).
> > >
> > > > Although I guess an alternative solution to that would be to have
> separate
> > > >
> > > AST nodes for Functions and Generators as well...
> > >
> > > >
> > > >
> > >  I've actually backtracked a little and gone back down the Const path
> again.
> > > I know this is the third time I've changed my mind, but it's primarily
> > > because annotations tend to get a little clunky (or my implementation
> was,
> > > at least). Using Const nodes feels a lot more natural inside the
> optimizer.
> > > I think it's going to stick, at least in the short term.
> > >
> > >  Rather than separate FunctionDef and GeneratorDef nodes, I think a new
> bool
> > > attribute (is_generator?) on FunctionDef should do the job nicely.
> Further,
> > > I'm thinking we can move the "generator detection code" from the
> symtable
> > > into Python/ast.c so the generator/function information is available to
> the
> > > optimizer.
> > >
> >
> > Does the optimizer run after the symtable?  I'd expect the symbol
> > table information to be essential for almost all analysis, so it
> > should be fine to wait until that pass to mark the generators.
> >
>
>  There are a lot of micro-optimisations that are actually context
> independent, so moving them before the symtable pass should be quite
> feasible - e.g. replacing "return None" with "return", stripping dead code
> after a return statement, changing a "if not" statement into an "if"
> statement with the two suites reversed, changing "(1, 2, 3)" into a stored
> constant, folding "1 + 2" into the constant "3".
>
>  I believe the goal is to see how many of the current bytecode optimisations
> can actually be brought forward to the AST generation stage, rather than
> waiting until after the bytecode symtable calculation and compilation
> passes.
>
>  The current structure goes:
>
>  tokenisation->AST construction->symtable construction->bytecode
> compilation->bytecode optimisation
>
>  My understanding of what Thomas is trying to do is to make it look more
> like this:
>
>  tokenisation->AST construction->AST optimisation->symtable
> construction->bytecode compilation
>
>  There may still be a direct bytecode optimisation step left at the end if
> it turns out something relies on information that isn't worked out until the
> symtable construction pass (and that can't easily be worked out earlier,
> such as adding an 'is_generator' flag to the FunctionDef AST node).

The symbol table pass is not important for deciding where to place the
optimizations.  Either the optimization operates on the AST itself, in
which case it needs to run before bytecode compilation or it operators
on the lowered bytecode representation, in which case it needs to run
after bytecode compilation.  Since the symbol table passes doesn't
change the representation, you ought to be able to put the
optimizations before or after it.

But I haven't looked closely at the actual code, so maybe I'm missing something.

Jeremy

>
>
>
>  Cheers,
>  Nick.
>
>  --
>  Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
>  ---------------------------------------------------------------
>             http://www.boredomandlaziness.org
>


More information about the Python-Dev mailing list