
On Thu, May 8, 2008 at 8:00 AM, Nick Coghlan ncoghlan@gmail.com wrote:
Jeremy Hylton wrote:
On Wed, May 7, 2008 at 11:43 AM, Thomas Lee tom@vector-seven.com wrote:
Nick Coghlan wrote:
As Thomas mentions in a later message, making it possible to annotate
nodes would permit Functions to be annotated as being a generator at the
AST
stage (currently it is left to the bytecode compiler's symtable
generation
pass to make that determination).
Although I guess an alternative solution to that would be to have
separate
AST nodes for Functions and Generators as well...
I've actually backtracked a little and gone back down the Const path
again.
I know this is the third time I've changed my mind, but it's primarily because annotations tend to get a little clunky (or my implementation
was,
at least). Using Const nodes feels a lot more natural inside the
optimizer.
I think it's going to stick, at least in the short term.
Rather than separate FunctionDef and GeneratorDef nodes, I think a new
bool
attribute (is_generator?) on FunctionDef should do the job nicely.
Further,
I'm thinking we can move the "generator detection code" from the
symtable
into Python/ast.c so the generator/function information is available to
the
optimizer.
Does the optimizer run after the symtable? I'd expect the symbol table information to be essential for almost all analysis, so it should be fine to wait until that pass to mark the generators.
There are a lot of micro-optimisations that are actually context independent, so moving them before the symtable pass should be quite feasible - e.g. replacing "return None" with "return", stripping dead code after a return statement, changing a "if not" statement into an "if" statement with the two suites reversed, changing "(1, 2, 3)" into a stored constant, folding "1 + 2" into the constant "3".
I believe the goal is to see how many of the current bytecode optimisations can actually be brought forward to the AST generation stage, rather than waiting until after the bytecode symtable calculation and compilation passes.
The current structure goes:
tokenisation->AST construction->symtable construction->bytecode compilation->bytecode optimisation
My understanding of what Thomas is trying to do is to make it look more like this:
tokenisation->AST construction->AST optimisation->symtable construction->bytecode compilation
There may still be a direct bytecode optimisation step left at the end if it turns out something relies on information that isn't worked out until the symtable construction pass (and that can't easily be worked out earlier, such as adding an 'is_generator' flag to the FunctionDef AST node).
The symbol table pass is not important for deciding where to place the optimizations. Either the optimization operates on the AST itself, in which case it needs to run before bytecode compilation or it operators on the lowered bytecode representation, in which case it needs to run after bytecode compilation. Since the symbol table passes doesn't change the representation, you ought to be able to put the optimizations before or after it.
But I haven't looked closely at the actual code, so maybe I'm missing something.
Jeremy
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
http://www.boredomandlaziness.org