Optimization of Python ASTs: How should we deal with constant values?

Hi all, I've been working on optimization of the AST, including the porting of the old bytecode-level optimizations to the AST level. A few questions have come up in the process of doing this, all of which are probably appropriate for discussion on this list. The code I'm referring to here can be found in the tlee-ast-optimize branch. Most of the relevant code is in Python/optimize.c and Python/peephole.c. Nearly all of the bytecode-level optimizations have been moved to the AST optimizer with a few exceptions. Most of those waiting to be ported are stuck in limbo due to the fact we can't yet inject arbitrary PyObject constants into the AST. Examples are tuples of constants and the optimization of "LOAD_GLOBAL/LOAD_NAME None" as "LOAD_CONST None". This leaves us with a few options: 1. A new AST expr node for constant values for types other than Str/Num I imagine this to be something like Const(PyObject* v), which is effectively translated to a "LOAD_CONST v" by the compiler. This trades the purity of the AST for a little practicality. A "Const" node has no real source representation, it would exist solely for the purpose of injecting PyObject constants into the AST. 2. Allow arbitrary annotations to be applied to the AST as compiler hints. For example, each AST node might have an optional dict that contains a set of annotation values. Then when traversing the AST, the compiler might do something along the lines of: if (expr->annotations) { PyObject* constvalue = PyDict_GetItemString(expr->annotations, "constantvalue"); if (constvalue) ADDOP_O(c, LOAD_CONST, constvalue, consts) else VISIT(c, expr, expr) } This is a more general solution if we want to keep other compiler hints down the track, but unless somebody can think of another use case this is probably overkill. 3. Keep these particular optimizations at the bytecode level. It would be great to be able to perform the optimizations at a higher level, but this would require no effort at all. This would mean two passes over the same code at two different levels. If anybody familiar with this stuff could weigh in on the matter, it would be much appreciated. I've got a list of other issues that I need to discuss here, but this would be a great start. Thanks, Tom

On Wed, Apr 30, 2008 at 3:15 AM, Thomas Lee <tom@vector-seven.com> wrote:
Slight issue with this is that people like Jython are standardizing on our AST, so adding to it does up their burden if they have no use for it.
Possibly, but the AST stuff is so new who knows if it truly will be overkill or not down the road. I personally prefer this approach.
This might end up being the case, but I would rather avoid the overhead if possible. The question becomes how many other possible optimizations might come up that would only work reasonably at the bytecode level. And you forgot option 4): ditching the optimizations that only work on the bytecode. I am not advocating this, but it is an option. -Brett

This leaves us with a few options:
5. Reuse/Abuse Num(object) for arbitrary constants. AFAICT, this should work out of the box.
I think this is the way to go. It doesn't violate purity: it is an *abstract* syntax, meaning that there doesn't need to be a 1:1 relationship to source syntax. However, it is still possible to reproduce source code from this Const node. I also don't worry about Jython conflicts. The grammar has a version number precisely so that you can refer to a specific version if you need to. Regards, Martin

Martin v. Löwis wrote:
Eek. It *does* seem like Num would work out of the box, but would this be a good idea? What about *replacing* Num with Const? Might make optimizations specifically for numeric values slightly hairier, and semantically I think they might be different enough to warrant separate AST nodes despite the similarity in implementation at the compiler level. FWIW, I read Num as "numeric literal" and Const as "arbitrary constant", but that's probably only because I've seen the immediate need for constants with non-Num values in the process of writing the AST optimizer.
I'm leaning toward this, too. It's dirt simple and quite clean to implement.
Any Jython folk care to weigh in on this? If there are no major objections I think I'm going to forge ahead with an independant Const() node. Cheers, T

Jeremy Hylton wrote:
As Thomas mentions in a later message, making it possible to annotate nodes would permit Functions to be annotated as being a generator at the AST stage (currently it is left to the bytecode compiler's symtable generation pass to make that determination). Although I guess an alternative solution to that would be to have separate AST nodes for Functions and Generators as well... Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org

Nick Coghlan wrote:
I've actually backtracked a little and gone back down the Const path again. I know this is the third time I've changed my mind, but it's primarily because annotations tend to get a little clunky (or my implementation was, at least). Using Const nodes feels a lot more natural inside the optimizer. I think it's going to stick, at least in the short term. Rather than separate FunctionDef and GeneratorDef nodes, I think a new bool attribute (is_generator?) on FunctionDef should do the job nicely. Further, I'm thinking we can move the "generator detection code" from the symtable into Python/ast.c so the generator/function information is available to the optimizer. This is made a little tricky by the absence of the contextual information available that is normally available when flagging generators in the symtable. When generating AST nodes for a suite, we know nothing about the parent node in which the suite resides. Still, it might be doable. If this winds up being ugly, we might need to fall back to the original plan of a separate pass over function bodies to detect yield expressions. I'll look into all this tomorrow night, along with any other crazy suggestions. For now I need to sleep a few hours. :) Thanks for the feedback, it's much appreciated. Cheers, T

On Wed, May 7, 2008 at 11:43 AM, Thomas Lee <tom@vector-seven.com> wrote:
Does the optimizer run after the symtable? I'd expect the symbol table information to be essential for almost all analysis, so it should be fine to wait until that pass to mark the generators. Jeremy

Jeremy Hylton wrote:
There are a lot of micro-optimisations that are actually context independent, so moving them before the symtable pass should be quite feasible - e.g. replacing "return None" with "return", stripping dead code after a return statement, changing a "if not" statement into an "if" statement with the two suites reversed, changing "(1, 2, 3)" into a stored constant, folding "1 + 2" into the constant "3". I believe the goal is to see how many of the current bytecode optimisations can actually be brought forward to the AST generation stage, rather than waiting until after the bytecode symtable calculation and compilation passes. The current structure goes: tokenisation->AST construction->symtable construction->bytecode compilation->bytecode optimisation My understanding of what Thomas is trying to do is to make it look more like this: tokenisation->AST construction->AST optimisation->symtable construction->bytecode compilation There may still be a direct bytecode optimisation step left at the end if it turns out something relies on information that isn't worked out until the symtable construction pass (and that can't easily be worked out earlier, such as adding an 'is_generator' flag to the FunctionDef AST node). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org

On Thu, May 8, 2008 at 8:00 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
The symbol table pass is not important for deciding where to place the optimizations. Either the optimization operates on the AST itself, in which case it needs to run before bytecode compilation or it operators on the lowered bytecode representation, in which case it needs to run after bytecode compilation. Since the symbol table passes doesn't change the representation, you ought to be able to put the optimizations before or after it. But I haven't looked closely at the actual code, so maybe I'm missing something. Jeremy

Nick Coghlan wrote:
That's been the aim so far. It's been largely successful with the exception of a few edge cases (most notably the functions vs. generator stuff). The elimination of unreachable paths (whether they be things like "if 0: ..." or "return; ... more code ...") completely breaks generators since we might potentially be blowing away "yield" statements during the elimination process. The rest of the optimizations, as far as I can see, are much less scary.
That's exactly right. I made a quick and dirty attempt at moving the AST optimization step after the symtable generation on the train home last night to see if Jeremy's suggestion gives us anything. Now, it does make the detection of generators a little easier, but it really doesn't give us all that much else. I'm happy to post a patch so you can see what I mean, but for now I think visiting the FunctionDef subtree to check for Yield nodes will be fine. Again, I might be wrong (as I've often been throughout this process!) but let's see how it goes. Obviously a bit less efficient, but function bodies really shouldn't be all that deep anyway. I've got a good idea about how I'm going to go forward with this. Cheers, T

Thomas Lee wrote:
While not strictly related to the global statement, perhaps Adam refers to the possibility of optimizing away code with an assignment which would make a name be recognized as local? If you're worried about "yield" disappearing you should also be worried about assignments disappearing, since that might cause names to be interpreted as globals. regards Steve -- Steve Holden +1 571 484 6266 +1 800 494 3119 Holden Web LLC http://www.holdenweb.com/

Steve Holden wrote:
And once you start annotating functions as generators or not, and variable names as locals or cell variables or globals, you're starting to build up a substantial fraction of the information that is already collected during the symtable construction pass. Perhaps the initial attempt at this should just focus on identifying those operations which have the potential to alter the results of the symtable construction, and leave those to the bytecode optimisation step for the moment. Doing the symtable pass twice seems fairly undesirable, even if it does let us trim some dead code out of the AST. Cheers, Nick. P.S. Here's a nonsensical example where optimising away the dead code after the return statement would significantly change the code's semantics:
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org

On Thu, May 8, 2008 at 5:54 PM, Thomas Lee <tom@vector-seven.com> wrote:
Here we are. In 2.4.4:
2.5.1 correctly raises an UnboundLocalError instead. Searching the bug DB for old bugs involving the optimizer showed several variations on this, such as "if 0: yield None" at module scope not raising a SyntaxError (Issue 1875, still open in trunk). Clearly there needs to be a general approach to avoid affecting these checks with optimizations. -- Adam Olsen, aka Rhamphoryncus

On Wed, Apr 30, 2008 at 3:15 AM, Thomas Lee <tom@vector-seven.com> wrote:
Slight issue with this is that people like Jython are standardizing on our AST, so adding to it does up their burden if they have no use for it.
Possibly, but the AST stuff is so new who knows if it truly will be overkill or not down the road. I personally prefer this approach.
This might end up being the case, but I would rather avoid the overhead if possible. The question becomes how many other possible optimizations might come up that would only work reasonably at the bytecode level. And you forgot option 4): ditching the optimizations that only work on the bytecode. I am not advocating this, but it is an option. -Brett

This leaves us with a few options:
5. Reuse/Abuse Num(object) for arbitrary constants. AFAICT, this should work out of the box.
I think this is the way to go. It doesn't violate purity: it is an *abstract* syntax, meaning that there doesn't need to be a 1:1 relationship to source syntax. However, it is still possible to reproduce source code from this Const node. I also don't worry about Jython conflicts. The grammar has a version number precisely so that you can refer to a specific version if you need to. Regards, Martin

Martin v. Löwis wrote:
Eek. It *does* seem like Num would work out of the box, but would this be a good idea? What about *replacing* Num with Const? Might make optimizations specifically for numeric values slightly hairier, and semantically I think they might be different enough to warrant separate AST nodes despite the similarity in implementation at the compiler level. FWIW, I read Num as "numeric literal" and Const as "arbitrary constant", but that's probably only because I've seen the immediate need for constants with non-Num values in the process of writing the AST optimizer.
I'm leaning toward this, too. It's dirt simple and quite clean to implement.
Any Jython folk care to weigh in on this? If there are no major objections I think I'm going to forge ahead with an independant Const() node. Cheers, T

Jeremy Hylton wrote:
As Thomas mentions in a later message, making it possible to annotate nodes would permit Functions to be annotated as being a generator at the AST stage (currently it is left to the bytecode compiler's symtable generation pass to make that determination). Although I guess an alternative solution to that would be to have separate AST nodes for Functions and Generators as well... Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org

Nick Coghlan wrote:
I've actually backtracked a little and gone back down the Const path again. I know this is the third time I've changed my mind, but it's primarily because annotations tend to get a little clunky (or my implementation was, at least). Using Const nodes feels a lot more natural inside the optimizer. I think it's going to stick, at least in the short term. Rather than separate FunctionDef and GeneratorDef nodes, I think a new bool attribute (is_generator?) on FunctionDef should do the job nicely. Further, I'm thinking we can move the "generator detection code" from the symtable into Python/ast.c so the generator/function information is available to the optimizer. This is made a little tricky by the absence of the contextual information available that is normally available when flagging generators in the symtable. When generating AST nodes for a suite, we know nothing about the parent node in which the suite resides. Still, it might be doable. If this winds up being ugly, we might need to fall back to the original plan of a separate pass over function bodies to detect yield expressions. I'll look into all this tomorrow night, along with any other crazy suggestions. For now I need to sleep a few hours. :) Thanks for the feedback, it's much appreciated. Cheers, T

On Wed, May 7, 2008 at 11:43 AM, Thomas Lee <tom@vector-seven.com> wrote:
Does the optimizer run after the symtable? I'd expect the symbol table information to be essential for almost all analysis, so it should be fine to wait until that pass to mark the generators. Jeremy

Jeremy Hylton wrote:
There are a lot of micro-optimisations that are actually context independent, so moving them before the symtable pass should be quite feasible - e.g. replacing "return None" with "return", stripping dead code after a return statement, changing a "if not" statement into an "if" statement with the two suites reversed, changing "(1, 2, 3)" into a stored constant, folding "1 + 2" into the constant "3". I believe the goal is to see how many of the current bytecode optimisations can actually be brought forward to the AST generation stage, rather than waiting until after the bytecode symtable calculation and compilation passes. The current structure goes: tokenisation->AST construction->symtable construction->bytecode compilation->bytecode optimisation My understanding of what Thomas is trying to do is to make it look more like this: tokenisation->AST construction->AST optimisation->symtable construction->bytecode compilation There may still be a direct bytecode optimisation step left at the end if it turns out something relies on information that isn't worked out until the symtable construction pass (and that can't easily be worked out earlier, such as adding an 'is_generator' flag to the FunctionDef AST node). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org

On Thu, May 8, 2008 at 8:00 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
The symbol table pass is not important for deciding where to place the optimizations. Either the optimization operates on the AST itself, in which case it needs to run before bytecode compilation or it operators on the lowered bytecode representation, in which case it needs to run after bytecode compilation. Since the symbol table passes doesn't change the representation, you ought to be able to put the optimizations before or after it. But I haven't looked closely at the actual code, so maybe I'm missing something. Jeremy

Nick Coghlan wrote:
That's been the aim so far. It's been largely successful with the exception of a few edge cases (most notably the functions vs. generator stuff). The elimination of unreachable paths (whether they be things like "if 0: ..." or "return; ... more code ...") completely breaks generators since we might potentially be blowing away "yield" statements during the elimination process. The rest of the optimizations, as far as I can see, are much less scary.
That's exactly right. I made a quick and dirty attempt at moving the AST optimization step after the symtable generation on the train home last night to see if Jeremy's suggestion gives us anything. Now, it does make the detection of generators a little easier, but it really doesn't give us all that much else. I'm happy to post a patch so you can see what I mean, but for now I think visiting the FunctionDef subtree to check for Yield nodes will be fine. Again, I might be wrong (as I've often been throughout this process!) but let's see how it goes. Obviously a bit less efficient, but function bodies really shouldn't be all that deep anyway. I've got a good idea about how I'm going to go forward with this. Cheers, T

Thomas Lee wrote:
While not strictly related to the global statement, perhaps Adam refers to the possibility of optimizing away code with an assignment which would make a name be recognized as local? If you're worried about "yield" disappearing you should also be worried about assignments disappearing, since that might cause names to be interpreted as globals. regards Steve -- Steve Holden +1 571 484 6266 +1 800 494 3119 Holden Web LLC http://www.holdenweb.com/

Steve Holden wrote:
And once you start annotating functions as generators or not, and variable names as locals or cell variables or globals, you're starting to build up a substantial fraction of the information that is already collected during the symtable construction pass. Perhaps the initial attempt at this should just focus on identifying those operations which have the potential to alter the results of the symtable construction, and leave those to the bytecode optimisation step for the moment. Doing the symtable pass twice seems fairly undesirable, even if it does let us trim some dead code out of the AST. Cheers, Nick. P.S. Here's a nonsensical example where optimising away the dead code after the return statement would significantly change the code's semantics:
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org

On Thu, May 8, 2008 at 5:54 PM, Thomas Lee <tom@vector-seven.com> wrote:
Here we are. In 2.4.4:
2.5.1 correctly raises an UnboundLocalError instead. Searching the bug DB for old bugs involving the optimizer showed several variations on this, such as "if 0: yield None" at module scope not raising a SyntaxError (Issue 1875, still open in trunk). Clearly there needs to be a general approach to avoid affecting these checks with optimizations. -- Adam Olsen, aka Rhamphoryncus
participants (8)
-
"Martin v. Löwis"
-
Adam Olsen
-
Brett Cannon
-
Frank Wierzbicki
-
Jeremy Hylton
-
Nick Coghlan
-
Steve Holden
-
Thomas Lee