[Compiler-sig] Visitor pattern
Finn Bock
bckfnn@worldonline.dk
Mon, 15 Apr 2002 14:01:34 GMT
[Jeremy]
>I've been meaning to get the compiler package's visitor documented,
>but I'm not yet sure where to start. (That is, I'm not sure which
>things are novel or unusual and which are obvious.)
>
>I think there are three key features:
>
> - Embed the name of the class being visited in the visitor
> method. This wouldn't be necessary in Java, I assume, because
> each method would have the same name buts its argument list
> would be different.
Java can do that to some extend but lets ignore that and just decide
that the visitor have method like visitAssign() and visitIf().
>
> - The AST walker exploits knowledge of the AST structure to
> provide a default visit() method that will work on any node
> type.
I think it is the walker idea that threw me off. When I hear 'visitor'
together with trees I immediately thinks the visitor pattern in the GOF
book, and there is no walker in that pattern.
> - Responsibility for implementing the pattern is divided between a
> visitor and a walker.
And with no responsibility placed in the nodes (other then the
getChildren() method), right ? That is OK, but it is a bit of a stretch
to call it a visitor (IMHO).
>The visitor implements a visitXXX() method for each node of interest.
>The walker takes a visitor instance and the root of a node graph. It
>applies the visitor methods to the node graph starting with the root.
>
>The second seems to apply regardless of language and can be quite
>convenient. If you're writing a simple symbol table visitor, you only
>care about a few of the node types. The If stmt, e.g., has no effect
>on the symbol table, only its children do. The default method makes
>it possible to write a visitor that only has methods for the nodes it
>cares about.
>
>If we have a full specification of the AST (we do: python.asdl) and we
>assume that the spec includes the children of each node in a "natural"
>order, then we can generate a visitor method automatically. By
>natural, I mean a pre-order traversal of the tree, which has always
>been what I've needed.
That is interesting, I can think of only one times where I would need a
autotraversel: When detecting fastlocals. When generating code I need to
control the traversal order myself.
>In the compiler package, each node type has a method getChildren()
>that returns a tuple containing the children in the natural order.
>
>class If:
>
> def getChildren(self):
> return self.test, self.body, self.orelse
Yuck! For performance reason in jython, I would much rather call a
method on the node that does the traversing:
class If:
def traverse(self, walker):
walker.dispatch(self.test)
for stmt in self.body:
walker.dispatch(stmt)
for stmt in self.orelse:
walker.dispatch(stmt)
That is because calling a method can be cheaper than creating a tuple.
I suppose that I will do something like that and then hide the
'traverse' method from python.
>We should be able to define these as well, since the python.asdl
>presents the children in the natural order.
>
>If we have a visitor that isn't interested in If nodes, then it simply
>doesn't define a visitIf() method. If the AST contains an If node,
>the default method of the walker handles it instead.
>
>The default method on the walker does the following:
>
> for child in node.getChildren():
> if child is not None:
> self.visit(child)
With my 'If' class above the 'default' method becomes:
child.traverse(this)
>The visit() method is defined by the walker. It takes an arbitrary
>node as its argument, looks up the appropriate method on the
>visitor, and calls it. The visitor can also use this method to
>delegate responsibility for method lookup to the walker.
>
>Does that help?
Yes it did, thanks.
A question still remain: if I have a visitTryExcept() method, how would
I then cause or prevent the default traversion of the children? In the
example below I assume that a visitXXX function must deal with its
children itself either one-by-one or by calling 'default()'.
>I'm not sure how much of this maps naturally from Python to Java.
I'm guessing the main problem will be whether the concrete visitor class
must inherit from some base class. I like that requirement, you probably
don't.
The requirement of a visitor base class is easily met if the visitor and
walker is joined into one class. That way an example visitor to find
potential fastlocals would look like this:
class FastLocalVisitor(ASTVisitor):
def __init__():
self.infunc = False
self.mode = 'GET'
def visitFuncDef(self, node):
self.infunc = True
self.default(node)
self.infunc = False
def visitName(self, node):
if self.infunc and self.mode == 'SET':
print node.id, "is a potential fastlocal"
def visitTryExcept(self, node):
self.mode = 'SET'
for e in node.handlers:
self.dispatch(e.name)
self.mode = 'GET'
for e in node.handlers:
self.dispatch(e.type)
self.dispatch(e.body)
self.dispatch(node.body)
self.dispatch(node.orelse)
def visitAssign(self, node):
self.mode = 'SET'
for t in node.targets:
self.dispatch(t)
self.mode = 'GET'
self.dispatch(node.value)
def visitFor(self, node):
self.mode = 'SET'
self.dispatch(node.target)
self.mode = 'GET'
self.dispatch(node.iter)
self.dispatch(node.body)
self.dispatch(node.value)
FastLocalVisitor().visit(tree)
So the ASTVisitor class is publicly defined as:
class ASTVisitor:
def default(self, node): # Should be called traverse IMO
"""Visit each of the children of node."""
def dispatch(self, node): # Should be called visit IMO
"""Visit the node (or call self.default())."""
def visitXXX(self, node):
"""Visit the node of type XXX. Defined in subclass"""
if we want to get fancy, we can consider these methods:
def post_visitXXX(self, node):
"""Visit the node of type XXX after the children have
been visited. Defined in subclass"""
def unhandled_node(self, node):
"""Called when no visitXXX method exists for the node. Defined
in subclass"""
def open_level(self, node):
"""Called just before the visitXXX is called. Defined
in subclass"""
def close_level(self, node):
"""Called just after the post_visitXXX is called. Defined
in subclass"""
I would also like to avoid documenting the use of cls.__name__ in the
dispatching. The cls.__name__ for a java class should not be depended on
because it looks like this: 'org.python.parser.ast.If'. I can deal with
this in the implementation of ASTVisitor.dispatch() but I wouldn't want
other application to depending too much on the __name__ of AST nodes. It
would be better if we added a class attribute to the nodes that
contained the official name.
regards,
finn