[Compiler-sig] compiler-sig project for Python 2.3: new AST
Jeremy Hylton
jeremy@zope.com
Tue, 19 Mar 2002 00:45:28 -0500
These are notes copied from a Wiki at
http://www.zope.org/Members/jeremy/CurrentAndFutureProjects/PythonAST

They are a bit rough, but they set out some of the work I'd like to do
for Python 2.3. Some of the work belongs in a PEP, but I'm not ready
to write it yet.

Please reply to compiler-sig@python.org.

Jeremy
A New AST for Python
--------------------
I have proposed some namespace optimizations for Python 2.3. These
optimizations require more analysis by the compiler, which is quite
difficult given the current parse tree representation. As a first step
towards those optimizations, I plan to introduce a new AST that makes
analysis easier. This step is a major undertaking by itself.

The primary goal of the new AST is to provide better intermediate
representation(s) so that the compiler is easier to maintain and
extend. One benefit is to enable optimizations by providing better
tools for analyzing programs during bytecode compilation.

I expect the new AST will be based largely on the one in the compiler
package (part of the standard library in 2.2), which was originally
done by Greg Stein and Bill Tutt.
Rough plan of action
--------------------
-- Define the AST.
   Settle on C and Python (and Java?) implementations of the AST.
   I like the look of the C code generated by asdlGen, but haven't had
   a chance to get the tool working. See Dan Wang's DSL 97 paper: The
   Zephyr Abstract Syntax Description Language.
   Could always write a new asdlGen tool for the targeted languages.
-- Write converter from current parse tree to AST.
   Basic functionality of compiler/transformer.py.
   Replace parser module.
   WingIDE folks have implemented something like this in C: parsetools.
-- Expose AST to Python programs.
   Replace parser module and parts of compiler package.
   asdlGen has a notion of AST pickles. Would that be sufficient?
-- An AST visitor in C?
   The visitor pattern is very useful for manipulating ASTs in Python.
   Could it be translated to C somehow?
-- Reimplement compile.c with new AST.
   Break it up into several passes: future stmts, doc strings, symbol
   table, code gen.
   Not sure how much of the code in the compiler package may be useful
   here. The codegen part is relatively good; the pyassem part is
   pretty ugly.
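
To make the visitor and multi-pass ideas in the plan concrete, here is
a minimal sketch in Python. Everything in it is an assumption made for
illustration: the node classes (Module, Assign, Name, Num, BinOp), the
Visitor base class, and the SymbolCollector pass are hypothetical
names, not the proposed AST and not the compiler package's API. It only
shows the shape: nodes declare their fields, a visitor dispatches on
the node's class name, and each compiler pass is one small visitor
subclass.

```python
# Hypothetical node classes for illustration only; not the proposed AST.
class Node:
    _fields = ()

    def __init__(self, *args):
        assert len(args) == len(self._fields)
        for name, value in zip(self._fields, args):
            setattr(self, name, value)

    def children(self):
        """Yield child nodes found in this node's fields, in order."""
        for name in self._fields:
            value = getattr(self, name)
            if isinstance(value, Node):
                yield value
            elif isinstance(value, list):
                for item in value:
                    if isinstance(item, Node):
                        yield item


class Module(Node):
    _fields = ("body",)

class Assign(Node):
    _fields = ("target", "value")

class Name(Node):
    _fields = ("id",)

class Num(Node):
    _fields = ("n",)

class BinOp(Node):
    _fields = ("left", "op", "right")


class Visitor:
    """Dispatch to visit_<ClassName>, falling back to generic_visit."""

    def visit(self, node):
        method = getattr(self, "visit_" + type(node).__name__,
                         self.generic_visit)
        return method(node)

    def generic_visit(self, node):
        for child in node.children():
            self.visit(child)


class SymbolCollector(Visitor):
    """Toy 'symbol table' pass: record assigned names and names read."""

    def __init__(self):
        self.defined = set()
        self.used = set()

    def visit_Assign(self, node):
        # Assumes the target is always a plain Name in this toy AST.
        self.defined.add(node.target.id)
        self.visit(node.value)

    def visit_Name(self, node):
        self.used.add(node.id)


# Corresponds to:  x = 1 ; y = x + z
tree = Module([
    Assign(Name("x"), Num(1)),
    Assign(Name("y"), BinOp(Name("x"), "+", Name("z"))),
])
collector = SymbolCollector()
collector.visit(tree)
```

After the visit, collector.defined holds x and y, and collector.used
holds x and z. A real symbol table pass would also need to track
scopes, and a C version would presumably need a generated switch or
dispatch table in place of the getattr lookup.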
Testing
-------
I overhauled the compiler for Python 2.1 to implement nested scopes. I
spent a lot of time debugging the various compiler passes, usually
because I forgot to handle a particular kind of grammar production.
The new AST should make this a lot easier, but I would also like to
see a thorough test suite for the new code.

I don't know anything about best practices for testing compilers. I
imagine that a system to generate sample inputs from the language
grammar would be helpful for debugging the C code that walks the AST.
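
As a rough idea of what generating sample inputs from a grammar could
look like, here is a small Python sketch. The GRAMMAR table is a toy
expression grammar invented for this example, not Python's real
Grammar file, and generate() and max_depth are hypothetical names. It
expands a start symbol by picking random productions, forces the
shortest alternative once a depth limit is reached so that expansion
terminates, and then feeds every sample to the built-in compile() as a
smoke test.

```python
import random

# Toy grammar (an assumption for illustration; not Python's Grammar file).
# Each nonterminal maps to a list of alternatives; each alternative is a
# list of symbols. Symbols that are not keys are emitted literally.
GRAMMAR = {
    "stmt": [["NAME", " = ", "expr"]],
    "expr": [["term"], ["term", " + ", "expr"], ["term", " * ", "expr"]],
    "term": [["NAME"], ["NUMBER"], ["(", "expr", ")"]],
    "NAME": [["x"], ["y"], ["z"]],
    "NUMBER": [["0"], ["1"], ["42"]],
}


def generate(symbol, rng, depth=0, max_depth=6):
    """Expand `symbol` into a source string using random productions."""
    if symbol not in GRAMMAR:
        return symbol  # terminal text
    alternatives = GRAMMAR[symbol]
    if depth >= max_depth:
        # Past the depth limit, take the shortest alternative so that
        # the recursion is guaranteed to bottom out.
        alternatives = [min(alternatives, key=len)]
    production = rng.choice(alternatives)
    return "".join(generate(part, rng, depth + 1, max_depth)
                   for part in production)


# Seeded generators keep any failure reproducible.
samples = [generate("stmt", random.Random(seed)) for seed in range(20)]

# Smoke test: every generated statement must be accepted by the compiler.
for source in samples:
    compile(source, "<generated>", "exec")
```

Seeding each sample separately means a failing input can be regenerated
from its seed alone. Random expansion does not guarantee that every
production is exercised, so a real test suite would probably also want
some form of coverage tracking over the grammar.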