[Python-Dev] a Python interface for the AST (WAS: DRAFT: python-dev...)

Brett Cannon bcannon at gmail.com
Wed Nov 23 01:02:40 CET 2005

On 11/22/05, Steven Bethard <steven.bethard at gmail.com> wrote:
> I wrote (in the summary):
> > While there is no interface to the AST yet, one is
> > intended for the not-so-distant future.
> Simon Burton wrote:
> > who is doing this ? I am mad keen to get this happening.
> Brett Cannon wrote:
> > No one yet.  Some ideas have been tossed around (read the thread for
> > details), but no one has sat down to hammer out the details.  Might
> > happen at PyCon.
> Simon Burton wrote:
> > Yes, I've been reading the threads but I don't see anything
> > about a python interface. Why I'm asking is because I could
> > probably convince my employer to let me (or an intern) work
> > on it. And PyCon is not until February. I am likely to start
> > hacking on this before then.
> Basically, all I saw was your post asking for a Python interface[1],
> and a few "not yet" responses.  I suspect that if you were to
> volunteer to head up the work on the Python interface, no one would be
> likely to stop you. ;-)
> [1]http://mail.python.org/pipermail/python-dev/2005-October/057611.html

All of the discussion so far has just been "we hope to have it some
day" with no real planning.  =)  There are two problems here: how to
get the AST structs into Python objects, and how to allow Python code
to modify the AST before bytecode emission (or perhaps even after, for
in-place optimization).

To get the AST into Python objects, there are two options.  One is to
use the AST grammar to generate code that converts struct ->
serialized form -> Python objects and vice-versa.  There might be some
rough code already there that emits a string that looks like Scheme
code representing the AST.  Python code could then use that string to
build objects, manipulate them, translate them back into the
serialized form, and from there back into the AST structs.  It sounds
like a lot, but with the grammar right there the conversion code
should be possible to generate automatically.
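
To make the round trip concrete, here is a rough sketch of the Python
side only.  Everything in it is an assumption for illustration: the
node names, the nested-tuple representation, and the ``dump``,
``load`` and ``fold`` helpers are made up, and the real serialized
form (whatever the compiler ends up emitting) may well look different.

```python
# Hypothetical sketch: a toy AST as nested tuples, e.g.
# ("BinOp", ("Num", 1), "Add", ("Num", 2)).  Not an existing API.

def dump(node):
    """Serialize a nested tuple into a Scheme-like string."""
    if isinstance(node, tuple):
        return "(" + " ".join(dump(child) for child in node) + ")"
    return str(node)


def _read(tokens, pos):
    """Parse one item starting at tokens[pos]; return (item, next_pos)."""
    token = tokens[pos]
    if token == "(":
        items = []
        pos += 1
        while tokens[pos] != ")":
            item, pos = _read(tokens, pos)
            items.append(item)
        return tuple(items), pos + 1
    if token.lstrip("-").isdigit():
        return int(token), pos + 1
    return token, pos + 1


def load(text):
    """Rebuild the nested tuple from its serialized form."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    node, _ = _read(tokens, 0)
    return node


def _is_num(node):
    return isinstance(node, tuple) and len(node) == 2 and node[0] == "Num"


def fold(node):
    """Example manipulation: fold additions of two constants."""
    if not isinstance(node, tuple):
        return node
    node = tuple(fold(child) for child in node)
    if (len(node) == 4 and node[0] == "BinOp" and node[2] == "Add"
            and _is_num(node[1]) and _is_num(node[3])):
        return ("Num", node[1][1] + node[3][1])
    return node


tree = ("BinOp", ("Num", 1), "Add", ("Num", 2))
text = dump(tree)               # what the C side would hand to Python
changed = dump(fold(load(text)))  # what Python would hand back
```

The C side would need the matching struct -> string and string ->
struct converters, which is the part the grammar could generate.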

The other option is to have all AST structs be contained in PyObjects.
Neil suggested this as a fix for the whole memory problem, since we
could then just do proper refcounting and we all know how to do that
(supposedly =).  With that in place, all it takes to get access is to
pass out the PyObject of the root and make sure that the proper
attributes or accessor methods (I prefer the former) are available.
Once again this can be auto-generated from the AST grammar.
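
As a rough illustration of attribute access on generated node types,
here is a pure-Python sketch.  The ``GRAMMAR`` table, the ``Node``
base class and ``generate()`` are hypothetical stand-ins; the real
thing would be C types generated from the actual AST grammar, and the
field names below are only guesses.

```python
# Hypothetical sketch: node classes generated from a grammar-like table.
# The table is a made-up stand-in for the real AST grammar file.
GRAMMAR = {
    "Num": ("n",),
    "Name": ("id",),
    "BinOp": ("left", "op", "right"),
}


class Node(object):
    """Base class; each generated subclass lists its fields in _fields."""
    _fields = ()

    def __init__(self, *values):
        if len(values) != len(self._fields):
            raise TypeError("%s takes %d arguments"
                            % (type(self).__name__, len(self._fields)))
        for name, value in zip(self._fields, values):
            setattr(self, name, value)

    def __repr__(self):
        args = ", ".join(repr(getattr(self, f)) for f in self._fields)
        return "%s(%s)" % (type(self).__name__, args)


def generate(grammar):
    """Create one Node subclass per grammar entry."""
    return dict((name, type(name, (Node,), {"_fields": fields}))
                for name, fields in grammar.items())


classes = generate(GRAMMAR)
Num, Name, BinOp = classes["Num"], classes["Name"], classes["BinOp"]

root = BinOp(Name("x"), "+", Num(2))
root.right.n = 3   # plain attribute access, no accessor methods
```

Handing out ``root`` is then all that is needed; normal refcounting
keeps the children alive for as long as anything refers to them.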

The second problem is where to give access to the AST from within
Python.  One place is the command-line.  One could specify the paths
to function objects (using import syntax, e.g.,
``optimizations.static.folding``) on the command-line, and those
functions would then be applied every time bytecode is generated.
Another possibility is to have an iterable in sys that is iterated
over every time something has bytecode generated.  Each item the
iterator returned would be a function that took in an AST object and
returned an AST object.  Another possibility is to have a function
(like ``ast()`` as a built-in) that is passed a code object and
returns the AST for that code object.  If a function was also provided
that took an AST and returned the bytecode, then AST access could be
given selectively instead of being applied across the board (this
could allow for decorators that performed AST optimizations or even
hotshot stuff).
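
Here is a rough sketch of how the hook list and the command-line paths
could fit together.  It assumes an ``ast`` module with ``parse()`` and
``NodeTransformer``, and a ``compile()`` that accepts an AST object,
as stand-ins for the interface being discussed.  ``ast_hooks``,
``resolve()``, ``fold_constants()`` and ``compile_with_hooks()`` are
hypothetical names; nothing like them exists in sys today.

```python
import ast
import importlib

# Hypothetical stand-in for the proposed iterable in sys.
ast_hooks = []


def resolve(dotted_path):
    """Turn 'package.module.func', as given on the command line,
    into the function object it names."""
    module_name, _, attr = dotted_path.rpartition(".")
    return getattr(importlib.import_module(module_name), attr)


class _FoldAdd(ast.NodeTransformer):
    """Replace <int constant> + <int constant> with the result."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold nested additions first
        if (isinstance(node.op, ast.Add)
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)
                and isinstance(node.left.value, int)
                and isinstance(node.right.value, int)):
            folded = ast.Constant(node.left.value + node.right.value)
            return ast.copy_location(folded, node)
        return node


def fold_constants(tree):
    """A hook: takes an AST object and returns an AST object."""
    return ast.fix_missing_locations(_FoldAdd().visit(tree))


def compile_with_hooks(source, filename="<sketch>", mode="exec"):
    """Run every registered hook over the AST, then emit bytecode."""
    tree = ast.parse(source, filename, mode)
    for hook in ast_hooks:
        tree = hook(tree)
    return compile(tree, filename, mode)


ast_hooks.append(fold_constants)
namespace = {}
exec(compile_with_hooks("x = 1 + 2"), namespace)
print(namespace["x"])  # 3
```

A command-line option would only need to call ``resolve()`` on each
path it was given and append the results to the hook list.  Selective
use (say, from a decorator) would call a hook and the final compile
step directly instead of going through the global list.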

Obviously this is all pie-in-the-sky stuff.  Getting the memory leak
situation resolved is a bigger priority in my mind than any of this.
But if I had my way I think that having all AST objects be PyObjects
and then providing support for all three ways of getting access to the
AST (command-line, sys iterable, function for specific code object)
would be fantastic.

