[Python-Dev] [compatibility-sig] do all VMs implement the ast module? (was: Re: AST optimizer implemented in Python)

Tue Aug 14 01:31:47 CEST 2012

On Mon, Aug 13, 2012 at 6:35 PM, fwierzbicki at gmail.com <
fwierzbicki at gmail.com> wrote:

> On Mon, Aug 13, 2012 at 3:10 PM, Brett Cannon <brett at python.org> wrote:
>
> > Direct. There is an AST grammar file that gets compiled into C and Python
> > objects which are used by the compiler (c version) or exposed to users
> > (Python version).
> At the risk of making you repeat yourself, and just to be sure I
> understand: There are C objects used by the compiler and Python
> objects that are exposed to the users (written in C though) that are
> generated by the AST grammar.

Both sets of objects are generated from the grammar. The Python version is
an extension module wrapping the C structs (the C version of the AST); the
fields of the structs and the names of the types are the same whether you
are looking at the C or the Python side. Converting between the two is just
a matter of allocating memory and copying data from one struct to another.

> That at least sounds like they are
> different.

Are you asking if we pass the objects through transparently, or if they are
just the same API? They are the same API since the AST nodes used by the
compiler just have an extension exposing them that has the same names,
fields, etc. But to expose the API to Python code the C-level objects are
taken, pulled apart, and used to populate an exact API copy of them as
Python objects (i.e. the ast2obj_* functions defined in Python/Python-ast.c).

To try to make this really clear, consider the Assign node type. At the C
level it's just a struct::

                struct {
                        asdl_seq *targets;
                        expr_ty value;
                } Assign;

An asdl_seq is just an array of AST nodes. So converting to Python objects
is just a matter of allocating the equivalent Assign_type (which is a
PyTypeObject), and then populating its 'targets' attribute with a list of
the AST nodes and its 'value' attribute with the corresponding expr
instance. It's all very mechanical since it is all code-generated.

> The last I checked the grammar was Python.asdl and the
> translator was asdl_c.py resulting in /Python/Python-ast.c which looks
> like it is the implementation of _ast.py
>

There is no _ast.py, only Lib/ast.py which just provides helper code for
working with the AST (e.g. a NodeVisitor class). The builtin _ast module
comes from Python/Python-ast.c.
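
A small sketch of how the two modules relate, assuming a CPython
interpreter: the node classes reachable through ast are the ones defined by
_ast, while NodeVisitor is helper code that lives in Lib/ast.py. The
AssignCounter class is made up for the example.

```python
import ast
import _ast

# On CPython, ast re-exports the node classes from the builtin _ast module.
same_class = ast.Assign is _ast.Assign

class AssignCounter(ast.NodeVisitor):
    """Count Assign nodes using the helper class provided by Lib/ast.py."""
    def __init__(self):
        self.count = 0

    def visit_Assign(self, node):
        self.count += 1
        self.generic_visit(node)  # keep walking into child nodes

counter = AssignCounter()
counter.visit(ast.parse("x = 1\ny = x\n"))
print(same_class, counter.count)
```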

>
> Are the AST objects from Python-ast.c used by the compiler? And what
> is the relationship between Python-ast.c and /Python/ast.c? And what
> about the CST mentioned at the top of /Python/ast.c?
>

http://docs.python.org/devguide/compiler.html explains it all.

>
> I ask all of this because I want to be sure that separating the
> internal AST in Jython from the one exposed in ast.py is really a good
> idea. If CPython does not make this distinction that will be a strike
> against the idea.
>

As I said, it depends on whether you mean API or actual objects. The
compiler itself uses C objects which are nothing more than structs and
unions. The AST exposed by the _ast module uses the same names, fields,
etc., but its nodes are actual Python objects instead of structs and
unions. The separation allows
the compiler to save on memory costs by only using structs instead of a
complete PyObject struct which would have tons of stuff that the compiler
doesn't need (e.g. the AST has no methods so why waste memory on PyObject
allocation for method slots that will never be set?).
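
The "same names, fields" point can be checked from Python with a short
sketch like the one below. It only inspects the Python-side objects; the
exact contents of _fields vary between Python versions, so the code looks
for the two fields of the C struct shown earlier rather than comparing the
whole tuple.

```python
import ast

# Parse a single assignment and look at the Python-side node it produces.
node = ast.parse("x = 1").body[0]
fields = type(node)._fields

# The names mirror the C struct: a sequence of targets and a single value.
has_struct_fields = "targets" in fields and "value" in fields
print(type(node).__name__, has_struct_fields)
print(isinstance(node.targets, list), isinstance(node.value, ast.expr))
```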