[Python-ideas] AST Transformation Hooks for Domain Specific Languages

Nick Coghlan ncoghlan at gmail.com
Fri Apr 8 13:32:47 CEST 2011


(Oops, let's try that again with the correct destination address this time...)

A few odds and ends from recent discussions finally clicked into
something potentially interesting earlier this evening. Or possibly
just something insane. I'm not quite decided on that point as yet (but
leaning towards the latter).

Anyway, without further ado, I present:

AST Transformation Hooks for Domain Specific Languages
======================================================

Consider:

# In some other module
ast.register_dsl("dsl.sql", dsl.sql.TransformAST)

# In a module using that DSL
import dsl.sql
def lookup_address(name : dsl.sql.char, dob : dsl.sql.date) from dsl.sql:
   select address
   from people
   where name = {name} and dob = {dob}


Suppose that the standard AST for the latter looked something like:

   DSL(syntax="dsl.sql",
       name='lookup_address',
       args=arguments(
           args=[arg(arg='name',
                     annotation=<Normal AST for "dsl.sql.char">),
                 arg(arg='dob',
                     annotation=<Normal AST for "dsl.sql.date">)],
           vararg=None, varargannotation=None,
           kwonlyargs=[], kwarg=None, kwargannotation=None,
           defaults=[], kw_defaults=[]),
       body=[Expr(value=Str(s='select address\nfrom people\nwhere name = {name} and dob = {dob}'))],
       decorator_list=[],
       returns=None)

(For those not familiar with the AST, the above is actually just the
existing FunctionDef node with a "syntax" attribute added)

At *compile* time (note, *not* function definition time), the
registered AST transformation hook would be invoked and would replace
that DSL node with "standard" AST nodes.

For example, depending on the design of the DSL and its support code,
the above example might be equivalent to:

   @dsl.sql.escape_and_validate_args
   def lookup_address(name: dsl.sql.char, dob: dsl.sql.date):
      args = dict(name=name, dob=dob)
      query = "select address\nfrom people\nwhere name = {name} and dob = {dob}"
      return dsl.sql.cursor(query, args)


As a simpler example, consider something like:

   def f() from all_nonlocal:
       x += 1
       y -= 2

That would be translated at compile time into:

   def f():
       nonlocal x, y
       x += 1
       y -= 2
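
To make that concrete, here is a minimal sketch of the all_nonlocal
rewrite using the ast module as it exists today. Since
"def f() from all_nonlocal" isn't valid syntax, the sketch starts from
a plain function definition; the AllNonlocal class name is made up for
illustration, and ast.unparse needs Python 3.9 or later.

```python
import ast


class AllNonlocal(ast.NodeTransformer):
    """Illustrative only: declare every name assigned in a function nonlocal."""

    def visit_FunctionDef(self, node):
        # Collect assigned names in first-seen order. Note that ast.walk also
        # descends into nested functions, which a real hook would treat separately.
        names = []
        for child in ast.walk(node):
            if isinstance(child, ast.Name) and isinstance(child.ctx, ast.Store):
                if child.id not in names:
                    names.append(child.id)
        if names:
            node.body.insert(0, ast.Nonlocal(names=names))
        return ast.fix_missing_locations(node)


source = """
def f():
    x += 1
    y -= 2
"""

tree = AllNonlocal().visit(ast.parse(source))
result = ast.unparse(tree)
print(result)
```

The result is only unparsed, not compiled, because a nonlocal
declaration is rejected unless f is nested inside a function that
actually binds x and y.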

My first pass at a rough protocol for the AST transformers suggests
they would only need two methods:

 get_cookie() - Magic cookie to add to PYC files containing instances
   of the DSL (allows recompilation to be forced if the DSL is updated)
 transform_AST(node) - a DSL() node is passed in, expected to return
   an AST containing no DSL nodes (SyntaxError if one is found)

Attempts to use an unregistered DSL would trigger SyntaxError.

So there you are, that's the crazy idea. The stoning of the heretic
may now commence :)

Where this idea came from was the various discussions about "make
statement" style constructs and a conversation I had with Eric Snow at
PyCon about function definition time really being *too late* to do
anything particularly interesting that couldn't already be handled
better in other ways. Some tricks Dave Malcolm had done to support
Python level manipulation of the AST during compilation also played a
big part, as did Eugene Toder's efforts to add an AST optimisation
step to the compilation process.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


