[Python-Dev] AST Transformation Hooks for Domain Specific Languages

David Malcolm dmalcolm at redhat.com
Fri Apr 8 18:50:50 CEST 2011


On Fri, 2011-04-08 at 21:29 +1000, Nick Coghlan wrote:
> A few odds and ends from recent discussions finally clicked into
> something potentially interesting earlier this evening. Or possibly
> just something insane. I'm not quite decided on that point as yet (but
> leaning towards the latter).

I too am leaning towards the latter (I'm afraid my first thought was to
check the date on the email); like Michael, I don't think it stands
much of a chance in core.

> Anyway, without further ado, I present:
> 
> AST Transformation Hooks for Domain Specific Languages
> ======================================================

This reminds me a lot of Mython:
  http://mython.org/
If you haven't seen it, it's well worth a look.

My favourite use case for this kind of thing is having the ability to
embed shell pipelines into Python code, by transforming bash-style
syntax into subprocess calls (it's almost possible to do all this in
regular Python by overloading the | and > operators, but not quite).
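To make the operator-overloading idea concrete, here is a minimal sketch in
plain Python. The Cmd class and its attribute names are hypothetical, made up
for this illustration; only subprocess.Popen is a real API.

```python
import subprocess


class Cmd:
    """One or more pipeline stages (hypothetical helper, not a real library)."""

    def __init__(self, *argv):
        # Each stage is one argv list; Cmd() with no arguments is empty.
        self.stages = [list(argv)] if argv else []
        self.stdout_path = None  # set by the > operator

    def __or__(self, other):
        # cmd1 | cmd2: concatenate the stages into a new pipeline.
        combined = Cmd()
        combined.stages = self.stages + other.stages
        return combined

    def __gt__(self, path):
        # pipeline > "file": remember the redirect target.
        redirected = Cmd()
        redirected.stages = list(self.stages)
        redirected.stdout_path = path
        return redirected

    def run(self):
        """Start every stage, wiring each stdout to the next stdin."""
        out_file = open(self.stdout_path, "wb") if self.stdout_path else None
        procs = []
        prev_stdout = None
        try:
            for index, argv in enumerate(self.stages):
                is_last = index == len(self.stages) - 1
                stdout = out_file if (is_last and out_file) else subprocess.PIPE
                proc = subprocess.Popen(argv, stdin=prev_stdout, stdout=stdout)
                if prev_stdout is not None:
                    prev_stdout.close()  # the child holds its own copy
                prev_stdout = proc.stdout
                procs.append(proc)
            output = procs[-1].communicate()[0]  # None when redirected
            for proc in procs[:-1]:
                proc.wait()
            return output
        finally:
            if out_file is not None:
                out_file.close()


# | binds tighter than >, so this groups as (ls | grep) > "out.txt".
pipeline = Cmd("ls", "-l") | Cmd("grep", "py") > "out.txt"
```

One place where this probably falls short: Python chains comparisons, so
`a > "x" > "y"` means `(a > "x") and ("x" > "y")`, and there is no way to
spell bare words or something like `2>` without real syntax support.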

> Consider:
> 
> # In some other module
> ast.register_dsl("dsl.sql", dsl.sql.TransformAST)

Where is this registered?  Do you have to import this "other module"
before importing the module using "dsl.sql"?  It sounds like this is
global state for the interpreter.
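A small sketch of the concern, assuming the registry would be a process-wide
table. Everything here is hypothetical: ast.register_dsl does not exist in
the standard library, and register_dsl, lookup_dsl and the placeholder
TransformAST below are stand-ins invented for this illustration.

```python
# Hypothetical process-wide registry standing in for ast.register_dsl().
_DSL_REGISTRY = {}


def register_dsl(name, transformer):
    """Record a transformer under a DSL name (global, mutable state)."""
    _DSL_REGISTRY[name] = transformer


def lookup_dsl(name):
    """Return the transformer for name, or fail if nobody registered it yet."""
    try:
        return _DSL_REGISTRY[name]
    except KeyError:
        raise LookupError(
            "DSL %r used before it was registered" % name) from None


class TransformAST:
    """Placeholder for dsl.sql.TransformAST from the quoted example."""


# A module using the DSL fails if it is compiled first...
try:
    lookup_dsl("dsl.sql")
except LookupError as exc:
    print(exc)

# ...and only works once the "other module" has run.
register_dsl("dsl.sql", TransformAST)
print(lookup_dsl("dsl.sql") is TransformAST)
```

With a table like this, whether a module compiles depends on what else has
already been imported into the same interpreter.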

> # In a module using that DSL

How is this usage expressed?  Via the following line?

> import dsl.sql

I see the "import dsl.sql" here, but surely you have to somehow process
the "import" in order to handle the rest of the parsing.

This is reminiscent of the "from __future__" special-casing in the
parser.  But from my understanding of CPython's Python/future.c, you
already have an AST at that point (mod_ty, from Python/compile.c).
There seems to be a chicken-and-egg problem with this proposal.
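The ordering problem can be shown from pure Python with the standard ast
module. This is only an analogy for what the C-level compiler does, and the
DSL text in the second example is made up; it is not the syntax from the
proposal.

```python
import ast


def plain_imports(source):
    """Return the names imported by `import x.y` statements in source."""
    tree = ast.parse(source)  # the whole file must already parse as Python
    return [alias.name
            for node in ast.walk(tree)
            if isinstance(node, ast.Import)
            for alias in node.names]


# Ordinary Python: the import is visible once the AST exists.
print(plain_imports("import dsl.sql\nrows = []\n"))

# Made-up DSL text after the import: parsing fails as a whole, so the
# import that was supposed to enable the DSL is never seen.
mixed = "import dsl.sql\nselect name from people\n"
try:
    plain_imports(mixed)
except SyntaxError as exc:
    print("parse failed before the import could be inspected:", exc.msg)
```

So the import can only be discovered from an AST, but the AST cannot be built
until the parser already knows about the DSL.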

Though another syntax might read:

  from __dsl__ import sql

to perhaps emphasize that something magical is about to happen.

[...snip example of usage of a DSL, and the AST it gets parsed to...]

Where and how would the bytes of the file using the DSL get converted to
an in-memory tree representation?

IIRC, manipulating AST nodes in CPython requires some care: the parser
has its own allocator (PyArena), and the entities it allocates have a
shared lifetime that ends when PyArena_Free occurs.

> So there you are, that's the crazy idea. The stoning of the heretic
> may now commence :)

Or, less violently, take it to python-ideas?  (though I'm not subscribed
there, fwiw, make of that what you will)

One "exciting" aspect of this is that if someone changes the DSL file,
the meaning of all of your code changes from under you.  This may or may
not be a sane approach to software development :)

(I also worry what this means e.g. for people writing text editors,
syntax highlighters, etc; insert usual Alan Perlis quote about syntactic
sugar causing cancer of the semicolon)

Also, insert usual comments about the need to think about how
non-CPython implementations of Python would go about implementing such
ideas.

> Where this idea came from was the various discussions about "make
> statement" style constructs and a conversation I had with Eric Snow at
> Pycon about function definition time really being *too late* to do
> anything particularly interesting that couldn't already be handled
> better in other ways. Some tricks Dave Malcolm had done to support
> Python level manipulation of the AST during compilation also played a
> big part, as did Eugene Toder's efforts to add an AST optimisation
> step to the compilation process.

Like I said earlier, have a look at Mython.

Hope this is helpful
Dave


