[Python-3000] Brainstorming: Python Metaprogramming

Mon Apr 24 17:44:06 CEST 2006

Talin wrote:
> It seems that the history of the Python mailing lists is littered with the
> decayed corpses of various ideas related to "metaprogramming", that is, programs
> that write programs, either at compile time or at run time.
> 
> We've seen proposals for C-style macros, Lisp-style macros, programmable syntax
> (guilty!), AST access, a first-class symbol type, and much more.
> 
> Given how many times this has been suggested, I really do think that there is
> something there; at the same time, however, I recognize that all of these
> proposals are irreparably flawed in one way or another.
> 
> I think that much of the reason for this is that the various proposals haven't
> really been distilled down to their absolute minimum essentials, which
> admittedly is a hard thing to do. Instead, they offer to import a whole,
> pre-existing architecture from some other language, which transforms Python into
> something that it is not.
> 
> As an example, take Lisp-style macros. A macro in Lisp is a function that takes
> its arguments in unevaluated form. So when you call "mymacro( (add a b) )", you
> don't get the sum of a and b, you get a list consisting of three elements. The
> macro can then manipulate that list, and then evaluate it after it has been
> manipulated.
> 
> The reason this is possible is because in Lisp, there's no difference between
> the AST (so to speak) and regular data.
> 
> This fails in Python for two reasons:
> 
> 1) For reasons of performance, the compiled code doesn't look very much like
> regular data, and is hard to manipulate.
> 
> 2) Most of the things that you might want to do with Lisp macros you can already
> do in Python using some other technique.
> 
> Using lambda, generators, operator overloading, and other Python features, we
> can effectively 'quote' a section of code or an algorithm, and manipulate it as
> data. No, you can't assemble arbitrary blocks of code, but most of the time you
> don't want to.
> 
> Using overloaded Python operators, we can in fact do something very like the
> Lisp macro - that is, by overriding the '+' operator (__add__), we can have it
> return an AST-like tree of objects, rather than carrying out an actual addition.
> However, this only works if you actually have control over the types being
> added. As we've seen in SQLObject, this limitation leads to some interesting
> syntactical contortions, where you need to ensure that at least one of the two
> objects being added knows about the overloaded operator.
> 
> So one question to ask is - what can the Lisp macro system do that is (a)
> useful, and (b) not already doable in Python, and (c) minimal enough that it
> wouldn't cause a major rethink of the language? And the same question can be
> asked for all of the other proposals.
> 
> For some reason, I have stuck in my head the idea that this concept of 'quoting'
> is central to the whole business. In Lisp the term 'quote' means to suppress
> evaluation of an item. So (quote a) returns the symbol 'a', not the value
> stored in 'a'. It is the ability to refer to a thing that would normally be
> executed in its pre-executed state. In Python, we can already quote expressions,
> using lambda; we can quote loops, using generators; and so on.
> 
> However, one piece that seems to be missing is the ability to quote references
> to global and local variables. In Python, the way to refer to a variable by name
> is to pass its name as a string. The problem with this, however, is that a
> string is a type in its own right, and has a whole different set of methods and
> behaviors than a variable reference.
> 
> As a hypothetical example, suppose we defined a unary operator '?' such that:
> 
>     ?x
> 
> was syntactic sugar for:
> 
>     quoted('x')
> 
> or even:
> 
>     quoted('x', (scope where x is or would be defined) )
> 
> Where 'quoted' was some sort of class that behaved like a reference to a
> variable. So ?x.set( 1 ) is the same as x = 1.

Sounds like lambda x: ...

> Moreover, you would want to customize all of the operators on quoted to return
> an AST, so that:
> 
>     ?x + 1
> 
> produces something like:
> 
>     (add, quoted('x'), 1)
> 
> ...or whatever data structure is convenient.
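
For concreteness, here is a rough sketch of what such a 'quoted' class might
look like in plain Python, without the '?' syntax. Everything in it is an
assumption made for illustration: the scope is passed in as an ordinary dict,
the tree is a bare tuple as in the example above, and evaluate() is a
hypothetical helper that is not part of the proposal.

```python
import operator


class quoted:
    """Hypothetical reference to a variable, identified by name and scope."""

    def __init__(self, name, scope):
        self.name = name
        self.scope = scope  # assumption: the scope is a plain dict

    def set(self, value):
        # Equivalent to assigning to the named variable in that scope.
        self.scope[self.name] = value

    def get(self):
        return self.scope[self.name]

    def __add__(self, other):
        # Build a tree instead of performing the addition.
        return (operator.add, self, other)

    def __radd__(self, other):
        return (operator.add, other, self)

    def __repr__(self):
        return "quoted(%r)" % self.name


def evaluate(tree):
    """Walk a tree built above and compute its value (illustrative helper)."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return op(evaluate(left), evaluate(right))
    if isinstance(tree, quoted):
        return tree.get()
    return tree


scope = {}
x = quoted('x', scope)
x.set(1)         # same effect as x = 1 inside `scope`
tree = x + 1     # (operator.add, quoted('x'), 1)
```

Because the tree is a bare tuple, it does not itself overload '+', so
something like (x + 1) + 2 would not keep building the tree. A fuller version
would probably return a node class that carries the same operator methods.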

You can match the free variables from the lambda arguments against the 
variables in the AST to get this same info.
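
A rough sketch of that matching, assuming the lambda is available as source
text and using the standard `ast` module found in current Python versions
(it postdates this thread). split_names() is a made-up helper name.

```python
import ast


def split_names(source):
    """Split the names used in a lambda body into parameters and others.

    `source` is the text of a single lambda expression.
    """
    tree = ast.parse(source, mode="eval")
    lam = tree.body
    if not isinstance(lam, ast.Lambda):
        raise ValueError("expected a lambda expression")
    params = [a.arg for a in lam.args.args]
    # Every plain name that appears anywhere in the lambda body.
    names = {n.id for n in ast.walk(lam.body) if isinstance(n, ast.Name)}
    used_params = sorted(names & set(params))   # the "quoted" variables
    outer_names = sorted(names - set(params))   # resolved in enclosing scopes
    return params, used_params, outer_names


params, used, outer = split_names("lambda x: x + y + 1")
```

Here `used` lists the names that match lambda arguments (just x), and
`outer` lists the names that would be looked up normally (just y).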

> Of course, one issue that immediately comes to mind is, where does the class
> 'quoted' come from? Is it a globally defined class, or is it something that is
> defined for a given scope? What if you need to use two different definitions for
> 'quoted' within a single scope?
> 
> For example, I can imagine that in the case of something like SQLObject, the
> 'quoted' class would transform into an SQL variable reference. But what if you
> wanted to use that along with the algebraic solver, which defines 'quoted' as
> something very different?

This isn't actually a problem currently.  Right now you use 
Table.q.column (or Table.column could have worked, had I been aware of 
descriptors at the time).  This has a lot more information in it than 
just a name.  Where it fails is un-overridable operators like "and".
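
To illustrate both halves of that, here is a minimal sketch of the
Table.column idea using the descriptor protocol. Column, Expr and the SQL
text they build are made up for this example; this is not SQLObject's actual
API, and __set_name__ needs a recent Python 3.

```python
class Expr:
    """A fragment of a hypothetical SQL expression."""

    def __init__(self, sql):
        self.sql = sql

    def __eq__(self, other):
        return Expr("(%s = %s)" % (self.sql, _render(other)))

    def __and__(self, other):
        # '&' can be overloaded; the 'and' keyword cannot.
        return Expr("(%s AND %s)" % (self.sql, _render(other)))


def _render(value):
    return value.sql if isinstance(value, Expr) else repr(value)


class Column:
    """Descriptor: accessing it on the class yields a column reference."""

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner):
        if instance is None:
            return Expr("%s.%s" % (owner.__name__.lower(), self.name))
        raise AttributeError(self.name)  # row values are out of scope here


class Person:
    address_id = Column()


class Address:
    id = Column()
    zip = Column()


# Works: both operators are overloaded, so the whole clause is captured.
clause = (Person.address_id == Address.id) & (Address.zip == '50555')

# Fails quietly: 'and' only tests truthiness, so the left side is dropped.
lost = (Person.address_id == Address.id) and (Address.zip == '50555')
```

The column reference knows its table as well as its name, which is the extra
information mentioned above. The second statement shows the "and" problem:
the result is just the right-hand expression, with no error raised.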

> One idea, which is kind of strange (and impractical, but bear with me), is
> inspired by C++. When you define a member function in C++, the function's
> formal parameters are considered to be in the same scope as the function
> body. So for example, instead of having to write:
> 
>     void MyClass::function( MyClass::tListType &l );
> 
> ...you can just write:
> 
>     void MyClass::function( tListType &l );
> 
> ...because all of the contents of MyClass are visible within the argument list.
> 
> So the strange idea is that when a function is *called*, the calling parameters
> are evaluated within a new, nested scope, that inherits from both the function
> itself, and the calling code's scope. This would allow the called function to
> override the meaning of 'quoted' or other operators (in fact, it would solve the
> SQLObject problem if it allowed built-in operators such as '+' to be redefined.)

The C++ example is just lexical scoping of the identifiers in the 
signature.  I think what you are suggesting is considerably different, 
because it involves changing the scoping of the actual runtime call.

> As an example, the algebraic solver could override the quote operator for its
> parameters:
> 
>    return solve( ?x + 0 )
>    --> ?x

Well... for the SQL case I'm thinking that generators might be a way to 
resolve this:

objs = SQL((person, address) for person, address in [Person, Address]
             if person.address_id == address.id
                and address.zip == '50555')

There's no natural way to order this, unfortunately, because "person" 
and "address" aren't available outside the scope of the generator. 
Instead it would require something rather lame, like "order_by=lambda p, 
a: (a.zip, p.lname)", and considerable work to merge that in with the 
generator (so that it could be executed in the database).
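
As a hint of what that work involves: a function like SQL() would have to
inspect the generator instead of running it. The sketch below only shows the
first step, reading names off the generator's code object in CPython;
describe() is a hypothetical helper, and actually turning the condition into
SQL would mean decompiling the bytecode or getting at the source, which is
not attempted here.

```python
class Person:
    pass


class Address:
    pass


def describe(gen):
    """Report what a generator expression mentions, without running it."""
    code = gen.gi_code
    # Names starting with '.' are compiler-internal (the hidden iterator).
    loop_vars = [v for v in code.co_varnames if not v.startswith('.')]
    # Attribute (and global) names referenced inside the generator body.
    referenced = sorted(code.co_names)
    return loop_vars, referenced


gen = ((person, address) for person, address in [Person, Address]
       if person.address_id == address.id
       and address.zip == '50555')

loop_vars, referenced = describe(gen)
```

This recovers the loop variables and the attribute names used in the
condition, but not which attribute belongs to which variable or how they are
compared, which is where the considerable work comes in.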

For equation solving a generator isn't particularly nice...

     solve(x for x in RealNumbers
           if x ** 2 + 5 * x - 4 == 0)

Well actually that isn't so bad...

-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org