[Python-3000] Brainstorming: Python Metaprogramming
Ian Bicking
ianb at colorstudy.com
Mon Apr 24 17:44:06 CEST 2006
Talin wrote:
> It seems that the history of the Python mailing lists is littered with the
> decayed corpses of various ideas related to "metaprogramming", that is, programs
> that write programs, either at compile time or at run time.
>
> We've seen proposals for C-style macros, Lisp-style macros, programmable syntax
> (guilty!), AST access, a first-class symbol type, and much more.
>
> Given how many times this has been suggested, I really do think that there is
> something there; at the same time, however, I recognize that all of these
> proposals are irreparably flawed in one way or another.
>
> I think that much of the reason for this is that the various proposals haven't
> really been distilled down to their absolute minimum essentials, which
> admittedly is a hard thing to do. Instead, they offer to import a whole,
> pre-existing architecture from some other language, which transforms Python into
> something that it is not.
>
> As an example, take Lisp-style macros. A macro in Lisp is a function that takes
> its arguments in unevaluated form. So when you call "mymacro( (add a b) )", you
> don't get the sum of a and b, you get a list consisting of three elements. The
> macro can then manipulate that list, and then evaluate it after it has been
> manipulated.
>
> The reason this is possible is because in Lisp, there's no difference between
> the AST (so to speak) and regular data.
>
> This fails in Python for two reasons:
>
> 1) For reasons of performance, the compiled code doesn't look very much like
> regular data, and is hard to manipulate.
>
> 2) Most of the things that you might want to do with Lisp macros you can already
> do in Python using some other technique.
>
> Using lambda, generators, operator overloading, and other Python features, we
> can effectively 'quote' a section of code or an algorithm, and manipulate it as
> data. No, you can't assemble arbitrary blocks of code, but most of the time you
> don't want to.
>
> Using overloaded Python operators, we can in fact do something very like the
> Lisp macro - that is, by overriding the '+' operator via __add__, we can have it
> return an AST-like tree of objects, rather than carrying out an actual addition.
> However, this only works if you actually have control over the types being
> added. As we've seen in SQLObject, this limitation leads to some interesting
> syntactical contortions, where you need to ensure that at least one of the two
> objects being added knows about the overloaded operator.
>
> So one question to ask is - what can the Lisp macro system do that is (a)
> useful, and (b) not already doable in Python, and (c) minimal enough that it
> wouldn't cause a major rethink of the language? And the same question can be
> asked for all of the other proposals.
>
> For some reason, I have stuck in my head the idea that this concept of 'quoting'
> is central to the whole business. In Lisp the term 'quote' means to suppress
> evaluation of an item. So (quote a) returns the symbol 'a', not the value
> stored in 'a'. It is the ability to refer to a thing that would normally be
> executed in its pre-executed state. In Python, we can already quote expressions,
> using lambda; we can quote loops, using generators; and so on.
>
> However, one piece that seems to be missing is the ability to quote references
> to global and local variables. In Python, the way to refer to a variable by name
> is to pass its name as a string. The problem with this, however, is that a
> string is a type in its own right, and has a whole different set of methods and
> behaviors than a variable reference.
>
> As a hypothetical example, suppose we defined a unary operator '?' such that:
>
> ?x
>
> was syntactic sugar for:
>
> quoted('x')
>
> or even:
>
> quoted('x', (scope where x is or would be defined) )
>
> Where 'quoted' was some sort of class that behaved like a reference to a
> variable. So ?x.set( 1 ) is the same as x = 1.
Sounds like lambda x: ...
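
To make the quoted('x', scope) idea concrete, here is a minimal sketch of a
variable reference as an ordinary object. The class name Quoted and its
get/set methods are assumptions taken from the proposal's wording, not an
existing API, and the scope is assumed to be a plain mutable mapping such as
globals() or a dict. Function locals cannot be reliably rebound this way,
which is part of why real syntax was being proposed.

```python
class Quoted:
    """Hypothetical stand-in for ?x: a name plus the namespace it lives in."""

    def __init__(self, name, scope):
        self.name = name
        self.scope = scope  # any mutable mapping, e.g. globals() or a dict

    def get(self):
        # Look the name up at call time, not at quoting time.
        return self.scope[self.name]

    def set(self, value):
        # Same effect as running "<name> = value" inside the scope.
        self.scope[self.name] = value


namespace = {}
ref = Quoted("x", namespace)
ref.set(1)  # comparable to executing "x = 1" in namespace
```

A lambda parameter gives a similar handle on a name without any new syntax,
which is the point of the remark above.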
> Moreover, you would want to customize all of the operators on quoted to return
> an AST, so that:
>
> ?x + 1
>
> produces something like:
>
> (add, quoted('x'), 1)
>
> ...or whatever data structure is convenient.
You can match the free variables from the lambda arguments against the
variables in the AST to get this same info.
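
A rough sketch of how the lambda version could work, assuming operator
overloading builds the tree and the lambda's parameter names supply the
quoted variables. Expr, Quoted, Add and quote are hypothetical names made up
for this illustration; only '+' is handled.

```python
import inspect


class Expr:
    """Base class: '+' builds a tree node instead of adding."""

    def __add__(self, other):
        return Add(self, other)

    def __radd__(self, other):
        # Covers "1 + x", where the left operand knows nothing about Expr.
        return Add(other, self)


class Quoted(Expr):
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return f"quoted({self.name!r})"


class Add(Expr):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def __repr__(self):
        return f"(add, {self.left!r}, {self.right!r})"


def quote(func):
    """Call func with one Quoted per parameter name and return the tree."""
    names = inspect.signature(func).parameters
    return func(*(Quoted(name) for name in names))


tree = quote(lambda x: x + 1)
print(tree)  # (add, quoted('x'), 1)
```

This only works because at least one operand of each '+' is an Expr, which is
the same limitation described earlier for overloaded operators.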
> Of course, one issue that immediately comes to mind is, where does the class
> 'quoted' come from? Is it a globally defined class, or is it something that is
> defined for a given scope? What if you need to use two different definitions for
> 'quoted' within a single scope?
>
> For example, I can imagine that in the case of something like SQLObject, the
> 'quoted' class would transform into an SQL variable reference. But what if you
> wanted to use that along with the algebraic solver, which defines 'quoted' as
> something very different?
This isn't actually a problem currently. Right now you use
Table.q.column (or Table.column could have worked, had I been aware of
descriptors at the time). This has a lot more information in it than
just a name. Where it fails is un-overridable operators like "and".
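
The "and" problem can be shown in a few lines. This is a toy sketch, not
SQLObject's actual classes: Column and Condition are hypothetical, and the
generated SQL strings are only for illustration. "==" and "&" can be
overloaded, but "and" just tests the truth of its left operand and returns
one of the two operands, so the left-hand condition is silently dropped.

```python
class Condition:
    def __init__(self, sql):
        self.sql = sql

    def __and__(self, other):
        # "&" is overridable, so both sides end up in the result.
        return Condition(f"({self.sql} AND {other.sql})")


class Column:
    def __init__(self, table, name):
        self.table, self.name = table, name

    def __eq__(self, other):
        # Build a condition instead of comparing the two objects.
        if isinstance(other, Column):
            rhs = f"{other.table}.{other.name}"
        else:
            rhs = repr(other)
        return Condition(f"{self.table}.{self.name} = {rhs}")


person_address_id = Column("person", "address_id")
address_id = Column("address", "id")
address_zip = Column("address", "zip")

joined = (person_address_id == address_id) & (address_zip == "50555")
lost = (person_address_id == address_id) and (address_zip == "50555")

print(joined.sql)  # (person.address_id = address.id AND address.zip = '50555')
print(lost.sql)    # address.zip = '50555'
```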
> One idea, which is kind of strange (and impractical, but bear with me), is
> inspired by C++. When you define a member function in C++, the function's
> formal parameters are considered to be in the same scope as the function
> body. So for example, instead of having to write:
>
> void MyClass::function( MyClass::tListType &l );
>
> ...you can just write:
>
> void MyClass::function( tListType &l );
>
> ...because all of the contents of MyClass are visible within the argument list.
>
> So the strange idea is that when a function is *called*, the calling parameters
> are evaluated within a new, nested scope, that inherits from both the function
> itself, and the calling code's scope. This would allow the called function to
> override the meaning of 'quoted' or other operators (in fact, it would solve the
> SQLObject problem if it allowed built-in operators such as '+' to be redefined.)
The C++ example is just lexical scoping of the identifiers in the
signature. I think what you are suggesting is considerably different,
because it involves changing the scoping of the actual runtime call.
> As an example, the algebraic solver could override the quote operator for its
> parameters:
>
> return solve( ?x + 0 )
> --> ?x
Well... for the SQL case I'm thinking that generators might be a way to
resolve this:
    objs = SQL((person, address) for person, address in [Person, Address]
               if person.address_id == address.id
               and address.zip == '50555')
There's no natural way to order this, unfortunately, because "person"
and "address" aren't available outside the scope of the generator.
Instead it would require something rather lame, like "order_by=lambda p,
a: (a.zip, p.lname)", and considerable work to merge that in with the
generator (so that it could be executed in the database).
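
To give a feel for the "considerable work" involved, here is a sketch that
pulls the filter conditions out of such a generator expression. It takes a
shortcut: it parses the source text with the standard ast module rather than
inspecting a live generator object, so it is a different and simpler
technique than what a real SQL() would need. where_clauses and SOURCE are
hypothetical names, and ast.unparse requires Python 3.9 or later.

```python
import ast

SOURCE = """(
    (person, address) for person, address in [Person, Address]
    if person.address_id == address.id
    and address.zip == '50555'
)"""


def where_clauses(source):
    """Return the 'if' conditions of a generator expression as strings."""
    genexp = ast.parse(source, mode="eval").body
    if not isinstance(genexp, ast.GeneratorExp):
        raise TypeError("expected a generator expression")
    clauses = []
    for comp in genexp.generators:
        for cond in comp.ifs:
            # Split "a and b" into its operands; keep anything else whole.
            if isinstance(cond, ast.BoolOp) and isinstance(cond.op, ast.And):
                parts = cond.values
            else:
                parts = [cond]
            clauses.extend(ast.unparse(part) for part in parts)
    return clauses


clauses = where_clauses(SOURCE)
```

Because the parser sees "and" as a node in the tree, this route does not hit
the un-overridable operator problem, but it still has no place to hang an
ordering.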
For equation solving a generator isn't particularly nice...
    solve(x for x in RealNumbers
          if x ** 2 + 5 * x - 4 == 0)
Well actually that isn't so bad...
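
One way this could be made to run today, as a sketch only: RealNumbers yields
a single symbolic value, the overloaded operators build a polynomial, and
"==" records the equation instead of comparing. Poly, _RealNumbers and this
solve are all hypothetical, and the sketch assumes a quadratic with real
roots.

```python
import math


class Poly:
    """Polynomial in one symbol, stored as coefficients [c0, c1, c2, ...]."""

    def __init__(self, coeffs, equations):
        self.coeffs = list(coeffs)
        self.equations = equations  # list shared by every Poly from one symbol

    def _lift(self, value):
        return value if isinstance(value, Poly) else Poly([value], self.equations)

    def __add__(self, other):
        other = self._lift(other)
        width = max(len(self.coeffs), len(other.coeffs))
        a = self.coeffs + [0] * (width - len(self.coeffs))
        b = other.coeffs + [0] * (width - len(other.coeffs))
        return Poly([p + q for p, q in zip(a, b)], self.equations)

    __radd__ = __add__

    def __mul__(self, other):
        other = self._lift(other)
        out = [0] * (len(self.coeffs) + len(other.coeffs) - 1)
        for i, p in enumerate(self.coeffs):
            for j, q in enumerate(other.coeffs):
                out[i + j] += p * q
        return Poly(out, self.equations)

    __rmul__ = __mul__

    def __sub__(self, other):
        return self + self._lift(other) * -1

    def __pow__(self, exponent):
        result = Poly([1], self.equations)
        for _ in range(exponent):
            result = result * self
        return result

    def __eq__(self, other):
        # Record "self - other == 0" rather than comparing; returning True
        # lets the generator's filter pass the symbol through.
        self.equations.append(self - other)
        return True


class _RealNumbers:
    def __iter__(self):
        yield Poly([0, 1], [])  # the symbol x, with a fresh equation log


RealNumbers = _RealNumbers()


def solve(generator):
    x = next(generator)  # runs the "if" clause once against the symbol
    c, b, a = x.equations[0].coeffs  # this sketch only handles quadratics
    root = math.sqrt(b * b - 4 * a * c)  # ValueError if there are no real roots
    return sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])


roots = solve(x for x in RealNumbers
              if x ** 2 + 5 * x - 4 == 0)
```

Note that it relies on "==" being overridable; a condition joined with "and"
would run into the same problem as the SQL case.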
--
Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org