[Python-3000] Proposal: Metasyntax operator

Thu Jul 20 10:58:22 CEST 2006

A number of dynamic languages, such as Lisp, support the notion of an 
'unevaluated' expression. In Lisp, a macro is simply a function that can 
operate on the *syntax* of the expression before it is actually compiled 
and interpreted.

A number of Python libraries attempt to use operator overloading to 
achieve similar results. A good example is SQLObject's 'sqlbuilder' 
module, which allows the user to construct SQL statements using regular 
Python operators.

Typically these libraries work by creating a set of standard constants 
(e.g. function names and symbols) and overloading all of the 
operators to produce an expression tree rather than actually evaluating 
the result. Thus, 'a + b', instead of producing the sum of a and b, will 
instead produce something along the lines of '('+', a, b)'.
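
As a rough illustration of the technique (a minimal sketch, not taken from 
sqlbuilder or any other library; the names Node, Sym and Op are made up 
here), overloaded operators can return tree nodes instead of values:

```python
class Node:
    """Base class: every operator builds a tree node instead of a value."""

    def __add__(self, other):
        return Op('+', self, other)

    def __mul__(self, other):
        return Op('*', self, other)


class Sym(Node):
    """A predefined symbol; the name is supplied by hand."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return self.name


class Op(Node):
    """An operator applied to two operands."""

    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right

    def __repr__(self):
        return "(%r, %r, %r)" % (self.op, self.left, self.right)


a, b = Sym('a'), Sym('b')
tree = a + b            # builds a node, does not add anything
nested = (a + b) * a    # nodes compose because Op is also a Node
print(tree)             # ('+', a, b)
print(nested)           # ('*', ('+', a, b), a)
```

Note that the names 'a' and 'b' have to be spelled twice, once as the 
Python variable and once as the string passed to Sym.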

One weak area, however, is in the treatment of variables. This is 
because you can't recover the name of a variable at runtime - the 
variable name only exists in the compiler's imagination, and is 
generally long gone by the time the program is actually executed.

There have been a number of attempts to get around this limitation. One 
approach is to define a limited set of "standard" variables (X, Y, Z, 
etc.). Unfortunately, this is quite limiting - having to pre-declare 
variables is hardly 'Pythonic'.

The other is to wrap the variables in a class (such as "Variable('x')"), 
which is cumbersome. Various tricks with __getattr__ can also be used, 
to allow spellings such as Variable.x and so on.
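
Both workarounds can be sketched roughly as follows. This is only one way 
to do it, written in current Python 3 syntax; the metaclass name _VarMeta 
is hypothetical, and real libraries may implement the trick differently:

```python
class _VarMeta(type):
    # __getattr__ on the metaclass is consulted only when normal attribute
    # lookup on the class fails, so Variable.x falls through to here.
    def __getattr__(cls, name):
        if name.startswith('_'):
            # Leave private and special names alone.
            raise AttributeError(name)
        return cls(name)


class Variable(metaclass=_VarMeta):
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return "Variable(%r)" % self.name


explicit = Variable('x')   # the cumbersome spelling
shorthand = Variable.x     # the __getattr__ trick
print(explicit, shorthand)
```

Either way, the variable name reaches the program as a string that the 
user supplied, not as something recovered from the source code.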

All of these are examples of "quoted" variables - which is another way 
of saying that the variables are unevaluated, and that their syntactical 
rather than semantic attributes are available to the program.

I'd like to propose a standard way to represent one of these syntactical 
variables. The syntax I would like to see is '?x' - i.e. a question mark 
followed by the variable name.

The reason for choosing the question mark is that this is exactly how 
many languages - including expert systems and inference engines - 
represent a substitution variable. The other reason, which was only of 
minor consideration, is that the '?' is one of the few symbols in Python 
whose meaning is not already assigned.

The actual meaning of ? would be very simple:

    '?x' is equivalent to '__quote__("x")'

Where __quote__ is a user-defined symbol accessible from the current scope.
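
Since '?x' is not valid Python today, the rewritten form can only be shown 
by hand. The sketch below assumes that __quote__ is looked up like any 
other name, so a local definition shadows a module-level one; the class 
name Quoted is made up for illustration:

```python
class Quoted:
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return "?" + self.name


__quote__ = Quoted   # module-level definition


def f():
    # '?x' would be rewritten to the call below.
    return __quote__("x")


def g():
    __quote__ = str.upper   # a local definition shadows the module-level one
    return __quote__("x")


print(f())   # uses Quoted
print(g())   # uses the local callable
```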

There will also be a standard, importable implementation of __quote__ 
which overloads all operators to create a simple AST. It is likely that 
most code that does syntactic manipulation (such as sqlbuilder) could be 
modified to use the standard AST created by the built-in quote class. 
The reason is that the AST itself has no inherent meaning; it is only 
meaningful to the function that you pass it to. So the SQL 'select()' 
function could walk through the AST, transforming it into SQL syntax.
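
To make the idea of walking the AST concrete, here is a minimal sketch. 
The classes Quote and BinOp and the function to_sql are hypothetical 
stand-ins for the standard quote class and for whatever a function like 
'select()' would do internally; they are not part of any existing library:

```python
class Node:
    # Only a few operators are overloaded here; a full version
    # would cover all of them.
    def __eq__(self, other):
        return BinOp('=', self, other)

    def __gt__(self, other):
        return BinOp('>', self, other)

    def __and__(self, other):
        return BinOp('AND', self, other)


class Quote(Node):
    """What '?name' would produce: a node holding the variable name."""

    def __init__(self, name):
        self.name = name


class BinOp(Node):
    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right


def to_sql(node):
    """Recursively turn the generic tree into SQL text."""
    if isinstance(node, BinOp):
        return "(%s %s %s)" % (to_sql(node.left), node.op, to_sql(node.right))
    if isinstance(node, Quote):
        return node.name
    if isinstance(node, str):
        # Naive quoting, for illustration only; real code should use
        # query parameters instead of building literals.
        return "'%s'" % node.replace("'", "''")
    return str(node)


# Stands in for: (?name == 'Fred') & (?age > 30)
expr = (Quote('name') == 'Fred') & (Quote('age') > 30)
print(to_sql(expr))   # ((name = 'Fred') AND (age > 30))
```

The tree itself knows nothing about SQL; a different walker could turn the 
same nodes into a math formula or anything else.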

The quote operator has one other effect, which is that unlike regular 
variables it can overload assignment:

    ?x = 3

Normally this would be an error, since ?x isn't an L-value. However, the 
interpreter notes the presence of the ? and allows the __assign__ method 
to be called instead. A possible use for this would be to create 
syntactical descriptions of keyword arguments that can be inspected at 
runtime:

    @overload(int,int,?x=int)

Now, one potential objection is what to do about conflicting definitions 
of __quote__. My response is that it is up to the user to make sure that 
__quote__ is defined to the correct class at the correct places in the 
code, even if this means re-assigning it from time to time:

    __quote__ = SQLVar
    expr = Select( User.name, Where( ?name = 'Fred' ) )
    __quote__ = MathVar
    formula = ?a + ?b

(I should note that I don't expect this proposal to get very far. 
However, it's something I've been thinking about for a long time.)

-- Talin