Metasyntax/Macros

Bengt Richter bokr at oz.net
Tue May 27 18:45:50 EDT 2003


On 27 May 2003 12:00:24 -0700, paddy3118 at netscape.net (Paddy McCarthy) wrote:

>alloydflanagan at attbi.com (A. Lloyd Flanagan) wrote in message news:<a8b7f07a.0305141356.344f0e69 at posting.google.com>...
>> paddy3118 at netscape.net (Paddy McCarthy) wrote in message news:<2ae25c6b.0305140003.dbd812e at posting.google.com>...
>> <excerpt>
>> > and create a Python class. By choosing a careful set of macros the
>> > resultant Python-with-macros source could be understandable to people
>> > with only the domain specific language knowledge,
>> > 
>> </excerpt>
>> 
>> And only those people.  The trouble with macros in general, and C
>> macros in particular, is that they can be used to mangle the language
>> into almost any form the macro-writer wants.  The result is something
>> that does not make sense to any other person on the planet.
>> 
>> IMHO, we already have plenty of languages that are ideal for writing
>> unreadable code.  Let's not turn Python into one of them.
>> 
>> If you want a language with a powerful designed-in macro facility,
>> check out Lisp.
>
>But that *is* the point. I don't want a Lisp or a C or a Rebol macro
>facility.
>I just have not closed my mind to the thought of macros within Python.
>I just haven't thought of a pythonic way of doing it.
>
>Any implementation, I guess, would have to include modification and
>introspection of the parser, so that something like help(__statements__) and
>help(__operators__) would print the docstrings of statements together
>with details on how each statement is parsed, or the associativity etc.
>of operators (including the built-in ones).
>
>As for producing code that no one can read: Yes! Documentation and
>introspection must be a part of a good Python macro system but I think
>it is analogous to using a package. Unless you read up about it, you
>don't know what a package does. Python helps by having things like
>help(). For any proposed macro solution Python would need appropriate
>help and debugging extensions!
>
>Maybe the default behavior when in interactive mode could be for each
>macro instance to have its expansion printed when it is encountered.
>This default could then be optionally turned off.

It just occurred to me that the tokenizer separates out string tokens
before anything is compiled, so if there were a special kind of string
syntax like raw string syntax but with, say, '$' as a prefix instead of
'r', then that string could be treated specially at tokenization time.
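For reference, the standard tokenize module already shows this part today
(the tokenize module is real; everything $-related below is hypothetical):

    from io import StringIO
    import tokenize

    # Real stdlib behavior: a raw string arrives as a single STRING token
    # in the stream, before anything is compiled.
    for tok in tokenize.generate_tokens(StringIO("a = r'xxx'").readline):
        print(tokenize.tok_name[tok.type], repr(tok.string))
    # -> NAME 'a', OP '=', STRING "r'xxx'", NEWLINE, ENDMARKER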

What I have in mind is having the tokenizer recognize a $'xxx' string token,
but not pass it on as a string token in its output stream of tokens.

Instead, the tokenizer would effectively substitute a generated string in place
of the """$'xxx'""" source and proceed to tokenize _that_ instead.

The substitution string would be generated by exec-ing the 'xxx' part in a special
dict, but with a tweaked exec mode that would return the string value of the
last expression evaluated (this is so we don't have to assign to a dummy variable,
or limit ourselves to expressions).

If the last expression was not a string, its repr value would be returned.
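Here's a sketch of that tweaked exec in present-day Python (the ast module
is real, though it didn't exist in 2003; exec_last_expr is a name invented
for this sketch):

    import ast

    def exec_last_expr(code, env):
        """Exec `code` in `env` and return the last expression's value as
        a string (repr() if it isn't already a str)."""
        tree = ast.parse(code, mode="exec")
        if not tree.body or not isinstance(tree.body[-1], ast.Expr):
            raise SyntaxError("$-string must end with an expression")
        last = ast.Expression(tree.body.pop().value)
        exec(compile(tree, "<$-string>", "exec"), env)   # leading statements
        value = eval(compile(last, "<$-string>", "eval"), env)  # tail expr
        return value if isinstance(value, str) else repr(value)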

So you could write, e.g., 

    a = $'b'

and the tokenizer would do the specially tweaked exec on the string 'b' in the
aforementioned special environment (sys.tokenenvs.<filename> ?) [Note 1], such
that the string value of the last expression (in this case just whatever is bound to b)
would be processed as source in the place of the $'b'. If b was not bound to a string,
repr(b) would be substituted. Thus if you wrote

    a = $'[2*3]'

the expression [2*3] would be evaluated to get [6] and the repr would be '[6]'
and that string would be substituted in place of the """$'[2*3]'""" source so
that the tokenizer, having just produced the 'a' and '=' tokens, would now see
'[6]' and produce the three tokens '[', '6', and ']' (with their associated
tuple info).
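
To make that concrete, here's a crude source-level stand-in for what the
tokenizer would do, reusing exec_last_expr from the sketch above (the regex
and the expand_dollar_strings name are inventions for illustration; real
support would live inside the tokenizer itself):

    import re

    # Matches $'...', $"...", and the triple-quoted forms.
    DOLLAR_STRING = re.compile(r"\$('''|\"\"\"|'|\")(.*?)\1", re.DOTALL)

    def expand_dollar_strings(source, env):
        # Each $-string is replaced by the string its body evaluates to;
        # the result would then be re-tokenized as ordinary source.
        return DOLLAR_STRING.sub(
            lambda m: exec_last_expr(m.group(2), env), source)

    print(expand_dollar_strings("a = $'b'", {"b": "c"}))  # -> a = c
    print(expand_dollar_strings("a = $'[2*3]'", {}))      # -> a = [6]

Feeding the resulting 'a = [6]' back through the tokenizer then yields
exactly the '[', '6', and ']' tokens described above.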

Because the $-string is exec-ed, it is possible to write stuff with side effects
that could be picked up later. Or, e.g., $"""file("include.py").read()""" would
stream the bytes of include.py right into the tokenizer in place of the $-string source.
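
With the illustrative expand_dollar_strings above, that include trick works
as-is, modulo spelling open() where Python 2 had the file() built-in
(include.py is just an assumed example file):

    # Hypothetical: splices the text of include.py straight into the
    # source stream, since read() already returns a str.
    expanded = expand_dollar_strings('$"""open("include.py").read()"""', {})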

Or if you wanted a read-time time stamp in your source, you could write
    read_time = $"""import time; `time.ctime()`"""

which would have a result as if the source had been something like

    read_time = 'Tue May 27 15:35:19 2003'
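
(The backquotes are Python 2's repr syntax; a present-day spelling of the
same thing, again via the illustrative expand_dollar_strings:)

    # repr() plays the role the backquotes played: it puts the quotes into
    # the substituted source, so the result is a valid string literal.
    print(expand_dollar_strings(
        'read_time = $"""import time; repr(time.ctime())"""', {}))
    # -> read_time = 'Tue May 27 15:35:19 2003'  (or whenever it runs)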

[Note 1] Having a potentially pre-loaded dict to exec in would permit persistent side
effects that could be referred to in a subsequent $'some_string'. Having the first $'xxx'
string in a file source.py create a persistent dict as sys.tokenenvs.source by default,
preloaded at least with
  {'__filename__': <the path to the source.py currently being tokenized>}
would allow some interesting things.
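
A sketch of what that registry might look like (sys.tokenenvs does not
exist; every name here is invented):

    import sys, types

    # Hypothetical registry of persistent per-source exec environments,
    # each preloaded with __filename__ as suggested in Note 1.
    sys.tokenenvs = types.SimpleNamespace()

    def get_tokenenv(name, filename):
        env = getattr(sys.tokenenvs, name, None)
        if env is None:
            env = {'__filename__': filename}
            setattr(sys.tokenenvs, name, env)
        return env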

Extending the syntax so that $name'xxx' would exec 'xxx' in sys.tokenenvs.name
rather than using the current source file name would allow you to write e.g.,
$project_xxx"""build_id()""" or such.
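
Extending the illustrative regex accordingly (again, all names invented;
get_tokenenv and exec_last_expr are the sketches above):

    import re

    # Optional environment name between the $ and the quote:
    #   $'xxx'             -> exec in the current file's env
    #   $project_xxx'xxx'  -> exec in sys.tokenenvs.project_xxx
    NAMED_DOLLAR = re.compile(
        r"\$([A-Za-z_]\w*)?('''|\"\"\"|'|\")(.*?)\2", re.DOTALL)

    def expand_named(source, default_name, filename):
        def substitute(m):
            env = get_tokenenv(m.group(1) or default_name, filename)
            return exec_last_expr(m.group(3), env)
        return NAMED_DOLLAR.sub(substitute, source)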

Of course this would be something to limit for security purposes, but it seems interesting.
The tokenizer would have to be recursively reentrant though, I guess ...

Just HOTTOMH, so fire away ;-)

Regards,
Bengt Richter



