I'm excited about the potential introduction of Lisp-style syntactic macros into Python. It's unclear to me whether PEP 638 ("Syntactic Macros") is still being actively developed, since the last activity I see on it was over a year ago (Sep & Oct 2020), but I thought I'd leave some initial comments anyway. Quoting from the PEP text...

(1)
Lexical analysis
~~~~~~~~~~~~~~~~
Any sequence of identifier characters followed by an exclamation point (exclamation mark, UK English) will be tokenized as a ``MACRO_NAME``.
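As a rough illustration of the quoted rule, here is a minimal regex sketch in Python. It is not how the PEP or CPython's tokenizer would implement it. The name `find_macro_names` is hypothetical, identifiers are assumed to be ASCII-only, and the exclusion of `!=` is my own assumption, since the quoted text does not say how `x!=y` should be handled. A plain regex also knows nothing about string literals or comments.

```python
import re

# Assumption: identifier characters are ASCII letters, digits and underscore.
# Assumption: "!" directly followed by "=" is the "!=" operator, not a macro marker.
MACRO_NAME = re.compile(r"\b[A-Za-z_][A-Za-z0-9_]*!(?!=)")


def find_macro_names(source):
    """Return every substring of `source` that looks like a MACRO_NAME token."""
    return MACRO_NAME.findall(source)


print(find_macro_names("y = cast!(int, x)"))
print(find_macro_names("if a!=b: pass"))
```

The negative lookahead is the interesting design question here: without some rule like it, existing code that writes `a!=b` without spaces would tokenize differently.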
+1 to using the ! character to mark macros explicitly and loudly, since they are very powerful and can completely customize the syntax used inside them. I like the postfix syntax as proposed, so `macro_name!` is cool. I wouldn't be a fan of prefix syntax; `!macro_name` looks ugly.

(2)
Statement form
~~~~~~~~~~~~~~
macro_stmt = MACRO_NAME testlist [ "import" NAME ] [ "as" NAME ] [ ":" NEWLINE suite ]
(2.1) What is `testlist`? It is not defined in this PEP, nor is it defined in the Python Grammar Specification [1]. Also, the `[ "import" NAME ]` and `[ "as" NAME ]` parts appear to be special-purpose syntax only useful for supporting the `from!` and `with!` macros mentioned later in the PEP. It feels odd that this syntax isn't more general-purpose. Perhaps the grammar could be made more general with something like:
macro_stmt = MACRO_NAME macro_expr_parameters ( NEWLINE | ":" NEWLINE suite )
macro_expr_parameters = ( expression | <keyword> )*
That would allow a statement macro to take some number of expressions and keywords as arguments, in addition to a suite.

(2.2) +0 to the idea of prefixing a @ to a MACRO_NAME that is intended to be used as a sibling macro. That would look like:
@do_nothing_marker!
def foo(...):
    ...
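To make the sibling-macro idea concrete, here is a minimal sketch that emulates it with today's `ast` module. Everything in it is an assumption for illustration: the PEP does not define this API, `@name!` is not valid syntax today, so a plain decorator named `do_nothing_marker` stands in for the macro invocation, and `ExpandSiblingMacros` stands in for the compiler pass that would call the macro processor.

```python
import ast


def do_nothing_marker(node):
    """Hypothetical sibling-macro processor: receives the AST of the
    following statement and returns it unchanged."""
    return node


class ExpandSiblingMacros(ast.NodeTransformer):
    """Stand-in for the compiler: treats the decorator `@do_nothing_marker`
    as if it were the sibling macro `@do_nothing_marker!`."""

    def visit_FunctionDef(self, node):
        self.generic_visit(node)
        markers = [d for d in node.decorator_list
                   if isinstance(d, ast.Name) and d.id == "do_nothing_marker"]
        # Remove the marker from the decorator list: it is consumed at compile time.
        node.decorator_list = [d for d in node.decorator_list if d not in markers]
        for _ in markers:
            node = do_nothing_marker(node)
        return node


source = "@do_nothing_marker\ndef foo():\n    return 1\n"
tree = ExpandSiblingMacros().visit(ast.parse(source))
ast.fix_missing_locations(tree)

namespace = {}
exec(compile(tree, "<sibling-macro-demo>", "exec"), namespace)
print(namespace["foo"]())
```

Note that `do_nothing_marker` is never looked up at run time: the expansion step removes it before the code is compiled, which is the main difference from an ordinary decorator.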
(3)
Expression form
~~~~~~~~~~~~~~~
macro_expr = MACRO_NAME "(" testlist ")"
Again, what is `testlist`? Perhaps you're looking for something more like:
macro_expr = MACRO_NAME "(" macro_parameters ")"
macro_parameters = ( expression ( "," expression )* ","? )?
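One way to prototype the suggested `macro_parameters` rule is to borrow Python's existing call syntax through the `ast` module. This is only a sketch under that assumption: the helper name `parse_macro_parameters` is hypothetical, and wrapping the text in a dummy call `_(...)` is a shortcut, not something the PEP proposes.

```python
import ast


def parse_macro_parameters(text):
    """Parse `text` as zero or more comma-separated expressions, with an
    optional trailing comma, and return the list of expression AST nodes."""
    call = ast.parse(f"_({text})", mode="eval").body
    # The whole input must be the argument list of the dummy call `_`.
    if not (isinstance(call, ast.Call)
            and isinstance(call.func, ast.Name)
            and call.func.id == "_"):
        raise SyntaxError("unbalanced parentheses in macro parameters")
    # The suggested rule allows only plain expressions: no keywords, no *args.
    if call.keywords or any(isinstance(a, ast.Starred) for a in call.args):
        raise SyntaxError("only positional expressions are allowed")
    return call.args


params = parse_macro_parameters("int, x + 1,")
print([ast.dump(p) for p in params])
```

The empty string and a trailing comma are both accepted, matching the optional parts of the suggested rule.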
(4)
Resolving ambiguity
~~~~~~~~~~~~~~~~~~~
The statement form of a macro takes precedence, so that the code ``macro_name!(x)`` will be parsed as a macro statement, not as an expression statement containing a macro expression.
It seems to me that if you were to define an expression macro, like `cast!(T, expression)`, users would expect to be able to call it on a line by itself. The rule stated here suggests that the parser would get confused by such a call, incorrectly treating it as a macro *statement* rather than a macro *expression*.

(5)
Compilation
~~~~~~~~~~~
For macros with multiple names, [...]
Nit: "Multiple names"? I think you meant macros with "additional names", as described later in the PEP.

(6)
Defining macro processors ~~~~~~~~~~~~~~~~~~~~~~~~~
A macro processor is defined by a four-tuple, consisting of ``(func, kind, version, additional_names)``:
* [...]
* ``additional_names`` are the names of the additional parts of the macro, and must be a tuple of strings.
Seems to me that "additional_names" would make more sense if it were called "additional clause names" or just "clause names". That would be consistent with the following "anatomy of a statement-macro" sketch:

try_!:        # begin try_ statement; is try_ clause header
    ...       # is try_ clause body
finally_!:    # is finally_ clause header
    ...       # is finally_ clause body; end of try_ statement

(7)
Hygiene and debugging ~~~~~~~~~~~~~~~~~~~~~
[...] No rules for naming will be enforced, but to ensure hygiene and help debugging, the following naming scheme is recommended: [...]
This appears to imply that it is up to the macro processor author to take special care that their macro is hygienic. That is, *unhygienic* macros are the default. Wouldn't it be safer to define a system where *hygienic* macros are the default instead? Seems like if we can head off the introduction of Command Injection by default at the design level then we should do so.

(8)
Examples
''''''''
Not a single example in this section defines an actual macro processor function. Recommend implementing at least a toy macro end-to-end (including the macro processor function) for full clarity, for each type of macro (i.e. statement, sibling, and expression).

(9)
Backwards Compatibility
=======================
This PEP is fully backwards compatible.
Nit: In the §"Implementation" section below, it is mentioned that nodes in the `_ast` module would be made immutable. That sounds like a backward-incompatible change to me.

(10)
Currently, all AST nodes are allocated using an arena allocator. Changing to use the standard allocator might slow compilation down a little, but has advantages in terms of maintenance, as much code can be deleted.
I presume the arena allocator was introduced in the first place for a reason. Perhaps to improve performance? By removing the arena allocator are there potential downsides other than a performance regression?

(11)
Reference Implementation
''''''''''''''''''''''''
None as yet.
Seems like you could get a prototype off the ground by implementing an initial version as a fake Python source file text encoding. Then you could put something like `# coding=macros` at the top of a source file to have it preprocessed by a prototype macro system.

(12)

Thanks for taking the time to read my comments.

--
David Foster | Seattle, WA, USA
Contributor to mypy, TypedDict, and Python's type system

[1]: https://docs.python.org/3/reference/grammar.html