* Not using an arena allocator for the nodes can introduce more challenges than simplifications. The first is that deleting a deep tree is currently just a matter of freeing the arena block, while if the nodes were PyObjects it would involve recursive destruction. That could potentially segfault, so we would need some custom trashcan mechanism or special deleters (see the sketch after this list). All of this will certainly not simplify the code (at least the parser code) and will impact performance (although only in the parser/compiler phase).

* We would (potentially) need to reimplement the AST sequences as proper owning containers. That would involve changing a considerable amount of code and would cause some slowdown due to having to go through C-API calls.
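
As a rough illustration of the recursive-destruction concern, here is a pure-Python sketch; the `Node` class is just a hypothetical stand-in for PyObject-based AST nodes:

```python
class Node:
    """Stand-in for a hypothetical PyObject-based AST node."""
    def __init__(self, child=None):
        self.child = child

# Build a chain far deeper than the C stack can comfortably unwind.
root = None
for _ in range(200_000):
    root = Node(root)

# Dropping the last reference tears down the whole chain. CPython only
# survives this because the "trashcan" mechanism flattens the otherwise
# recursive deallocation; PyObject-based AST nodes would need the same
# kind of machinery for very deep trees, whereas an arena is freed in
# one shot regardless of depth.
del root
```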

There is probably another way. We already have code to convert between the C-level AST (the one that's arena-allocated) and the Python-level AST (the one that the `ast` module provides). Mark doesn't seem to mind if processing macros slows down parsing (since .pyc file caching still works). So we could convert the C-level AST to a Python-level AST, hand that to the macro processor, which returns another Python-level AST, and then convert that back to a C-level AST that we graft into the parse tree for the source being parsed.
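
From the Python side, that round trip already roughly exists today; a minimal sketch, where `expand_macros` is a hypothetical stand-in for the macro processor:

```python
import ast

def expand_macros(tree: ast.Module) -> ast.Module:
    """Hypothetical macro processor: receives a Python-level AST and
    returns a (possibly rewritten) Python-level AST."""
    return tree  # a real processor would transform the tree here

source = "x = 1 + 2\nprint(x)\n"

# Parse to a Python-level AST; internally the parser builds the
# arena-allocated C AST and converts it to ast.* objects.
tree = ast.parse(source)

# Hand the Python-level AST to the macro processor.
tree = expand_macros(tree)

# compile() accepts an ast.Module directly and re-creates the C-level
# AST under the hood before generating bytecode.
ast.fix_missing_locations(tree)
exec(compile(tree, "<macro-demo>", "exec"))
```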

Given that macros can return more macros, I fear this process may be prohibitively slow: not only do you need to create the Python structures each time, you also need to delete them afterwards, and that will always be slower. How slow remains to be seen, of course, but the fact that the chain can be unbounded is enough to worry me a little.
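
To make that concern concrete, here is a sketch (with a hypothetical `expand_once` pass) of why an expansion chain multiplies the conversion cost:

```python
import ast

def expand_once(tree: ast.Module) -> tuple[ast.Module, bool]:
    """Hypothetical single expansion pass: returns the rewritten
    Python-level tree and whether anything was expanded (the rewritten
    code may itself contain further macro calls)."""
    return tree, False

def expand_all(tree: ast.Module) -> ast.Module:
    # Every iteration produces a fresh Python-level tree and discards
    # the previous one, so the create/convert/destroy cost is paid once
    # per link of an expansion chain whose length is not bounded.
    changed = True
    while changed:
        tree, changed = expand_once(tree)
    return tree
```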

Even if we argue that this is fine because it is a compile-time phase, there are a considerable number of cases where compilation happens at runtime, so although it is not as critical as something like the eval loop, it may still become a concern.
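
Runtime compilation is not exotic: the standard library itself generates and compiles code while the program is running (e.g. `collections.namedtuple` and `dataclasses` build methods via `exec`), and any such path would pay the extra compile-phase cost again. A few everyday examples:

```python
# All of these trigger a full parse/compile at runtime, so any extra
# cost added to the compile phase (such as macro expansion plus the
# C-level/Python-level AST round trips) is paid here as well.
exec("generated = 2 ** 10")

config = eval("{'retries': 3, 'timeout': 5.0}")

code = compile("print('compiled while the program is running')", "<generated>", "exec")
exec(code)
```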