
The PEP is exciting and is very clearly presented, thank you all for the hard work!

Considering the comments in the PEP about the new parser not preserving a parse tree or CST, I have some questions about the future options for Python language-services tooling which requires a CST in order to round-trip and modify Python code. Examples in this space include auto-formatters, refactoring tools, linters with autofix, etc. Today many such tools (e.g. Black, 2to3) are based on lib2to3. Other tools already have their own parser (e.g. LibCST -- which I help maintain -- and Jedi both use parso, a fork of pgen2).

1) 2to3 and lib2to3 are not mentioned in the PEP, but are a documented part of the standard library used by some very popular tools, and currently depend on pgen2. A quick search of the PEP 617 pull request does not suggest that it modifies lib2to3. Will lib2to3 also be removed in Python 3.10 along with the old parser? It might be good for the PEP to address the future of 2to3 and lib2to3 explicitly.

2) As these tools make the necessary adaptations to support Python 3.10, which may no longer be parsable with an LL(1) parser, will we be able to leverage any part of pegen to construct a lossless Python CST, or will we likely need to fork pegen outside of CPython or build a wholly new parser? It would be neat if an alternate grammar could be written in pegen that has access to all tokens (including NL and COMMENT) for this purpose; that would save a lot of code duplication and potential for inconsistency. I haven't had a chance to fully read through the PEP 617 pull request, but it looks like its tokenizer wrapper currently discards NL and COMMENT.

I understand this is a distinct use case with distinct needs and I'm not suggesting that we should make significant sacrifices in the performance or maintainability of pegen to serve it, but if it's possible to enable some sharing by making API choices now before it's merged, that seems worth considering.

Carl
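
P.S. To make the NL/COMMENT point concrete, here is a minimal sketch of why a lossless CST needs those tokens. It uses only the standard library `tokenize` and `ast` modules, not pegen or its tokenizer wrapper, so it illustrates the general problem rather than how pegen behaves internally. The `SOURCE` string and the `token_names` helper are made up for the illustration.

```python
import ast
import io
import tokenize

# Hypothetical sample input: one trailing comment and one blank line.
SOURCE = "x = 1  # keep me\n\ny = 2\n"


def token_names(source):
    """Return the token type names the stdlib tokenizer emits for `source`."""
    readline = io.StringIO(source).readline
    return [tokenize.tok_name[tok.type] for tok in tokenize.generate_tokens(readline)]


names = token_names(SOURCE)

# The token stream still carries the comment and the blank line...
print("COMMENT" in names, "NL" in names)

# ...but the AST built from the same source has no trace of the comment,
# so nothing derived from the AST alone can reproduce the original text.
tree = ast.parse(SOURCE)
print("keep me" in ast.dump(tree))
```

A formatter or refactoring tool has to work from something like the first representation, which is why a grammar with access to every token would matter for this use case.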