> Actually, I think the `ast` module doesn't work very well for formatters, because it loses comments. (Retaining comments and all details of whitespace is the specific use case for which I created pgen2.)
Some uses I have seen include checking that the code before and after formatting has no functional changes (both versions have the same AST), or augmenting the information obtained from other sources. But yeah, I agree that static code analyzers and linters are a much bigger target.
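To make the "same AST" check concrete, here is a minimal sketch of how I imagine such a sanity check could look. The helper name `same_ast` and the sample strings are made up for illustration; I am not claiming any particular formatter does it exactly this way. It relies on `ast.dump()` leaving out line and column attributes by default, so differences in whitespace, quoting and comments disappear from the comparison.

```python
import ast


def same_ast(before: str, after: str) -> bool:
    """Return True if both sources parse to structurally identical ASTs.

    Hypothetical helper: ast.dump() omits position attributes by default,
    so only the tree structure and the constant values are compared.
    """
    return ast.dump(ast.parse(before)) == ast.dump(ast.parse(after))


# Layout, quoting style and the comment differ, but the tree is the same.
original = "x = {  'a':1 }  # a comment\n"
formatted = 'x = {"a": 1}\n'
# Here a value changed, so the trees differ.
changed = 'x = {"a": 2}\n'

print(same_ast(original, formatted))
print(same_ast(original, changed))
```

Note that the comment in `original` plays no part in the comparison, which is the same property that makes the AST unsuitable as the formatter's own working representation.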
> I wonder if lib2to3 is actually something that would benefit from moving out of the stdlib. (Wasn't it on Amber's list?) As Łukasz points out in that issue, it is outdated. Maybe if it was out of the stdlib it would attract more contributors. Then again, I have recently started exploring the idea of a PEG parser for Python. Maybe a revamped version of the core of lib2to3 based on PEG instead of pgen would be interesting to some folks.
I was thinking more along the lines of leveraging some parts of lib2to3 to provide a CST-related solution similar to the ast module, without exposing the whole functionality of lib2to3. Basically, it would be a higher-level abstraction to replace the current parser module. Technically you should be able to reconstruct some of the primitives that lib2to3 uses on top of the output that the parser module generates (modulo some extra information from the grammar), but the raw output of the parser module is not very useful by itself, especially when you consider the maintenance costs.
On the other hand, as you mention here:
> I am interested in switching CPython's parsing strategy to something else (what exactly remains to be seen) and any new approach is unlikely to reuse the current CST technology. (OTOH I think it would be wise to keep the current AST.)
it is true that changing the parser can greatly influence the hypothetical CST module, so it may complicate the conversion to a new parser if the API is not abstract enough (or it may be close to impractical, depending on the new parser).
My original suggestion was based on the fact that the parser module is not very useful and has a high maintenance cost, but the "realm" of what it solves (providing access to parse trees) could be useful for some use cases. That is why I was talking about "parser" and lib2to3 in the same email.
Perhaps we can be more productive if we focus on just deprecating the "parser" module, but I thought it was an opportunity to solve two (related) problems at once.