On Tue, Apr 21, 2020 at 9:35 PM Gregory P. Smith <greg@krypto.org> wrote:
Could we go ahead and mark lib2to3 as Pending Deprecation in 3.9 so we can get it out of the stdlib by 3.11 or 3.12?
I'm going ahead and tracking the idea in https://bugs.python.org/issue40360.
lib2to3 is the basis of all sorts of general source code manipulation tooling. Its name and original raison d'être have moved on. It is actively used to parse and rewrite Python 3 code all the time. yapf uses it, black uses a fork of it. Other Python code manipulation tooling uses it. Modernize-like fixers are useful for all sorts of cleanups.
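As a rough illustration of the parse-and-rewrite use described above, here is a minimal sketch of a lossless round trip through lib2to3. The helper name `roundtrip` is hypothetical, the choice of `pygram.python_grammar_no_print_statement` is an assumption about which grammar a Python 3 tool would pick, and the import is guarded because lib2to3 may be missing from newer interpreters.

```python
import warnings


def roundtrip(source):
    """Parse `source` with lib2to3 and render the tree back to text.

    Hypothetical helper for illustration only. Returns None when lib2to3
    cannot be imported (it may be absent from newer Python releases).
    """
    try:
        with warnings.catch_warnings():
            # Some releases emit a DeprecationWarning when lib2to3 is imported.
            warnings.simplefilter("ignore", DeprecationWarning)
            from lib2to3 import pygram, pytree
            from lib2to3.pgen2 import driver
    except ImportError:
        return None

    # Assumption: use the grammar variant without the Python 2 print statement.
    d = driver.Driver(
        pygram.python_grammar_no_print_statement,
        convert=pytree.convert,
    )
    tree = d.parse_string(source)
    # The concrete syntax tree keeps whitespace and comments, so str() gives
    # the source text back.
    return str(tree)


if __name__ == "__main__":
    print(roundtrip("x = 1  # keep this comment\n"))
```

Because the tree keeps whitespace and comments, `str(tree)` reproduces the input text, which is what makes this kind of tree practical for rewriting tools.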
IMNSHO it would be better if lib2to3 were *not* in the stdlib anymore - Black already chose to fork lib2to3 <https://github.com/psf/black/tree/master/blib2to3>. So given that it is eventually not going to be able to parse future syntax, the better answer seems like deprecation, putting the final version up on PyPI and letting any descendants of it live on PyPI where they can get more active care than a stdlib module ever does.
-gps
On Tue, Apr 21, 2020 at 6:58 PM Guido van Rossum <guido@python.org> wrote:
Great! Please submit a PR to update the [lib]2to3 docs and CC me (@gvanrossum).
While perhaps it wouldn't hurt if the PEP mentioned lib2to3, it was just accepted by the Steering Council without such language, and I wouldn't want to imply that the SC agrees with everything I said. So I still think we ought to deal with lib2to3 independently (and no, it won't need its own PEP :-). A reasonable option would be to just deprecate it and recommend people use parso, LibCST or something else (I wouldn't recommend pegen in its current form yet).
On Tue, Apr 21, 2020 at 6:21 PM Carl Meyer <carl@oddbird.net> wrote:
On Sat, Apr 18, 2020 at 10:38 PM Guido van Rossum <guido@python.org> wrote:
Note that, while there is indeed a docs page about 2to3, the only docs for lib2to3 in the standard library reference are a link to the source code and a single "Note: The lib2to3 API should be considered unstable and may change drastically in the future."
Fortunately, in order to support the 2to3 application, lib2to3 doesn't need to change, because the syntax of Python 2 is no longer changing. :-) Choosing to remove 2to3 is an independent decision. And lib2to3 does not depend in any way on the old parser module. (It doesn't even use the standard tokenize module, but incorporates its own version that is slightly tweaked to support Python 2.)
Indeed! Thanks for clarifying, I now recall that I already knew it doesn't, but forgot.
The docs page for 2to3 does currently say "lib2to3 could also be adapted to custom applications in which Python code needs to be edited automatically." Perhaps at least this sentence should be removed, and maybe also replaced with a clearer note that lib2to3 not only has an unstable API, but also should not necessarily be expected to continue to parse future Python versions, and thus building tools on top of it should be discouraged rather than recommended. (Maybe even use the word "deprecated.") Happy to submit a PR for this if you agree it's warranted.
It still seems to me that it wouldn't hurt for PEP 617 itself to also mention this shift in lib2to3's effective status (from "available but no API stability guarantee" to "probably will not parse future Python versions") as one of its indirect effects.
You've mentioned a few different tools that already use different technologies: LibCST depends on parso, which has a fork of pgen2, and lib2to3 has the original pgen2. I wonder if this would be an opportunity to move such parsing support out of the standard library completely. There are already two versions of pegen, but neither is in the standard library: there is the original pegen repo which is where things started, and there is a fork of that code in the CPython Tools directory (not yet in the upstream repo, but see PR 19503).
The pegen tool has two generators, one generating C code and one generating Python code. I think that the C generator is really only relevant for CPython itself: it relies on the builtin tokenizer (the one written in C, not the stdlib tokenize.py) and the generated C code depends on many internal APIs. In fact the C generator in the original pegen repo doesn't work with Python 3.9 because those internal APIs are no longer exported. (It also doesn't work with Python 3.7 or older because it makes critical use of the walrus operator. :-) Also, once we started getting serious about replacing the old parser, we worked exclusively on the C generator in the CPython Tools directory, so the version in the original pegen repo is lagging quite a bit behind (as is the Python grammar in that repo). But as I said you're not gonna need it.
On the other hand, the Python generator is designed to be flexible, and while it defaults to using the stdlib tokenize.py tokenizer, you can easily hook up your own. Putting this version in the stdlib would be a mistake, because the code is pretty immature; it is really waiting for a good home, and if parso or LibCST were to decide to incorporate a fork of it and develop it into a high quality parser generator for Python-like languages that would be great. I wouldn't worry much about the duplication of code -- the Python generator in the CPython Tools directory is only used for one purpose, and that is to produce the meta-parser (the parser for grammars) from the meta-grammar. And I would happily stop developing the original pegen once a fork is being developed.
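For readers unfamiliar with the stdlib tokenizer mentioned above, here is a minimal sketch of the token stream that tokenize.py produces, plus a filtered variant of the kind one might substitute when hooking up a different tokenizer. The function names `token_stream` and `significant_tokens` are hypothetical, and pegen's actual hook interface is not shown here; only the stdlib `tokenize` calls are real.

```python
import io
import tokenize


def token_stream(source):
    """Yield (token type name, token text) pairs from the stdlib tokenizer."""
    readline = io.StringIO(source).readline
    for tok in tokenize.generate_tokens(readline):
        yield tokenize.tok_name[tok.type], tok.string


def significant_tokens(source):
    """Hypothetical custom stream: drop comments and non-logical newlines."""
    skip = {"COMMENT", "NL"}
    return [(name, text) for name, text in token_stream(source)
            if name not in skip]


if __name__ == "__main__":
    for name, text in significant_tokens("x = 1  # a comment\n"):
        print(name, repr(text))
```

A parser that only cares about syntax can consume the filtered stream, while a tool that must preserve formatting would keep the comment and NL tokens.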
Thanks, this is all very clarifying! I hadn't even found the original gvanrossum/pegen repo, and was just looking at the CPython PR for PEP 617. Clearly I haven't been following this work closely.
Another option would be to just improve the Python generator in the original pegen repo to satisfy the needs of parso and LibCST. Reading the blurb for parso it looks like it really just parses *Python*, which is less ambitious than pegen. But it also seems to support error recovery, which currently isn't part of pegen. (However, we've thought about it.) Anyway, regardless of how exactly this is structured someone will probably have to take over development and support. Pegen started out as a hobby project to educate myself about PEG parsers. Then I wrote a bunch of blog posts about my approach, and finally I started working on using it to generate a replacement for the old pgen-based parser. But I never found the time to make it an appealing parser generator tool for other languages, even though that was on my mind as a future possibility. It will take some time to disentangle all this, and I'd be happy to help someone who wants to work on this.
This seems like the place to start. When we start work on Python 3.10 support for LibCST, we can start with trying to use and adapt pegen in place of the vendored fork of parso we currently use, and if that's promising enough, consider taking over maintenance of it.
Carl
--Guido van Rossum (python.org/~guido)