[Python-Dev] Put token information in one place

Brett Cannon brett at python.org
Wed May 31 13:58:10 EDT 2017


On Wed, 31 May 2017 at 04:01 Serhiy Storchaka <storchaka at gmail.com> wrote:

> Currently, when you add a new token, you need to change several files:
>
> * Include/token.h
> * _PyParser_TokenNames in Parser/tokenizer.c
> * PyToken_OneChar(), PyToken_TwoChars() or PyToken_ThreeChars() in
> Parser/tokenizer.c
> * Lib/token.py (generated from Include/token.h)
> * EXACT_TOKEN_TYPES in Lib/tokenize.py
> * Operator, Bracket or Special in Lib/tokenize.py
> * Doc/library/token.rst
>
> It is possible to generate all this information from a single source.
> The patch proposed in [1] uses Lib/token.py as the initial source. But maybe
> Lib/token.py should itself be generated from a file in a more general format?
>

I don't think it matters really. Whatever is most convenient.


> Some of this information can be derived from Grammar/Grammar, but not all.
> A mapping between token strings ('(' or '>=') and names (LPAR,
> GREATEREQUAL) is also needed. Could it be added to Grammar/Grammar or to a
> new file?
>

Maybe Grammar/Tokens?
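
To make the idea concrete, here is a minimal sketch of how one such file could drive the rest. The file format (a token name, optionally followed by its quoted string) and the helper name parse_tokens are assumptions for illustration only; nothing in the thread fixes them.

```python
import ast

# Hypothetical contents of a single-source token file. The format
# (NAME, optionally followed by a quoted string) is an assumption.
TOKENS_SOURCE = """\
ENDMARKER
NAME
NUMBER
LPAR '('
RPAR ')'
GREATEREQUAL '>='
"""


def parse_tokens(text):
    """Return (names, exact): token names in file order, and a mapping
    from exact token strings to token names."""
    names = []
    exact = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        parts = line.split(None, 1)
        name = parts[0]
        names.append(name)
        if len(parts) == 2:
            # The second field is a quoted Python string literal.
            exact[ast.literal_eval(parts[1])] = name
    return names, exact


names, exact = parse_tokens(TOKENS_SOURCE)

# Numeric values follow file order, as the constants in token.h would.
tok_name = dict(enumerate(names))
values = {name: value for value, name in tok_name.items()}

# The kind of table that tokenize keeps as EXACT_TOKEN_TYPES.
exact_token_types = {string: values[name] for string, name in exact.items()}
```

From the same parsed data one could then write out the C header, the name table, and the documentation, instead of editing each by hand.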


>
> There is a related problem: the tokenize module uses three additional
> tokens that the C tokenizer does not use. It modifies the contents of the
> token module after importing it, which is not good. [2] One solution is to
> make a copy of tok_name in tokenize before modifying it, but this doesn't
> work, because third-party code looks up tokenize constants in
> token.tok_name. Another solution is to add the tokenize-specific constants
> to the token module. Is it good to expose tokens in the token module that
> are not used by the C tokenizer?
>

No opinion from me.
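
For anyone following along, a small sketch of why the copy approach breaks third-party lookups. EXTRA_TOKEN and its value are hypothetical stand-ins for a tokenize-only token, not real constants.

```python
import token

# Hypothetical extra token, standing in for a tokenize-only token.
# Pick a value that is guaranteed not to be in use already.
EXTRA_TOKEN = max(token.tok_name) + 1

# The "copy" approach: tokenize would keep its own table and leave
# the token module untouched.
local_tok_name = dict(token.tok_name)
local_tok_name[EXTRA_TOKEN] = 'EXTRA_TOKEN'

# The private copy knows the new token ...
name_from_copy = local_tok_name[EXTRA_TOKEN]

# ... but code that consults token.tok_name directly does not see it.
name_from_token_module = token.tok_name.get(EXTRA_TOKEN)
```

Here name_from_token_module is None, which is the failure third-party code would hit if it looks the value up in token.tok_name.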


>
> Non-terminal symbols are generated automatically: Lib/symbol.py from
> Include/graminit.h, and Include/graminit.h and Python/graminit.c from
> Grammar/Grammar by Parser/pgen. Is it worth generating Lib/symbol.py with
> pgen too? Could pgen be implemented in Python?
>

I assume there's a build rule for Python/graminit.c, and porting pgen to
Python would require Python to be installed to do a build from scratch. Have
we made that a requirement yet? If so, then rewriting pgen in Python would
make it easier to maintain (although when was the last time anyone touched
that file?).
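
As a rough illustration of the Lib/symbol.py step only (not pgen itself), a sketch that turns #define lines in the style of Include/graminit.h into Python module source. The sample header text and the function names are illustrative assumptions, not the existing build tooling.

```python
import re

# Sample lines in the style of Include/graminit.h (illustrative only).
GRAMINIT_H = """\
#define single_input 256
#define file_input 257
#define eval_input 258
"""

DEFINE_RE = re.compile(r'^#define\s+(\w+)\s+(\d+)\s*$')


def parse_defines(text):
    """Collect '#define NAME VALUE' lines into a name -> int mapping."""
    symbols = {}
    for line in text.splitlines():
        match = DEFINE_RE.match(line)
        if match:
            symbols[match.group(1)] = int(match.group(2))
    return symbols


def render_symbol_module(symbols):
    """Render Python source with one constant per symbol plus a
    reverse table mapping values back to names."""
    lines = [f"{name} = {value}" for name, value in symbols.items()]
    lines.append("")
    lines.append("sym_name = {")
    for name, value in symbols.items():
        lines.append(f"    {value}: {name!r},")
    lines.append("}")
    return "\n".join(lines) + "\n"


symbols = parse_defines(GRAMINIT_H)
module_source = render_symbol_module(symbols)

# Execute the generated source to check that it is valid Python.
namespace = {}
exec(module_source, namespace)
```

If pgen were written in Python, it could emit this module directly from its own data rather than re-parsing the generated header.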

-Brett


>
> See also a similar issue for opcodes. [3]
>
> [1] https://bugs.python.org/issue30455
> [2] https://bugs.python.org/issue25324
> [3] https://bugs.python.org/issue17861
>