Where can I find a lexical spec of python?
csf178 at 163.com
Wed Sep 21 12:33:03 EDT 2011
I've read the document http://docs.python.org/py3k/reference/lexical_analysis.html
but I'm worried it might be missing some language features, like the "tab magic".
Currently I have a highlighter here ->http://shaofei.name/python/PyHighlighter.html
(Also the lexer http://shaofei.name/python/PyLexer.html)
As you can see, I just made its behavior match CPython, but I'm not sure what the real Python lexical grammar looks like.
Does anyone know if there is a lexical grammar spec like the ones other languages have (e.g. http://bclary.com/2004/11/07/#annex-a)?
Please help me. Thanks a lot.
At 2011-09-21 19:41:33, "Thomas Jollans" <t at jollybox.de> wrote:
>On 21/09/11 11:44, 程劭非 wrote:
>> Hi, everyone,
>> I've found several tokens used in Python's
>> grammar (http://docs.python.org/reference/grammar.html), but I can't find
>> their definitions anywhere. The tokens are listed here:
>They should be documented in
>http://docs.python.org/py3k/reference/lexical_analysis.html - though
>apparently not using these exact terms.
>End of file.
>Documented as "identifier" in 2.3.
>Documented in 2.1.8.
>Documented in 2.4.3 - 2.4.6
>Documented in 2.4.2
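One way to see how the grammar's token names relate to actual source text is to run a small snippet through the standard library's tokenize module and print the type name of each token. This is only a sketch: the sample source string and the helper name token_names are made up for illustration, and tokenize is the documented Python-level interface, not necessarily identical in every detail to Parser/tokenizer.c.

```python
import io
import tokenize

# Hypothetical sample input: one indented block containing a name,
# a string literal and a number.
source = "if x:\n    y = 'a' + 1\n"


def token_names(text):
    """Return the token type names that tokenize yields for text."""
    readline = io.StringIO(text).readline
    return [tokenize.tok_name[tok.type]
            for tok in tokenize.generate_tokens(readline)]


names = token_names(source)
print(names)
```

The printed list includes NAME, STRING, NUMBER, INDENT and DEDENT entries and ends with ENDMARKER, which makes it easier to match each grammar token to the corresponding section of the lexical analysis chapter.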
>> I've got some information from the source
>> code (http://svn.python.org/projects/python/trunk/Parser/tokenizer.c), but
>> I'm not sure which features are specific to this implementation. (I
>> saw that the tab stop can be modified with comments using "tab-width:",
>> ":tabstop=", ":ts=" or "set tabsize="; is this feature really in the spec?)
>That sounds like a legacy feature that is no longer used. Somebody
>familiar with the early history of Python might be able to shed more
>light on the situation. It is inconsistent with the spec (section 2.1.8):
>Indentation is rejected as inconsistent if a source file mixes tabs and
>spaces in a way that makes the meaning dependent on the worth of a tab
>in spaces; a TabError is raised in that case.
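The rule quoted from section 2.1.8 can be checked in Python 3 by compiling a snippet whose indentation only lines up if a tab is assumed to be worth eight spaces. This is a minimal sketch: the helper name indentation_error_name and both sample strings are invented for illustration.

```python
def indentation_error_name(source):
    """Compile source and return the name of the syntax error raised, if any."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as exc:  # TabError is a subclass of SyntaxError
        return type(exc).__name__
    return None


# The second line is indented with eight spaces, the third with one tab.
# Whether they belong to the same block depends on the width of a tab.
mixed = "if True:\n        x = 1\n\ty = 2\n"

# The same block indented with spaces only.
consistent = "if True:\n    x = 1\n    y = 2\n"

print(indentation_error_name(mixed))
print(indentation_error_name(consistent))
```

The first call reports TabError and the second reports no error, which is consistent with the spec text above and suggests that the tab-size comments are not part of the language as specified.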