[New-bugs-announce] [issue33899] Tokenize module does not mirror "end-of-input" is newline behavior
Ammar Askar
report at bugs.python.org
Tue Jun 19 03:41:52 EDT 2018
New submission from Ammar Askar <ammar at ammaraskar.com>:
As was pointed out in https://bugs.python.org/issue33766, there is an edge case in the C tokenizer whereby it implicitly treats the end of input as a newline. The tokenize module in the stdlib does not mirror the C code's behavior in this case.
tokenizer.c:
~/cpython $ echo -n 'x' | ./python
----------
NAME ("x")
NEWLINE
ENDMARKER
tokenize module:
~/cpython $ echo -n 'x' | ./python -m tokenize
1,0-1,1: NAME 'x'
2,0-2,0: ENDMARKER ''
The instrumentation that makes the C tokenizer dump its tokens is my own; I can provide a diff that produces this output if needed.
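
The tokenize module side of this can also be checked from Python without a shell pipeline. The following is a minimal sketch, not part of the original report: the helper name token_names is hypothetical, and whether a NEWLINE appears for input without a trailing newline depends on the Python version in use, so the script reports what it finds rather than assuming a result.

```python
import io
import tokenize


def token_names(source):
    """Return the token type names that the tokenize module yields for source.

    The leading ENCODING token is skipped because it is specific to
    tokenize.tokenize() and has no counterpart in the C tokenizer dump.
    """
    readline = io.BytesIO(source.encode("utf-8")).readline
    return [
        tokenize.tok_name[tok.type]
        for tok in tokenize.tokenize(readline)
        if tok.type != tokenize.ENCODING
    ]


# Input without a trailing newline: the edge case described in this report.
names = token_names("x")
print(names)
print("implicit NEWLINE emitted:", "NEWLINE" in names)

# Input with an explicit trailing newline, for comparison.
print(token_names("x\n"))
```

If the first list lacks NEWLINE while the second contains it, the interpreter being tested shows the mismatch reported here.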
----------
assignee: ammar2
components: Library (Lib)
messages: 319934
nosy: ammar2
priority: normal
severity: normal
status: open
title: Tokenize module does not mirror "end-of-input" is newline behavior
type: behavior
versions: Python 3.8
_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue33899>
_______________________________________