[issue1679] tokenizer permits invalid hex integer
Malte Helmert
report at bugs.python.org
Sat Jan 19 17:56:06 CET 2008
Malte Helmert added the comment:
I can find three places where "0x" is accepted, but probably shouldn't be:
1. Python's tokenizer:
>>> 0x
0
>>> 0xL
ValueError: invalid literal for long() with base 16: '0xL'
=> I think these should both be syntax errors.
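A minimal sketch for checking how a given interpreter's tokenizer treats
these literals. It relies only on the built-in compile(); the helper name
is_syntax_error is made up for illustration. With the behaviour proposed
above, both "0x" and "0xL" would be reported as syntax errors.

```python
def is_syntax_error(source):
    """Return True if compiling `source` as an expression is a SyntaxError.

    Hypothetical helper, for illustration only.
    """
    try:
        compile(source, "<literal-check>", "eval")
    except SyntaxError:
        return True
    return False


# Literals discussed in this report, plus one well-formed hex literal.
for literal in ("0x", "0xL", "0x1f"):
    print(repr(literal), "syntax error:", is_syntax_error(literal))
```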
2. int builtin:
>>> int("0x", 0) == int("0x", 16) == 0
True
>>> long("0x", 0)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: invalid literal for long() with base 16: '0x'
>>> long("0x", 16)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: invalid literal for long()
=> The long behaviour looks right to me, and I think the int behaviour
should match it.
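To illustrate the behaviour suggested above, here is a rough sketch of a
wrapper that rejects a bare "0x" prefix before delegating to int(). The
name parse_int_strict is hypothetical, and this is not meant to show how
the builtin itself is implemented.

```python
def parse_int_strict(text, base=0):
    """Parse an integer, rejecting a hex prefix with no digits after it.

    Hypothetical helper: it only adds the one check discussed here and
    leaves everything else to int().
    """
    body = text.strip().lower().lstrip("+-")
    if base in (0, 16) and body == "0x":
        raise ValueError(
            "invalid literal with base %d: %r" % (base, text)
        )
    return int(text, base)


print(parse_int_strict("0x10", 16))
try:
    parse_int_strict("0x", 16)
except ValueError as exc:
    print("rejected:", exc)
```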
3. tokenize module:
This currently accepts "0x" and "0xL" as single tokens. The obvious fix
would report each of them as two separate tokens ("0": NUMBER, "x": NAME;
"0": NUMBER, "xL": NAME), as the module already does in other cases where
a name follows a number, like "23cats". However, this is not quite what
Python's parser does: it returns an error token instead. (Fortunately, a
name after a number appears to be a syntax error everywhere, so the
difference doesn't really affect the behaviour; a syntax error occurs
either way.)
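The token split described above can be sketched with a small regex-based
scanner. The patterns and the scan() function below are assumptions made
for illustration; they are not copied from the tokenize module. The only
point is that once the hex pattern requires at least one digit, "0x" no
longer matches as a single NUMBER and falls apart into "0" and "x".

```python
import re

# Hypothetical patterns, loosely modelled on a regex-based tokenizer.
HEX_STRICT = r"0[xX][0-9a-fA-F]+[lL]?"  # at least one hex digit required
DECIMAL = r"[0-9]+[lL]?"
NAME = r"[A-Za-z_][A-Za-z_0-9]*"

TOKEN = re.compile(
    r"(?P<NUMBER>%s|%s)|(?P<NAME>%s)" % (HEX_STRICT, DECIMAL, NAME)
)


def scan(text):
    """Return (kind, string) pairs for a run of numbers and names."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError("unexpected character at offset %d" % pos)
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


for sample in ("0x", "0xL", "23cats", "0x1F"):
    print(sample, "->", scan(sample))
```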
----------
nosy: +maltehelmert
__________________________________
Tracker <report at bugs.python.org>
<http://bugs.python.org/issue1679>
__________________________________