On Tue, Apr 13, 2021 at 12:55 PM Serhiy Storchaka <storchaka@gmail.com> wrote:

26.04.18 21:37, Serhiy Storchaka wrote:
> In Python 2.5 `0or[]` was accepted by the Python parser. It became an
> error in 2.6 because "0o" came to be recognized as an incomplete octal
> number. `1or[]` is still accepted.
>
> On the other hand, `1if 2else 3` is accepted despite the fact that "2e"
> can be recognized as an incomplete floating point number. In this case
> the tokenizer pushes "e" back and returns "2".
>
> Shouldn't it do the same with "0o"? It is possible to make `0or[]`
> parseable again. The Python implementation of the tokenizer is able to
> tokenize this example:
>
> $ echo '0or[]' | ./python -m tokenize
> 1,0-1,1:            NUMBER         '0'
> 1,1-1,3:            NAME           'or'
> 1,3-1,4:            OP             '['
> 1,4-1,5:            OP             ']'
> 1,5-1,6:            NEWLINE        '\n'
> 2,0-2,0:            ENDMARKER      ''
>
> On the other hand, all these examples look weird. There is an asymmetry:
> `1or 2` is valid syntax, but `1 or2` is not. It is hard to recognize
> visually the boundary between a number and the following identifier or
> keyword, especially since numbers can contain letters ("b", "e", "j",
> "o", "x") and underscores, and identifiers can contain digits. Letters,
> digits, and underscores can appear on both sides of the boundary.
>
> I propose to change the Python syntax by adding a requirement that there
> be whitespace or a delimiter between a numeric literal and the
> following keyword.
>
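
The "push back" behavior described in the quoted message can be sketched with two regular expressions. This is only an illustration, not the actual CPython tokenizer: the patterns `OCTAL` and `DECIMAL_WITH_EXPONENT` and the helper `leading_number` are hypothetical names, and the patterns are simplified assumptions. The idea is that a prefix such as "0o" or "2e" counts as part of the number only if the digits that complete it are actually present.

```python
import re

# Simplified patterns, assumed for illustration (not CPython's real grammar).
OCTAL = re.compile(r"0[oO](?:_?[0-7])+")
DECIMAL_WITH_EXPONENT = re.compile(
    r"[0-9](?:_?[0-9])*(?:[eE][-+]?[0-9](?:_?[0-9])*)?"
)


def leading_number(source):
    """Return the complete numeric literal at the start of source, or None.

    An incomplete prefix such as "0o" or "2e" is not consumed: the match
    falls back to the shorter literal, leaving the letter for the next token.
    """
    for pattern in (OCTAL, DECIMAL_WITH_EXPONENT):
        match = pattern.match(source)
        if match:
            return match.group()
    return None


print(leading_number("2else 3"))   # "e" is left for the keyword "else"
print(leading_number("0or[]"))     # "o" is left for the keyword "or"
print(leading_number("0o17or x"))  # a complete octal literal is kept whole
```

With these patterns both `2else` and `0or` split after the first digit, which is the symmetric treatment the quoted message asks about.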

A new example was found recently (see https://bugs.python.org/issue43833).

>>> [0x1for x in (1,2)]
[31]

It is parsed as [0x1f or x in (1,2)] instead of [0x1 for x in (1,2)].
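
To make the two readings concrete, here is a small sketch. The `HEX_LITERAL` pattern is a simplified assumption (not the tokenizer's actual code) and shows why a greedy match takes the "f" of "for". The two list expressions are written with explicit spaces so that they parse the same way regardless of how a given Python version treats the unspaced form.

```python
import re

# Simplified hex-literal pattern, assumed for illustration only.
HEX_LITERAL = re.compile(r"0[xX](?:_?[0-9a-fA-F])+")

source = "0x1for x in (1,2)"
match = HEX_LITERAL.match(source)
literal = match.group()            # greedy: "f" is a valid hex digit
remainder = source[match.end():]   # what is left looks like "or x in ..."

# The two readings, spelled out with explicit spaces.
# In the first one x is never evaluated, because 0x1f is truthy and
# "or" short-circuits; that is why an undefined x raises no error.
greedy_reading = [0x1f or x in (1, 2)]
intended_reading = [0x1 for x in (1, 2)]

print(literal, "|", remainder)
print(greedy_reading, intended_reading)
```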

Since this code is clearly ambiguous, it makes more sense to emit a
SyntaxWarning if there is no space between a number and an identifier.

I would totally make that a SyntaxError, and backwards compatibility be damned.

--Guido van Rossum (python.org/~guido)

Pronouns: he/him