[Python-ideas] Reporting unmatched parentheses in SyntaxError messages?

Nick Coghlan ncoghlan at gmail.com
Thu Jul 9 13:58:58 CEST 2015


On 9 July 2015 at 08:03, Terry Reedy <tjreedy at udel.edu> wrote:
> On 7/8/2015 1:53 AM, Nick Coghlan wrote:
>>
>> One of the more opaque error messages new Python users can encounter
>> is a syntax error due to unmatched parentheses:
>>
>>      File "/home/me/myfile.py", line 11
>>          data = func()
>>          ^
>>      SyntaxError: invalid syntax
>
>
>> I'm not sure it would be feasible though - we generate syntax errors
>> from a range of locations where we don't have access to the original
>> token data any more :(
>
>
> Could that be changed?

I think we're already down to only four places where they can be
raised (tokeniser, parser, symbol table analysis, byte code
generator), so reducing it further seems unlikely.
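
As a rough illustration of those four places, the sketch below compiles
one small invalid program per stage and records the message each one
produces. The mapping of snippet to stage is an assumption and may not
hold for every CPython version; SNIPPETS and collect_syntax_errors are
names made up for this example.

```python
# Assumption: the stage named for each snippet is a best guess and can
# differ between CPython versions. All four do raise SyntaxError.
SNIPPETS = {
    "tokeniser": "x = 'unterminated\n",
    "parser": "x = = 1\n",
    "symbol table analysis": "nonlocal x\n",
    "byte code generator": "break\n",
}


def collect_syntax_errors(snippets):
    """Compile each snippet and record the SyntaxError message it raises."""
    messages = {}
    for stage, source in snippets.items():
        try:
            compile(source, "<example>", "exec")
        except SyntaxError as exc:
            messages[stage] = exc.msg
    return messages


if __name__ == "__main__":
    for stage, msg in collect_syntax_errors(SNIPPETS).items():
        print(f"{stage}: SyntaxError: {msg}")
```

The exact wording of each message depends on the interpreter version, so
the sketch prints whatever the running interpreter reports.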

> I have occasionally thought about developing a table for Python (and
> rewriting in Python), but indents and dedents are not trivial.  (Even
> tokenizer.py does not handle \t indents correctly.)  Maybe I should think a
> bit harder.  Idle has an option to syntax-check a module without running it.
> If compile messages are not improved, it would certainly be sensible to run
> a separate fence-checker at least when check-only is requested, for better
> error messages.  These could potentially include 'missing :' when a header
> 'opened' by for/while/if/elif/else/class/def/with is not closed by ':'.

That sounds like a plausible direction. As it turned out, the
particular case that prompted this thread wasn't due to missing
parentheses at all; it was a block of code like:

    try:
        ....
    statement dedented early

    except ...:
        ...
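
A hedged way to reproduce that case without running anything is to hand
a stand-in for the snippet to compile() and inspect the exception. The
source text below is an assumed reconstruction (the elided parts are
replaced with "pass"), and first_syntax_error is a hypothetical helper
name.

```python
import textwrap

# Assumed stand-in for the snippet above. func() is a placeholder and is
# never called, because compile() only parses the source.
SOURCE = textwrap.dedent("""\
    try:
        pass
    data = func()
    except Exception:
        pass
""")


def first_syntax_error(source):
    """Return the SyntaxError raised when compiling source, or None."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as exc:
        return exc
    return None


if __name__ == "__main__":
    err = first_syntax_error(SOURCE)
    print(f"line {err.lineno}: {err.msg}")
```

The reported line is the early-dedented statement rather than the "try:"
that is actually missing its handler.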

I think Stephen Turnbull may also be on to something: we don't
necessarily need to tell the user what fenced token was unmatched from
earlier, it may be enough to tell them what *would* have been
acceptable as the next token where the caret is pointing so they have
something more specific to consider than "invalid syntax". For
example, in the case I was attempting to help debug remotely, the
error message might have been:

    File "/home/me/myfile.py", line 11
        data = func()
        ^
    SyntaxError: expected "except" or "finally"

Other fence errors would then be:

    SyntaxError: expected ":"
    SyntaxError: expected ")"
    SyntaxError: expected "]"
    SyntaxError: expected "}"
    SyntaxError: expected "import"  # from ... import ...
    SyntaxError: expected "else"  # ... if ... else ...
    SyntaxError: expected "in"  # for ... in ...

And once 'async' is a proper keyword:

    SyntaxError: expected "def", "with" or "for" # async ...

The currently problematic cases are those in
https://docs.python.org/3/reference/grammar.html where seeing "foo" at
one point in the token stream sets up the expectation in the parser
that "bar" must appear a bit further along. At the moment, the parser
bails out saying "I wasn't expecting this!", and doesn't answer the
obvious follow-on question "Well, what *were* you expecting?".
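
To make the "what were you expecting?" idea concrete for the bracket
cases, here is a minimal stack-based sketch that reports the closer it
was waiting for. It is not how the CPython parser works; expected_closer
and CLOSERS are hypothetical names, and the scan deliberately ignores
string literals and comments.

```python
CLOSERS = {"(": ")", "[": "]", "{": "}"}


def expected_closer(source):
    """Return (line, column, message) for the first fence error, or None.

    Simplification: string literals and comments are not skipped, so a
    bracket inside either is counted like any other bracket.
    """
    stack = []  # expected closing characters, innermost last
    line, col = 1, 0
    for ch in source:
        if ch in CLOSERS:
            stack.append(CLOSERS[ch])
        elif ch in CLOSERS.values():
            if not stack:
                return line, col, f'unexpected "{ch}"'
            if ch != stack[-1]:
                return line, col, f'expected "{stack[-1]}"'
            stack.pop()
        if ch == "\n":
            line, col = line + 1, 0
        else:
            col += 1
    if stack:
        # Input ended while a bracket was still open.
        return line, col, f'expected "{stack[-1]}"'
    return None
```

For example, expected_closer("x = [1, 2)") reports that "]" was expected
at the position of the stray ")". A real checker would work on tokens
rather than characters so that brackets inside strings are ignored.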

Strings would also qualify for a similar kind of treatment, as the
current error message doesn't tell us whether the parser was looking
for closing single or double quotes:

    $ python3 -c "'"
     File "<string>", line 1
       '
       ^
    SyntaxError: EOL while scanning string literal
    $ python3 -c "'''"
     File "<string>", line 1
       '''
         ^
    SyntaxError: EOF while scanning triple-quoted string literal
    $ python3 -c '"'
     File "<string>", line 1
       "
       ^
    SyntaxError: EOL while scanning string literal
    $ python3 -c '"""'
     File "<string>", line 1
       """
         ^
    SyntaxError: EOF while scanning triple-quoted string literal
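
As a sketch of what naming the delimiter could look like, the function
below scans source text and returns the opening quote sequence of the
first string that is never closed. It is an illustration only, not the
tokeniser's actual logic; unterminated_quote is a hypothetical name, and
comments get no special handling.

```python
def unterminated_quote(source):
    """Return the opening delimiter of an unterminated string, or None.

    Simplification: comments are not recognised, so a quote character
    inside a comment is treated as opening a string.
    """
    i, n = 0, len(source)
    while i < n:
        ch = source[i]
        if ch in "'\"":
            # A run of three quotes opens a triple-quoted string.
            delim = ch * 3 if source.startswith(ch * 3, i) else ch
            i += len(delim)
            while i < n:
                if source[i] == "\\":
                    i += 2  # skip the escaped character
                elif source.startswith(delim, i):
                    break  # found the matching closer
                elif len(delim) == 1 and source[i] == "\n":
                    return delim  # single-quoted strings end at EOL
                else:
                    i += 1
            else:
                return delim  # ran out of input before a closer
            i += len(delim)
        else:
            i += 1
    return None
```

With that result in hand, a message along the lines of "expected
closing ''' " could be built for each of the four cases shown above.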

This discussion has headed into a part of the compiler chain that I
don't actually know myself, though - the only thing I've ever had to
do with the parser is modifying the grammar file and adding the brute
force error message override when someone leaves out the parentheses
on print() and exec() calls.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

