
One of the more opaque error messages new Python users can encounter is a syntax error due to unmatched parentheses:

      File "/home/me/myfile.py", line 11
        data = func()
           ^
    SyntaxError: invalid syntax

While I have no idea how we could implement it, I'm wondering if that might be clearer if the error message instead looked more like this:

      File "/home/me/myfile.py", line 11
        data = func()
           ^
    SyntaxError: invalid syntax (Unmatched '(' on line 10)

Or, similarly,

    SyntaxError: invalid syntax (Unmatched '[' on line 10)
    SyntaxError: invalid syntax (Unmatched '{' on line 10)

I'm not sure it would be feasible though - we generate syntax errors from a range of locations where we don't have access to the original token data any more :(

Regards,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
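For readers who want to reproduce the situation described above, here is a minimal sketch. The two-line source string, the file name "myfile.py" and the helper name `syntax_error_for` are made up for illustration. The exact message and the reported line depend on the Python version: the releases current when this thread was written say "invalid syntax" and point at the second line, while recent CPython releases word the error differently, so no expected output is shown.

```python
# Hypothetical two-line module: line 1 never closes its '('.
source = "values = (1, 2\ndata = func()\n"


def syntax_error_for(text):
    """Compile text and return the SyntaxError raised, or None if it compiles."""
    try:
        compile(text, "myfile.py", "exec")
    except SyntaxError as err:
        return err
    return None


err = syntax_error_for(source)
# The message text and the reported line number vary between Python versions.
print(err.msg, "-- reported at line", err.lineno)
```

The `lineno` attribute is the line the compiler blames, which for this input need not be the line holding the unclosed parenthesis.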

* Nick Coghlan <ncoghlan@gmail.com> [2015-07-08 15:53:44 +1000]:
I can't comment on the implementation - but I can confirm this comes up a lot in the #python IRC channel. People usually paste that line and ask what is wrong with it, and the answer usually is "look on the previous line".

I can imagine having this error message instead would help people understand what's going on, without having to *know* they should look at the previous line when there's a SyntaxError which "makes no sense".

Florian

--
http://www.the-compiler.org | me@the-compiler.org (Mail/XMPP)
GPG: 916E B0C8 FD55 A072 | http://the-compiler.org/pubkey.asc
I love long mails! | http://email.is-not-s.ms/

On Jul 7, 2015, at 22:53, Nick Coghlan <ncoghlan@gmail.com> wrote:
Do we know that there's an unbalanced paren (or bracket or brace) even when we don't know where it is? I think this would still be sufficient:

      File "/home/me/myfile.py", line 11
        data = func()
           ^
    SyntaxError: invalid syntax (Unmatched '(', possibly on a previous line)

Really, just telling people to look at a previous line for unmatched pairs is sufficient to solve the problem every time it comes up on #python, StackOverflow, etc.

(Of course this would take away a great opportunity to explain to novices why they might want to use a better editor than Notepad, something which can show them mismatched parens automatically.)

On 08/07/2015 08:30, Andrew Barnert via Python-ideas wrote:
Add something here https://docs.python.org/3/tutorial/errors.html#syntax-errors taking into account both of the above paragraphs? Put it in the FAQs; if it's there already, I missed it :)

--
My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language.

Mark Lawrence

Nick Coghlan writes:
Agreed. Could be worse, though, it could be Lisp!
I think I would prefer "Expected ')'". I think that typos like

    a = ((1, "one"),
         (2, "two)",
         (3, "three"))
    data = func()

are likely to be fairly common (I make them often enough!), but I don't see how you're going to get the parser to identify the line containing "couple #2" as the source of the error (without a *really* dubious heuristic).
Is the problem that we don't know which line the unmatched parenthesis was on, or that we don't even know that the syntax error is an unmatched parenthesis?

On Jul 8, 2015 10:02 AM, "Stephen J. Turnbull" <stephen@xemacs.org> wrote:
True, but we can definitely say it occurs on or after the first line in your example. So could we do something like:

      File "/home/me/myfile.py", line 11
        data = func()
           ^
    SyntaxError: Unmatched '(' starting somewhere after line 7

That would at least allow you to narrow down where to look for the problem.

Of course in cases where the starting point is the current line or the previous line you could, in principle, have simpler exception messages. It may not be worth the increased complexity or decreased consistency, though.
I think there are two problems. First, it isn't clear what the problem is. Second, it is misleading about where the problem occurs. So I think the goal would be to have an exception that states what the problem is and doesn't give the wrong place to look for the problem. It doesn't have to give the right place, but it can't give the wrong place.

On Wed, Jul 08, 2015 at 11:03:42AM +0200, Todd wrote:
Even if you can't report a line number, you could report "on this or a previous line".

The way I see it, if the parser knows enough to point the ^ before the first token on the line, it can report that there is a missing ) on a previous line, otherwise it may have to hedge.

    SyntaxError: Unmatched '(' before this line
    SyntaxError: Unmatched '(' on this or a previous line

I believe that this would be a big help to beginners and casual users such as sys admins. Experienced programmers have learned the hard way that a SyntaxError may mean an unmatched bracket of some kind, but I think it would help even experienced coders to be explicit about the error.

--
Steve

On Wed, Jul 8, 2015 at 9:33 PM, Steven D'Aprano <steve@pearwood.info> wrote:
It's worth noting that this isn't peculiar to Python. *Any* error that gives a location can potentially have been caused by a misinterpretation of a previous line. So if it's too hard to say exactly where the likely problem is, I think it'd still be of value to suggest looking a line or two above for the actual problem.

ChrisA

On Wed, Jul 8, 2015 at 1:33 PM, Steven D'Aprano <steve@pearwood.info> wrote:
I think it should always be possible to report a range of line numbers within which the problem must occur. The start would be the outermost unclosed parenthesis. The end would be the first statement that cannot be inside parentheses or the end of the file, whichever comes first (the latter is currently listed as the location of the exception). Although we can't say exactly where the problem occurs in this range, I think we can say that it must be somewhere in this range.

So it should be possible to do something like this (I don't like the error message, this is just an example):

      File "/home/me/myfile.py", line 8:12
        a = ((1, "one"),
            ^
        ...
        data = func()
           ^
    SyntaxError: Unmatched '(' in line range

However, I don't know if this exception message structure could be a problem. Hence my original proposal, which would keep a simpler exception message.

On 07/08/2015 01:53 AM, Nick Coghlan wrote:
I don't think "invalid syntax" is needed here. SyntaxError is enough.
Possibly another way to do this is to create a "SyntaxError token" in the parser with the needed information, then raise it if it's found in a later step.

These aren't always found at the end of the file; they can come up when a brace or parenthesis is mismatched. Currently those generate the syntax error at the end location, but they could say why and where the other brace is.

    SyntaxError: found ] , instead of )

I think it would be better if the messages did not contain the location, and that part was moved to the traceback instead. Having a more general, location-independent error message is helpful for comparing similar exceptions without having to filter out the numbers, which can change between edits.

      File "/home/me/myfile.py", line 10 to 11     <----- # here
        data = func()
           ^
    SyntaxError: unmatched '('       <---- not here

Cheers.
   Ron
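The separation Ron describes, a location-free message with the position carried elsewhere, can be illustrated with the attributes SyntaxError already has. This is only a sketch: the file name, line numbers, offset and source text below are invented, and it does not show how the compiler itself would build such an error.

```python
# All location details below are made up for illustration.
first = SyntaxError(
    "unmatched '('",
    ("/home/me/myfile.py", 11, 5, "data = func()\n"),
)
# The same mistake after the file was edited and the line moved down.
second = SyntaxError(
    "unmatched '('",
    ("/home/me/myfile.py", 14, 5, "data = func()\n"),
)

# The message carries no line number, so the two errors compare equal on it.
print(first.msg == second.msg)
# The location is available separately for the traceback display.
print(first.filename, first.lineno, first.offset)
print(second.filename, second.lineno, second.offset)
```

Tools that group or compare errors can then use `msg` alone, while anything that displays the error still has `filename`, `lineno` and `offset` to work with.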

On 7/8/2015 1:53 AM, Nick Coghlan wrote:
Could that be changed?

An alternate approach is a separate fence-matcher function. Before I switched to Python 17+ years ago, I wrote a table-driven finite-state-machine matcher in C and a complete table for K&R/C89 C, which included info that openers were to be ignored within comments and strings. It reported the line and column of unclosed openers. I wrote it for my own use because I was frustrated by poor C compiler error messages.

I have occasionally thought about developing a table for Python (and rewriting in Python), but indents and dedents are not trivial. (Even tokenize.py does not handle \t indents correctly.) Maybe I should think a bit harder.

Idle has an option to syntax-check a module without running it. If compile messages are not improved, it would certainly be sensible to run a separate fence-checker at least when check-only is requested, for better error messages. These could potentially include 'missing :' when a header 'opened' by for/while/if/elif/else/class/def/with is not closed by ':'.

--
Terry Jan Reedy
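A separate fence matcher of the kind Terry describes could look roughly like the sketch below. It is a hand-written scanner, not the table-driven state machine he mentions, and every name in it (`PAIRS`, `position`, `skip_string`, `find_fence_errors`) is hypothetical. It ignores openers and closers inside comments and string literals and reports 1-based line and column numbers. It makes no attempt to handle indentation, missing colons, or f-strings that nest quotes.

```python
# Sketch of a stand-alone fence matcher. All names here are hypothetical.
PAIRS = {"(": ")", "[": "]", "{": "}"}
CLOSERS = set(PAIRS.values())


def position(source, index):
    """Convert a 0-based string index to a 1-based (line, column) pair."""
    line = source.count("\n", 0, index) + 1
    column = index - source.rfind("\n", 0, index)  # rfind gives -1 on line 1
    return line, column


def skip_string(source, i):
    """Return the index just past the string literal starting at source[i]."""
    quote = source[i:i + 3]
    if quote not in ('"""', "'''"):
        quote = source[i]
    i += len(quote)
    while i < len(source):
        if source[i] == "\\":
            i += 2  # skip the backslash and the character it escapes
        elif source.startswith(quote, i):
            return i + len(quote)
        elif source[i] == "\n" and len(quote) == 1:
            return i  # unterminated one-line string: stop at end of line
        else:
            i += 1
    return len(source)


def find_fence_errors(source):
    """Return (kind, char, line, column) tuples for every fence problem found."""
    stack = []  # indexes of openers that are still waiting for a closer
    problems = []
    i = 0
    while i < len(source):
        ch = source[i]
        if ch == "#":  # comment: ignore everything up to the end of the line
            end = source.find("\n", i)
            i = len(source) if end == -1 else end
        elif ch in "\"'":  # string literal: ignore any fences inside it
            i = skip_string(source, i)
        else:
            if ch in PAIRS:
                stack.append(i)
            elif ch in CLOSERS:
                if not stack:
                    problems.append(("unexpected", ch) + position(source, i))
                elif PAIRS[source[stack.pop()]] != ch:
                    problems.append(("mismatched", ch) + position(source, i))
            i += 1
    for opener_index in stack:
        problems.append(
            ("unclosed", source[opener_index]) + position(source, opener_index)
        )
    return problems


example = '''a = ((1, "one"),
     (2, "two)",
     (3, "three"))
data = func()
'''
for problem in find_fence_errors(example):
    print(problem)
```

Run on the tuple example from earlier in the thread, the matcher blames the outermost opener on the first line, not the line with the misplaced quote. That agrees with the point made there: a matcher can bound where the problem starts, but it cannot tell which line holds the typo.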

On 9 July 2015 at 08:03, Terry Reedy <tjreedy@udel.edu> wrote:
I think we're already down to only having four places where they can be thrown (tokeniser, parser, symbol table analysis, byte code generator), so reducing it further seems unlikely.
That sounds like a plausible direction, as it turned out that the particular case that prompted this thread wasn't due to missing parentheses at all; it was a block of code like:

    try:
        ....
    statement dedented early
    except ...:
        ...

I think Stephen Turnbull may also be on to something: we don't necessarily need to tell the user what fenced token was unmatched from earlier, it may be enough to tell them what *would* have been acceptable as the next token where the caret is pointing so they have something more specific to consider than "invalid syntax".

For example, in the case I was attempting to help debug remotely, the error message might have been:

      File "/home/me/myfile.py", line 11
        data = func()
           ^
    SyntaxError: expected "except" or "finally"

Other fence errors would then be:

    SyntaxError: expected ":"
    SyntaxError: expected ")"
    SyntaxError: expected "]"
    SyntaxError: expected "}"
    SyntaxError: expected "import" # from ... import ...
    SyntaxError: expected "else" # ... if ... else ...
    SyntaxError: expected "in" # for ... in ...

And once 'async' is a proper keyword:

    SyntaxError: expected "def", "with" or "for" # async ...

The currently problematic cases are those in https://docs.python.org/3/reference/grammar.html where seeing "foo" at one point in the token stream sets up the expectation in the parser that "bar" must appear a bit further along. At the moment, the parser bails out saying "I wasn't expecting this!", and doesn't answer the obvious follow-on question "Well, what *were* you expecting?".
Strings would also qualify for a similar kind of treatment, as the current error message doesn't tell us whether the parser was looking for closing single or double quotes:

    $ python3 -c "'"
      File "<string>", line 1
        '
        ^
    SyntaxError: EOL while scanning string literal

    $ python3 -c "'''"
      File "<string>", line 1
        '''
          ^
    SyntaxError: EOF while scanning triple-quoted string literal

    $ python3 -c '"'
      File "<string>", line 1
        "
        ^
    SyntaxError: EOL while scanning string literal

    $ python3 -c '"""'
      File "<string>", line 1
        """
          ^
    SyntaxError: EOF while scanning triple-quoted string literal

This discussion has headed into a part of the compiler chain that I don't actually know myself, though - the only thing I've ever had to do with the parser is modifying the grammar file and adding the brute force error message override when someone leaves out the parentheses on print() and exec() calls.

Regards,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
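The "what *were* you expecting?" idea in the message above can be sketched with a toy parser. This is not how CPython's parser works. The mini-grammar (a "for NAME in NAME :" header given as space-separated tokens) and the names `describe`, `expect`, `check_for_header` and `header_error` are all invented for illustration. The point is only that a parser which knows the acceptable next tokens can name them in the message.

```python
# Toy sketch: every name and the mini-grammar itself are hypothetical.

def describe(wanted):
    """Format the alternatives the way the proposed messages do."""
    quoted = ['"%s"' % word for word in wanted]
    if len(quoted) == 1:
        return quoted[0]
    return ", ".join(quoted[:-1]) + " or " + quoted[-1]


def expect(tokens, pos, *wanted):
    """Return pos + 1 if tokens[pos] is one of wanted, else raise SyntaxError."""
    if pos >= len(tokens) or tokens[pos] not in wanted:
        raise SyntaxError("expected " + describe(wanted))
    return pos + 1


def check_for_header(line):
    """Check a toy header of the form: for NAME in NAME :"""
    tokens = line.split()
    pos = expect(tokens, 0, "for")
    pos += 1  # loop variable, not validated in this toy
    pos = expect(tokens, pos, "in")
    pos += 1  # iterable, not validated in this toy
    expect(tokens, pos, ":")


def header_error(line):
    """Return the SyntaxError message for line, or None if it is accepted."""
    try:
        check_for_header(line)
    except SyntaxError as err:
        return err.msg
    return None


print(header_error("for x in items :"))   # accepted, prints None
print(header_error("for x of items :"))   # prints: expected "in"
print(header_error("for x in items"))     # prints: expected ":"
print(describe(["def", "with", "for"]))   # prints: "def", "with" or "for"
```

Each call to `expect` is a point where seeing one token has set up an expectation for a later one, which is the set of cases the message singles out in the grammar.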


On 7/9/2015 7:58 AM, Nick Coghlan wrote: [... message might have been]
The opposite problem is a closer without an opener.
    SyntaxError: invalid syntax

In both cases, "unexpected '{}'".format(obj) would be better, even without giving the missing opener.

--
Terry Jan Reedy

participants (11)
- Andrew Barnert
- Chris Angelico
- Florian Bruhin
- Mark Lawrence
- Matthias Bussonnier
- Nick Coghlan
- Ron Adam
- Stephen J. Turnbull
- Steven D'Aprano
- Terry Reedy
- Todd