Tighten up the formal grammar and parsing a bit?

I wrote this little Python program using CPython 3.5.2. It's ... interesting ... that we apparently don't need comments or pass statements any more. Anyone else think it might be worth tightening up the grammar definition and parser a bit?

    def empty():
        """Don't do anything"""

    def helloWorld():
        """Docstring"""
        x = 0
        if x > 0:
            """Pass"""
        else:
            x += 1
        print(x)
        """Comment that is a string or vice versa"""
        x = 2
        print(x)
        if x == 2:
            x += 1 ;"Add 1 to x"
        print(x)
        if x == 3:
            42
        print("Answered everything")

    if __name__ == "__main__":
        helloWorld()
        print(empty())

--
cheers,
Hugh Fisher

On Mon, May 15, 2017 at 7:38 PM, Hugh Fisher <hugo.fisher@gmail.com> wrote:
Nope. For starters, you shouldn't be using "pass" statements OR dummy strings to fill in an if statement's body; you can instead simply write:

    if x <= 0:
        x += 1

Or worst case:

    if not (x > 0):
        x += 1

For the rest, all you've shown is that trivial expressions consisting only of string literals will be ignored in certain contexts. The trouble is that string literals don't really mean comments, and won't be ignored by most humans; plus, there are contexts where they are not ignored. Here, rewrite this without comments:

    wrong_answer_messages = [
        "Wrong.",
        "Totally wrong, you moron.",
        "Bob, you idiot, that answer is not right. Cordially, Ted.", # Maize
        "That's as useful as a screen door on a battleship.", # BTTF
        # etc
    ]

String literals won't work here, and even if they did, they would be _extremely_ confusing. Comments are semantically distinct.

The 'pass' statement has a very specific meaning and only a few use-cases. It could often be omitted in favour of something else, but there's not a lot of value in doing so. Comments have very significant value and should definitely be kept.

ChrisA
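A minimal sketch, not part of the original message, of why a string literal cannot replace a comment inside a list display. There the literal is an expression, so it either becomes an extra element or, without a comma, is concatenated onto its neighbour. The list names and the shortened messages are illustrative only.

```python
# Inside a list display a string literal is data, not a comment.
with_comment = [
    "Wrong.",  # Maize
    "Totally wrong.",
]

# With a comma, the would-be "comment" becomes a third element.
with_string_element = [
    "Wrong.", "Maize",
    "Totally wrong.",
]

# Without a comma, adjacent string literals are concatenated.
with_adjacent_string = [
    "Wrong." "Maize",
    "Totally wrong.",
]

print(len(with_comment))
print(len(with_string_element))
print(with_adjacent_string[0])
```

Only the `#` form leaves the list contents unchanged.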

On Mon, May 15, 2017 at 08:13:48PM +1000, Chris Angelico wrote:
I agree with that. But not necessarily the following:
That is often the case, but there are times where a condition is clearer with a pass statement followed by an else than by reversing the sense of the test. Or the pass might just be a place-holder: TDD often means that there's code where only one branch of an if works (and it's not necessarily the if branch).

    if condition:
        pass  # will be fixed in the next iteration of TDD
    else:
        code

There's also cases where

    if x > y:
        pass
    else:
        code

is *not necessarily* the same as

    if not (x > y):
        code

(x > y) is not always not(x <= y). E.g. sets, and even floats.
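A short sketch of the sets-and-floats remark, assuming two overlapping sets and a NaN as the example values (they are not from the original message). For such values "x > y" being false does not make "x <= y" true.

```python
# For partially ordered or unordered values, "x > y" being false
# does not imply that "x <= y" is true.
a, b = {1, 2}, {2, 3}      # neither set is a subset of the other
nan = float("nan")         # NaN compares false against every number

for x, y in [(a, b), (nan, 1.0), (2, 1)]:
    # Columns: x, y, not (x > y), x <= y
    print(x, y, not (x > y), x <= y)
```

The last pair (plain integers) is included for contrast: there the two columns agree.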
> For the rest, all you've shown is that trivial expressions consisting only of string literals will be ignored in certain contexts.
> The trouble is that string literals don't really mean comments, and won't be ignored by most humans;
Bare string literals do sometimes mean comments, and I should hope they aren't ignored by the reader! E.g. bare strings at the start of a module, class or function are docstrings, and even in the middle of the module or function, they are allowed. Guido has spoken! (Unless he's changed his mind since then :-)

https://twitter.com/gvanrossum/status/112670605505077248
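A small illustration of the distinction, using a hypothetical function named `documented`: only a string that is the first statement of the body becomes the docstring, while a bare string further down is evaluated and discarded.

```python
def documented():
    """Docstring"""
    x = 1
    """Bare string used as a pseudo-comment"""
    return x


# Only the first string in the body is stored as the docstring.
print(documented.__doc__)
# The later bare string has no effect on the result.
print(documented())
```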
> plus, there are contexts where they are not ignored.
Oh, and here I was thinking strings were ignored everywhere!

    print("hello world")  # does nothing

*wink*

But seriously, of course *expression statements* which are string literals are not syntactically comments, but they can be, and are, treated as if they were. Just use a bit of common sense.

> Here, rewrite this without comments:
That's because the statement is an assignment statement, not an expression statement: https://docs.python.org/3/reference/simple_stmts.html#grammar-token-expressi...
Oh, I see where you are coming from! You have interpreted Hugh as suggesting that we remove pass and # comments from the language! I interpreted him as suggesting the opposite: that we tighten up the grammar to prohibit bare expressions, in order to prevent them from being used instead of pass or # comments. -- Steve

On Mon, May 15, 2017 at 11:00 PM, Steven D'Aprano <steve@pearwood.info> wrote:
Uhm.... not sure what you're getting at here. I'm fully aware that:

    if x > y:
        pass
    else:
        code

is not the same as:

    if x <= y:
        code

but I don't know of any way that it could be different from:

    if not (x > y):
        code

because that's going to evaluate (x > y) exactly the same way the original would, and then perform a boolean negation on it, which is exactly the same as the if/else will do. Or have I missed something here?
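A minimal sketch of that equivalence for ordinary objects. The `Loud` class and the two helper functions are hypothetical names introduced here; `Loud` merely counts how often `>` is evaluated, to show that both forms perform the comparison exactly once.

```python
class Loud:
    """Hypothetical helper that counts how often it is compared with >."""
    calls = 0

    def __init__(self, value):
        self.value = value

    def __gt__(self, other):
        Loud.calls += 1
        return self.value > other.value


def with_pass(x, y):
    # The if/else form with an empty if branch.
    if x > y:
        pass
    else:
        return "else branch"
    return "if branch"


def with_not(x, y):
    # The negated form: same comparison, then boolean negation.
    if not (x > y):
        return "else branch"
    return "if branch"


pairs = [
    (Loud(1), Loud(2)),
    (Loud(2), Loud(1)),
    ({1, 2}, {2, 3}),
    (float("nan"), 1.0),
]
for x, y in pairs:
    print(with_pass(x, y), with_not(x, y))
```

Both functions agree even for the set and NaN pairs, because neither one ever evaluates `x <= y`.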
Yes, that was what I was interpreting his statements as. I now know better, so you can ignore a lot of my comments, which were about that :) So. Taking this the other way, that Hugh intended to make dumb code illegal: I think it's unnecessary, because linters and optimizers are better for detecting dead code; it's not something that often crops up as a bug anywhere. ChrisA

I guess maybe if you overload the operators to return broken objects, maybe then they would be different?

--
Ryan (ライアン)
Yoko Shimomura > ryo (supercell/EGOIST) > Hiroyuki Sawano >> everyone else
http://refi64.com

On May 15, 2017 9:50 AM, "Serhiy Storchaka" <storchaka@gmail.com> wrote:

On Mon, May 15, 2017 at 07:38:29PM +1000, Hugh Fisher wrote:
I'm not sure what you mean by "any more". The code you give works, unchanged, all the way back to Python 2.0 when augmented assignment was added. If you replace the x += 1 with x = x + 1 it works all the way back to Python 1.5 and probably even older.

Python has (more or less) always supported arbitrary expressions as statements, so this is not new. This is a feature, not a bug: supporting expressions as statements is necessary for expressions like:

    alist.sort()

and other expressions with side-effects. Unfortunately, that means that pointless expressions like:

    42

that have no purpose are also legal.

In recent versions, the compiler has a peephole optimizer that removes at least some constant expressions:

    # Python 3.5
    py> block = """x = 1
    ... 'some string'
    ... 100
    ... y = 2
    ... """
    py> code = compile(block, '', 'exec')
    py> from dis import dis
    py> dis(code)
      1           0 LOAD_CONST               0 (1)
                  3 STORE_NAME               0 (x)

      4           6 LOAD_CONST               1 (2)
                  9 STORE_NAME               1 (y)
                 12 LOAD_CONST               2 (None)
                 15 RETURN_VALUE

There's also a (weak) convention that bare string literals are intended as pseudo-comments. That's especially handy with triple-quoted strings, since they can comment-out multiple lines.
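A sketch that checks the same block without depending on the exact bytecode listing, which varies between CPython versions. It uses only the standard `ast` module and the built-ins `compile` and `exec`; the variable names are illustrative.

```python
import ast

block = "x = 1\n'some string'\n100\ny = 2\n"

# The parser accepts the bare constants: they are expression statements.
tree = ast.parse(block)
kinds = [type(node).__name__ for node in tree.body]
print(kinds)  # ['Assign', 'Expr', 'Expr', 'Assign']

# The compiled block runs normally and binds only x and y.
code = compile(block, "<block>", "exec")
namespace = {}
exec(code, namespace)
print(namespace["x"], namespace["y"])

# Whether a discarded constant survives in co_consts is a CPython
# implementation detail; print it to check on your own version.
print("some string" in code.co_consts)
```

The grammar treats `'some string'` and `100` exactly like `alist.sort()`: each is an expression statement, and removing the useless ones is left to the compiler.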
> Anyone else think it might be worth tightening up the grammar definition and parser a bit?
Not me. In fact, I'd go further than just saying "I don't think it is worthwhile". I'll say that treating bare strings as pseudo-comments is a positive feature worth keeping. Tightening up the grammar to prohibit that is a bad thing.

There's an argument to be made that bare expressions like:

    100

are pointless, but it isn't a strong argument. In practice, it isn't really a common source of errors, and as far as efficiency goes, the peephole optimizer solves that.

And it's easy to get the rules wrong. For instance, at first I thought that a bare name lookup like:

    x

could be safely optimized away, or prohibited, but it can't. It is true that a successful name lookup will do nothing, but not all lookups are successful:

    try:
        next
    except NameError:
        # Python version is too old
        def next(iterator):
            return iterator.next()

If we prohibit bare name lookups, that will break a lot of working code.

I suppose it is possible that a *sufficiently intelligent* compiler could recognise bare expressions that have no side-effects, and prohibit them, and that this might prevent some rare, occasional errors:

    x #= 1  # oops I meant +=

but honestly, I don't see that this is a good use of developers' time. It adds complexity to the language, risks false positives, and in my opinion is the sort of thing that is better flagged by a linter, not prohibited by the interpreter.

--
Steve
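A rough sketch of what such a linter check might look like, assuming Python 3.8 or later (where literals parse to `ast.Constant`). The function name `find_pointless_constants` and the sample source are hypothetical; this is not how any particular linter implements the check.

```python
import ast

# Node types whose first string statement counts as a docstring.
SCOPES = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)


def is_constant_stmt(node):
    """True for an expression statement consisting of a bare constant."""
    return isinstance(node, ast.Expr) and isinstance(node.value, ast.Constant)


def find_pointless_constants(source):
    """Return sorted (line, repr) pairs for bare constant statements.

    Docstrings are skipped. Bare names are deliberately not flagged,
    because a name lookup can raise NameError.
    """
    tree = ast.parse(source)
    docstrings = set()
    for node in ast.walk(tree):
        if isinstance(node, SCOPES) and node.body:
            first = node.body[0]
            if is_constant_stmt(first) and isinstance(first.value.value, str):
                docstrings.add(id(first))
    findings = [
        (node.lineno, repr(node.value.value))
        for node in ast.walk(tree)
        if is_constant_stmt(node) and id(node) not in docstrings
    ]
    return sorted(findings)


sample = '''def hello():
    """Docstring"""
    x = 0
    "Add 1 to x"
    x += 1
    42
    x
    return x
'''
for line, text in find_pointless_constants(sample):
    print("line", line, "bare constant", text)
```

The check reports the bare string and the `42` but leaves the bare `x` alone, so the try/except NameError idiom keeps working; the cost is that a typo such as `x #= 1` also goes unreported.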

Something broken like this?

    import inspect

    def cond():
        if 'not cond' in inspect.stack()[1].code_context[0]:
            return False
        return True

    if cond():
        print('yes')
    else:
        print('no')

    if not cond():
        print('no')
    else:
        print('yes')

On 5/15/17, Ryan Gonzalez <rymg19@gmail.com> wrote:

participants (6)
- Chris Angelico
- Hugh Fisher
- Pavol Lisy
- Ryan Gonzalez
- Serhiy Storchaka
- Steven D'Aprano