Making stars optional? (was: Making colons optional?)

In algebra, you don't have to put a multiplication sign in between two quantities that you want to multiply. I've seen beginning programmers write things like

    x = 3a + 4(b-c)

instead of

    x = 3*a + 4*(b-c)

Why should we require the stars when it's unambiguous what the first statement means?

--- Bruce

P.S. Pascal used words if ... then ... and while ... do ... and begin ... end. I like non-alpha symbols that stand out better, like colons, braces and indentation. I don't need a semicolon to tell me where the end of a line is. I only need something to tell me where the end of a line ISN'T and I handle that with prominent indentation.

On Thu, Feb 5, 2009 at 10:30 AM, Bruce Leban <bruce@leapyear.org> wrote:
Because there's a /very/ high likelihood that it was a typo, and per the Zen, Python shouldn't guess in the face of ambiguity, and (likely) errors should never pass silently. Further, if it is a typo, it's not unambiguous enough that we can infer with certainty that multiplication was intended; it's just as likely the programmer forgot to type the operator, and there's only a 1 in 11 chance (or worse; that's how many binary operators I could come up with without looking at the manual) that multiplication was indeed intended.

Cheers,
Chris
--
Follow the path of the Iguana...
http://rebertia.com

On Thu, Feb 5, 2009 at 10:30 AM, Bruce Leban <bruce@leapyear.org> wrote:
Sure, and given the following program:

    a = 2
    b = 4
    print ab

shouldn't we be able to print "8", given that the meaning of the program is unambiguous?

Ultimately, you have to balance ease-of-use against consistency -- both because too much inconsistency can actually harm ease-of-use and because "special cases" tend to combine in crazy ways to create horrible edge cases. Where to draw the line is always a matter of personal taste, but the Python language has consistently favored consistency in its philosophy.

--
Curt Hagenlocher
curt@hagenlocher.org

I apologize for leaving the :-) out in my original post. Just to be clear: (1) I indeed have seen this mistake from beginning programmers and (2) I think they should get over it. I also don't think that = and == should be the same syntax and trust the compiler to figure out which one you mean. I also don't think spaces should be optional, despite the fact that no space probe has ever been lost as a result of that feature. :-(

http://my.safaribooksonline.com/0131774298/ch02
http://catless.ncl.ac.uk/Risks/9.54.html

--- Bruce

On Thu, Feb 5, 2009 at 11:04 AM, Curt Hagenlocher <curt@hagenlocher.org> wrote:
That would be so cool! If any variable is undefined, break it up into smaller variables and if they're numbers multiply them and if they're strings concatenate them. Wow!
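To see why the tongue-in-cheek "break it up into smaller variables" rule would combine badly with multi-character names, here is a minimal sketch. The helper `splits` and the set `defined` are hypothetical names made up for this illustration; nothing like this exists in Python. It simply enumerates every way an undefined name could be cut into names that are already defined.

```python
def splits(name, defined):
    """Return every way to cut `name` into pieces that all appear in `defined`.

    Hypothetical helper for illustration only; Python has no such rule.
    """
    if not name:
        return [[]]
    results = []
    for i in range(1, len(name) + 1):
        head, tail = name[:i], name[i:]
        if head in defined:
            # Keep this prefix and recurse on whatever is left.
            results.extend([head] + rest for rest in splits(tail, defined))
    return results


defined = {"a", "b", "c", "ab", "bc"}
for pieces in splits("abc", defined):
    print(" * ".join(pieces))
# a * b * c
# a * bc
# ab * c
```

With only five names in scope, the undefined name `abc` already has three candidate readings, and they give different results whenever `ab` or `bc` holds something other than the product of its letters.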

Bruce Leban wrote:
Because it's never unambiguous. If I write x = 3a, does that mean that I've accidentally left out the + sign I intended, or that it is a multiplication? Worse, if I write x = a3, is that an assignment of a*3 or a variable named "a3"? Similar for x = ab. The tradition in mathematics is to use one letter variable names, so a variable called "ab" is so rare as to be virtually non-existent, but this obviously doesn't hold for programming.

Mathematicians get away with this sort of ambiguity because they are writing for other mathematicians, not for a computer. Because mathematical proofs rely on a sequence of equations, not just a single statement, the ambiguity can be resolved:

    y = a(b+c) - ac    # does this mean a+() or a*() or something else?
    y = ab + ac - ac   # ah, it must have been a*()
    y = ab

--
Steven
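As a rough illustration of how Python itself reads these spellings today, the sketch below asks the standard `ast` module to parse each one as an expression. The helper name `describe` is made up for this example, and the code targets Python 3 rather than the Python 2 of the thread.

```python
import ast


def describe(source):
    """Return the kind of expression Python parses `source` as."""
    try:
        tree = ast.parse(source, mode="eval")
    except SyntaxError:
        return "SyntaxError"
    return type(tree.body).__name__


for text in ["a*b", "ab", "a3", "3a", "3*a + 4*(b-c)", "3a + 4(b-c)"]:
    print(f"{text!r:20} -> {describe(text)}")
# 'a*b'                -> BinOp
# 'ab'                 -> Name
# 'a3'                 -> Name
# '3a'                 -> SyntaxError
# '3*a + 4*(b-c)'      -> BinOp
# '3a + 4(b-c)'        -> SyntaxError
```

Both `ab` and `a3` are single identifiers, so no product is ever considered, while `3a` is rejected before any code runs.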

Steven D'Aprano wrote:
No context is needed to know what a(b+c) means. In maths, you only have single-character variable names (sub-/superscripts notwithstanding), so ab always means a*b. Together with some other conventions, like that nobody writes a3 instead of 3a, everything is unambiguous. Though there *are* programmers who wouldn't notice if Python 3.0 switched to one-character-only variable names, they are probably a minority and hopefully dying out :) Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.

Georg Brandl wrote:
Only if you assume that the mathematician didn't make a typo. Any operator could appear between the a and the bracketed term: you're deducing the presence of an implied multiplication by the absence of an operator, which is the convention in mathematics, but it is an unsafe deduction if you have only a single line in isolation. It only becomes safe in context because any accidental omission of the operator should become obvious in the following line(s) of the proof. I'm not saying this as a criticism of mathematicians. They value brevity over the increased protection from typos, which is a valid choice to make. But it is not a choice available to Python, because we have multi-character variable names, and mathematical expressions stand alone in Python code, they aren't part of an extended proof.
Except in the presence of typos. This is a small risk for mathematicians, but a large risk for Python programmers. Not because mathematicians are better typists than Python programmers, but because there are more opportunities to catch the typo in a mathematical proof than there are in a program. -- Steven

On Fri, Feb 6, 2009 at 6:59 PM, Steven D'Aprano <steve@pearwood.info> wrote:
In the presence of typos all bets are off, unless you are aware of any typo-proof writing system. Python certainly isn't one since, say, "x.y" and "x,y" are pretty similar, both visually and in keyboard distance. George

George Sakkis wrote:
You overstate your case: not *all* bets are off, just some of them. Some typos have greater consequences than others. Some will be discovered earlier than others, and the earlier they are discovered, the more likely they are to be easily fixed without the need for significant debugging effort. E.g. if I type R**$ instead of R**4, such a typo will be picked up in Python immediately. But R**5 instead could be missed for arbitrarily large amounts of time. E.g. if you mean x.y but type x,y instead, then such an error will be discovered *very* soon, unless you happen to also have a name 'y'. But anyway, we're not actually disagreeing. (At least, I don't think we are.) We're just discussing how some conventions encourage errors and others discourage them, and the circumstances of each. -- Steven
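The difference between typos that are caught immediately and typos that slip through can be sketched with the built-in `compile`, which checks syntax without running anything. The helper name `check` is an assumption made for this example.

```python
def check(source):
    """Compile `source` as an expression without running it."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return "rejected at compile time"
    return "accepted"


print(check("R**4"))  # accepted (what was meant)
print(check("R**$"))  # rejected at compile time
print(check("R**5"))  # accepted (typo goes unnoticed)
print(check("x.y"))   # accepted (attribute access)
print(check("x,y"))   # accepted (a tuple; fails only if y is undefined when run)
```

Only the `$` typo is stopped by the compiler. The others are valid Python, so finding them depends on running that code path and checking the result.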

On Feb 7, 2009, at 12:21 AM, Steven D'Aprano wrote:
This doesn't really make much sense to me... if you don't use tests to verify your program, both R**5 and R**A are just the same error: if, for example, this code path is a rare case, it will only be discovered when something goes wrong with the program... And a testable program is much more easily verifiable than a mathematical proof on paper (especially because our brain is used to fixing little problems in reality to match what we think is right).

Still, I think that having ab mean a*b in Python is silly, even if Python were only the scripting language of Sage.

--
Leonardo Santagada
santagada at gmail.com

On Sat, 07 Feb 2009 12:19:50 -0200, Leonardo Santagada wrote:
If that piece of code is in a rarely walked path, then it is even worse, since the bug could easily creep into a production release without anyone noticing.

There are three classes of errors in programming:

Syntax Error - easily found; cannot creep into a production release, since the compiler/interpreter would complain loudly
Runtime Error - found when the code path is walked; comprehensive testing would reveal such errors
Logic Error - hard to find, hard to debug

Most typos become either a Syntax Error or a Logic Error. An omitted operator currently becomes a SyntaxError, since Python recognizes no implicit multiplication; OTOH, if we introduce implicit multiplication, the same typo will invariably become a Logic Error or Runtime Error.
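The three classes can be illustrated with a small sketch that takes a mistyped expression and reports the first stage at which it fails. The function `classify`, its parameters, and the sample values are all hypothetical, chosen only for this example.

```python
def classify(source, namespace, expected):
    """Report where a (possibly mistyped) expression goes wrong.

    Hypothetical helper: `expected` is the value the author intended.
    """
    try:
        code = compile(source, "<example>", "eval")
    except SyntaxError:
        return "syntax error"      # caught before anything runs
    try:
        value = eval(code, dict(namespace))
    except Exception:
        return "runtime error"     # caught only when this path executes
    return "ok" if value == expected else "logic error"


names = {"R": 2}                     # the author meant R**4 == 16
print(classify("R**4", names, 16))   # ok
print(classify("R**$", names, 16))   # syntax error
print(classify("r**4", names, 16))   # runtime error (NameError)
print(classify("R**5", names, 16))   # logic error
```

The logic error is found here only because the intended value is supplied. In a real program nothing plays that role unless a test does.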

Bruce Leban wrote:
Indeed. Quite obviously the programmer meant to call the number 4 with argument b - c there. :-) I believe that one of HP's programmable calculator/ computer thingies had a language that let you write implied multiplications like that, but it only had single-letter variable names and no function calls, so there wasn't so much of a problem. In the next model you were allowed multi-char variable names, so they had to drop that feature. BTW, I think in Icon you can actually call numbers, although I forget what weird-assed thing it means just at the moment. -- Greg
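The joke about calling the number 4 reflects how Python's grammar actually treats `4(b - c)`: it parses as a call expression and fails only when evaluated. Below is a minimal Python 3 sketch; the helper name `call_four` is made up for this example, and the warning filter is there because some CPython versions also emit a SyntaxWarning when compiling such code.

```python
import ast
import warnings

# The parser accepts 4(b - c) as a call whose "function" is the literal 4.
tree = ast.parse("4(b - c)", mode="eval")
print(isinstance(tree.body, ast.Call))  # True


def call_four(b, c):
    """Evaluate 4(b - c) and report what happens (illustrative helper)."""
    with warnings.catch_warnings():
        # Silence the compile-time warning that some versions emit.
        warnings.simplefilter("ignore")
        code = compile("4(b - c)", "<example>", "eval")
    try:
        return eval(code, {"b": b, "c": c})
    except TypeError as exc:
        return f"TypeError: {exc}"


print(call_four(5, 2))
```

So the `4(b-c)` half of the original example is not a syntax error on its own. It is the `3a` half that the parser rejects.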

participants (10): Bruce Leban, Chris Rebert, Curt Hagenlocher, Georg Brandl, George Sakkis, Greg Ewing, Leonardo Santagada, Lie Ryan, Steven D'Aprano, Terry Reedy