Reducing colon uses to increase readability

Hi everyone,

In Python, the humble colon (:) has multiple uses:

1. as a signal of an indentation increase, marking a block of code, such as
   1a) for function or class definitions
   1b) for while/for/if/elif/else blocks
   1c) for try/except/finally blocks

In these cases, the majority opinion (to which I subscribe) is that using a colon increases readability. I am NOT suggesting removing the colon in those instances.

However, the colon also has some other uses:

2. in slices [a:b:c]
3. in dict assignments {a:b}
4. in lambda assignments (lambda x: x+1)

I would argue that, in these last three examples, there might be better choices. (Some of these choices have been inspired by reading http://www.resolversystems.com/documentation/index.php/Differences_Between_R...)

I don't expect the following suggestions to immediately convince everyone (or anyone!) ... but, at least they will be on record.

Slices:
---------

I would argue that the usual slicing notation would be more readable if it were as follows:

    [a -> b; c]

Thus

    [1:10:2] would become [1 -> 10; 2]
    [1:10] would become [1 -> 10]

The "shorter" combinations would not gain in terms of readability; they would be as follows:

    [ :10 : 2] would become [10; 2]
    [ :10] would become [10;]
    [:: -1] would become [; -1]
    [:] would become [;]

If such a change were to be made, a second slicing notation, *with a different meaning*, could be introduced:

    [a => b; c]

This would be an inclusive range, i.e. [a => b] is equivalent to [a -> b+1]

dict assignments
------------------------

Here again, I would argue that using "->" instead of ":" would make the code more readable - at least for beginners.

    numbers = {'one' -> 1, 'two' -> 2}

instead of

    numbers = {'one': 1, 'two': 2}

lambda assignments
---------------------------

Once again, same choice.

    lambda x -> x+1

is, I think, more readable than

    lambda x: x+1

(but perhaps the last two [dicts and lambda] largely depend on the font choice...)

======

Other considications:
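The two proposed slice notations can be emulated with today's slice objects. What follows is only a minimal sketch: the helper names `exclusive` and `inclusive` are hypothetical and are not part of Python or of the proposal itself.

```python
def exclusive(a, b, step=None):
    """Stand-in for the proposed [a -> b; step]: same as today's [a:b:step]."""
    return slice(a, b, step)


def inclusive(a, b, step=None):
    """Stand-in for the proposed [a => b; step].

    Follows the rule stated in the proposal: [a => b] is [a -> b+1].
    Assumption: b is non-negative and step is positive; the naive
    b + 1 does not handle negative indices or reversed slices.
    """
    return slice(a, b + 1, step)


numbers = list(range(20))

print(numbers[exclusive(1, 10)])     # same as numbers[1:10]
print(numbers[inclusive(1, 10)])     # same as numbers[1:11]
print(numbers[exclusive(1, 10, 2)])  # same as numbers[1:10:2]
```

The caveat in `inclusive` hints at why an inclusive form is less trivial than it looks: "b+1" has to be defined separately for negative indices and negative steps.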

Andre Roberge wrote:
It is a bit late in Python's career to make such changes, which would break nearly all substantial programs for at best a small visual gain.

-> is slightly harder to type than : and to me uglier.

Any new use of ';' has to neither conflict with its current use nor introduce ambiguities that would push Python out of its current LL(1) (I believe it is) grammar class.

'key: item' comports with 'keyword-or-phrase: explanation' constructions in English.

lambda expressions abbreviate def statements:

    def name(args): return expression  =>  lambda args: expression

The ':' separates header and body in both.

I agree that slices could have used something else, but.... don't hold your breath for a code-breaking change now.

Terry Jan Reedy
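The def/lambda correspondence mentioned above can be checked directly. This is a small illustration; the names `add_one` and `add_one_lambda` are made up for the example.

```python
# A def statement: the ':' separates the header from the body.
def add_one(x):
    return x + 1


# The equivalent lambda expression: the ':' plays the same role,
# and the body is the expression that the def version returns.
add_one_lambda = lambda x: x + 1

print(add_one(41), add_one_lambda(41))
```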

On Mon, Jun 30, 2008 at 1:01 AM, Terry Reedy <tjreedy@udel.edu> wrote:
Thanks for the information; I've learned something new. [snip]
I agree that slices could have used something else, but....
don't hold your breath for a code-breaking change now.
I wasn't... I was thinking that, if any of these ideas were to be seen to have some merit, they would make their way in around Python 3.8 (10 to 12 years from now ;-) André

The only place where I think : could be problematic is slicing. Other than that I don't see any problem. Maybe this syntax would be better?

    sequence[start..end,step]
    sequence[start..,step]
    sequence[..end,step]
    sequence[start..end]
    ...

Well, but the , would be problematic. How to distinguish between the tuple ((start..end),step) and the slice object (start..end,step)? So this syntax isn't a good idea either. However, I think ".." is much better than ":". But changing this syntax is way too problematic. This should have been done/thought about before the syntax was introduced. Now it's too late anyway. (And the current syntax isn't *that* bad.)

-panzi

2008/6/30 Mathias Panzenböck <grosser.meister.morti@gmx.net>:
Well, but the , would be problematic. How to distinguish between the tuple ((start..end),step) and the slice object (start..end,step)?
This is not problematic, as you just showed: ((start..end),step) is a tuple, (start..end,step) is a slice. -- Marcin Kowalczyk qrczak@knm.org.pl http://qrnik.knm.org.pl/~qrczak/
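For context on the comma question, a comma inside a subscript already has a meaning in current Python: the object receives a tuple as its key. The sketch below uses a hypothetical `ShowKey` class that simply returns whatever key it is given.

```python
class ShowKey:
    """Hypothetical helper: returns the key passed to the subscript."""

    def __getitem__(self, key):
        return key


show = ShowKey()

# A comma inside the brackets builds a tuple: (slice, 2).
slice_then_item = show[1:10, 2]

# Colons alone build a single slice object with a step.
plain_slice = show[1:10:2]

print(slice_then_item)
print(plain_slice)
```

So any notation that reuses the comma for the step would have to coexist with this existing tuple-key behaviour, which multi-dimensional indexing libraries rely on.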

Terry Reedy wrote:
I agree that slices could have used something else, but....
Well there is always the slice object.

    slice(start, stop, step)

Maybe if slice was more interchangeable with range or xrange, or if range objects could be used in place of slice objects? <shrug>

A few results from Python 2.5:

    >>> s = slice(10, 20, 3)
    >>> range(100)[s]
    [10, 13, 16, 19]
    >>> list(xrange(100))[s]
    [10, 13, 16, 19]
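For comparison, here is a sketch of the same experiment assuming Python 3, where `xrange` no longer exists and `range` objects can themselves be sliced. `slice.indices` gives one way to turn a slice into range arguments.

```python
s = slice(10, 20, 3)

# Slicing a range with a slice object yields another range.
sliced_range = range(100)[s]

# Slicing a list still yields a list.
as_list = list(range(100))[s]

# slice.indices(length) resolves the slice against a sequence length
# and returns (start, stop, step), usable as arguments to range().
start, stop, step = s.indices(100)
rebuilt = range(start, stop, step)

print(sliced_range, as_list, list(rebuilt))
```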

On Sun, Jun 29, 2008 at 6:41 PM, Andre Roberge <andre.roberge@gmail.com> wrote:
I agree this isn't the clearest it could be.
3. in dict assignments {a:b}
This mirrors a number of existing languages, including English, and, more or less importantly (depending on your priorities), JSON. It's always been comforting that valid Python structures (in fact, simply printed) are instantly valid JSON (for the most part... with ints and strings and such).
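How far the "for the most part" goes can be tested with the standard `json` module. This is a rough check, not a full comparison: printed lists of ints match JSON exactly, while printed dicts with string keys use single quotes, which a JSON parser rejects.

```python
import json

# Lists of ints print identically in Python and in JSON.
numbers = [1, 2, 3]
same_for_list = repr(numbers) == json.dumps(numbers)

# Dicts with string keys differ only in the quote character.
mapping = {'one': 1, 'two': 2}
python_text = repr(mapping)      # uses single quotes
json_text = json.dumps(mapping)  # uses double quotes

try:
    json.loads(python_text)
    repr_is_valid_json = True
except json.JSONDecodeError:
    repr_is_valid_json = False

print(same_for_list, python_text, json_text, repr_is_valid_json)
```

The colon between key and value is shared by both notations; the differences lie in quoting and in literals such as `None`/`null`.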
4. in lambda assignments (lambda x: x+1)
This (correctly) mirrors a standard function definition. The only way it could be closer is with something like lambda(x): x+1, but then this is not what's up for debate.
Are you suggesting this because you work with both languages? This e-mail seems a bit self-serving, because of the inclusion of someone's in-house language spec.
I am always very wary of multiple-character symbols. They are harder to type, harder to read, harder to parse (in a compiler or an editor), and open the language up to an unbounded number of (dare I say it?) Perlisms. That said, I'm not sure 'arrows' are even the right approach for slices. Slices should be thought of as ranges, which usually lend themselves to ellipses. I remember (loosely, from a long time ago) Ruby having '..' and '...' as inclusive and exclusive ranges, and I really liked that. With regard to the third item in a slice, the increment value, I almost never use it, because it seems to make code a lot harder to read clearly. If I feel the need to use it, it's usually a good indicator that I need to restructure my code, and if it's absolutely necessary, I'll typically just iterate over the list with a for loop so that I can understand what I was doing when I come back. If my half-suggestion of ellipses were taken up, I'd say that the colon could stay as the separator between the second and third arguments (and, as someone said already, the semi-colon introduces some weird parsing problems and possible ambiguities).
Like I said before, the colon is a widely-accepted way to separate keys and values in a dict. The only strange case I can see with this is something like: functions = {'plus': lambda x, y: x+y, 'minus': lambda x, y: x-y} In fact, I'm not sure if this _is_ legal python, so before running it, I'd just parenthesize out the lambda expressions to be sure anyway, and this clears everything up nicely: functions = {'plus': (lambda x, y: x+y), 'minus': (lambda x, y: x-y)}
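On the question of legality: the unparenthesized form does parse in Python, because a lambda body is a single expression and ends at the next comma. A quick check, reusing the dict from the message above:

```python
# Without parentheses: each lambda body stops at the comma that
# separates dict items, so this is a two-item dict.
functions = {'plus': lambda x, y: x+y, 'minus': lambda x, y: x-y}

# With parentheses: same meaning, arguably easier on the eye.
parenthesized = {'plus': (lambda x, y: x+y), 'minus': (lambda x, y: x-y)}

print(functions['plus'](2, 3), functions['minus'](2, 3))
print(parenthesized['plus'](2, 3), parenthesized['minus'](2, 3))
```

Both forms behave the same; the parentheses only help the human reader separate the three roles the colon and comma play on that line.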
As a pseudo-mathematician (and a recent student of Erlang), this is quite appealing, for a few reasons. First, let me say that the obvious "f(x) -> x**2 shows up all over math" is not the correct reason to say this is correct notation for functions. Python functions are procedures, not expressions (as they are in Erlang and Haskell, where the arrow-notation is commonplace). As such, a colon separating the function's name from its definition makes perfect sense, as this is the way we write English all the time, and I've seen more than one professor write pseudocode just like this. However, lambda functions _are_ single-expressions, not blocks. This leads me to believe that the arrow could be a good delimiter (except for my above statement that multiple-character symbols suck). Unfortunately for the arrow, it seems that priority in Python syntax is given to consistency within itself, rather than consistency with the outside world, so the fact that "lambda x: x**2" is consistent with "def sq(x): x**2" probably pulls more weight. Let me just say that putting something like the arrow (especially if we ever allow non-ASCII characters into the syntax) in lambda expressions would not be totally distasteful to me.
You're right. This is one of the reasons I hate Ruby. Yet another reason to ignore your suggestion for slices :-). -- Cheers, Leif

On Mon, Jun 30, 2008 at 6:34 PM, Leif Walsh <leif.walsh@gmail.com> wrote:
Nope, never used it. I try, whenever I can, to always give credit to the relevant source when I mention something that may appear to be an original idea - hence the above reference. [snip]
[snip]
Hmm... this is one of the reasons you really like it (see above) and hate it too! ;-) Sorry, I couldn't resist ;-) Cheers, André

On Mon, Jun 30, 2008 at 3:00 PM, Andre Roberge <andre.roberge@gmail.com> wrote:
Teach me a lesson in consistency, will you! Yeah, what I meant was that the math side of me really liked using ellipses to mark ranges, but the reading side of me hated figuring out the difference between '..' and '...'. I'd say forget about the inclusive/exclusive part, keep that the way it is now, and just change the colon to '...', if we decide to change anything about slices, which I'm not convinced is necessary. -- Cheers, Leif

Actually, I found : very clear as a newcomer to Python -- and you might laugh at me for this -- because it corresponded with my visualization of slices as, literally, vertical 'cuts' in the list. : is symmetrical and straight up-and-down, and for whatever reason this is what works for me when trying to visualize the slice. I took on a job tutoring Python to students at Berkeley over the past two semesters, and using this analogy (: doing the 'cutting': i.e. start at index 'start', cut, and take the rest until index 'end') worked, and : never got in the way. Using -> or .. would possibly help the mathematically-oriented people, but we get a lot of people from different walks of life, and I don't know that -> or .. would really help.

Also, the difference between -> and => is one line and a whole lot of confusion. Given the ability to use ranges that may or may not include the last element, that means for basically *every* given tuple, you can slice it the same way with two different notations (by adding one to the exclusive one, or subtracting one from the inclusive one). It's a terrible idea, because now you have two ways to do the same thing, and a terrible symbol to differentiate them: -> and => are both arrows, and there is no way I'm going to remember which is which if I'm not constantly using them -- and certainly not non-programming types. "There should be one-- and preferably only one --obvious way to do it." -- or else you get a bunch of conversations like so:

"well, you did it this way, why didn't you do it that way?"
"well, you can do it either way, actually."
"so they mean the same thing?"
"not quite, one's inclusive, one's exclusive, like I explained."
"which one was which again?"
"sigh"

rinse and repeat.

I really don't see the problem with ':'. It's straightforward enough that even humanities majors (ahem, engineering major, forgive the snobbishness) get used to it after 10-15 minutes of face-to-face tutoring and examples.
Changing the notation would at best save 5 minutes of tutoring time and piss off everyone who is either used to colon notation or actually *likes* it (like me), and at worst, add on a few extra minutes and *still* piss off everyone who is either used to colon notation or actually likes it. Not to mention you'd have to change the syntax and render all the code that uses slicing useless, pissing off untold legions of Python programmers. That is, unless you make it an optional syntax alongside the original, in which case you again have the (IMO, unpleasant) situation of being able to express the same thing in many ways. You'll have code that gets mixed together which uses one or the other, so now you're forcing people to know both ways of doing the same thing and being able to switch between both syntaxes when reviewing the code. Bad bad bad. I just can't see this happening. I don't believe it has any merits except to possibly increase readability, and I take issue with the idea that it would in fact do so. Add to that the huge pain this would be either as a replacement syntax or as an optional syntax, and I think it would actually be a terrible idea. --Andy

Andrew Toulouse wrote:
On the contrary, I have advised people for years (in c.l.p posts) to realize that 0,1,...n-1,n number the n+1 slice positions before, between, and after the n items, which correspond to the possible positions of vertical bar cursors.
Thanks for pointing out that : is as close to | as possible in essential characteristics without being |. (The latter, of course, already being used for bitwise or and in other languages, doubled for logical or.) tjr
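The "positions between items" picture can be made concrete with a short example. The list and variable names are chosen only for illustration.

```python
items = ['a', 'b', 'c', 'd']
n = len(items)

# n items have n + 1 cut positions: before, between, and after them.
#   | a | b | c | d |
#   0   1   2   3   4
cut_positions = list(range(n + 1))

# A slice takes everything between two cuts.
middle = items[1:3]

# Cutting at any position k splits the list into two parts that
# together give back the original.
splits = [(items[:k], items[k:]) for k in cut_positions]

print(cut_positions, middle)
print(splits[2])
```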

Arnaud Delobelle wrote:
On 30 Jun 2008, at 02:41, Andre Roberge wrote:
I'd hate to see -> or <- used in Python syntax, regardless of where (I think := for assignment might be less evil). However, changing the topic, the idea of dropping lambda in some cases was already raised: http://www.python.org/dev/peps/pep-0312/ (still deferred). The implicit lambda is then more like a quote in Lisp, roughly: "things which aren't yet evaluated". I understood that PEP 312 was perceived as a partial case for the inline if-then-else and was soon forgotten (when inline if made it into Python). Maybe it's time to look at those lambdas again to see if there is some value in lambdas without the word lambda? (N.B. This has already been discussed a couple of times and even backed by some Python developers: http://mail.python.org/pipermail/python-dev/2005-June/054303.html and some later discussions occurred: http://osdir.com/ml/python.python-3000.devel/2006-05/msg00773.html) The main idea of implicit lambda is to better support lazy evaluation. However, beyond obvious simple cases, omitting lambda makes code less readable. Regards, Roman
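The lazy-evaluation use case can be shown with today's explicit, argument-less lambda, which is the form the PEP would have shortened. This is only a sketch: `get_or_compute` and `expensive` are hypothetical names, and no implicit-lambda syntax is used since none exists in Python.

```python
def get_or_compute(cache, key, compute):
    """Call compute() only if key is missing from cache."""
    if key not in cache:
        cache[key] = compute()
    return cache[key]


calls = []


def expensive():
    # Record each call so we can see how often the work really runs.
    calls.append('ran')
    return 42


cache = {}

# The lambda delays the call: expensive() runs only when needed.
first = get_or_compute(cache, 'answer', lambda: expensive())
second = get_or_compute(cache, 'answer', lambda: expensive())

print(first, second, calls)
```

The second call never evaluates its lambda, which is the "things which aren't yet evaluated" idea in practice.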

participants (9)
- Andre Roberge
- Andrew Toulouse
- Arnaud Delobelle
- Leif Walsh
- Marcin ‘Qrczak’ Kowalczyk
- Mathias Panzenböck
- Roman Susi
- Ron Adam
- Terry Reedy