Indentation, again: Problems with moving and combining code across editors.

Hi!

I do appreciate that indentation in Python code shows intention, and I am all for it, but problems do arise if code is copied from one place to another, even from specialized Python editors.... and there is little you can do to recover intentions/indentations lost in the process.

Besides, spaces become difficult to count and distinguish from tabs, adding to the problem. (Why allow tabs at all?)

Why can't Python be made to accept the following indentation of code (in *addition* to the current schemes)? There could be some flexibility of choosing the indentation character. (Pipe was mentioned as an alternative in another thread.)

for i in range(10):
.if i>5:
..print 10-i
.else:
..print i
print "!"

If not in the interpreter for some reason, could it be "advised" as a mode in a Python-oriented editor, or a "dictated" mechanism for copy/paste/transfer of code?

Sorry for posing an elementary question.

Thanks and regards
MK-zedobject

On Fri, Aug 29, 2014 at 2:12 PM, Milind Khadilkar <zedobject@gmail.com> wrote:
It's a known issue. I wouldn't necessarily call it a problem. I don't know if there are good strategies for tackling that other than "don't do that."

If you find yourself copying around blocks of text which don't correspond to whole functions or classes, it tells me you might have some refactoring to do. Tools like pylint can help identify common chunks of code between two or more files. I don't know what support there is in existing editors or IDEs for code refactoring. There used to be Bicycle Repair Man, but I think he's long gone:

http://bicyclerepair.sourceforge.net/

There is also a library called ROPE:

http://rope.sourceforge.net/

Never heard of it before. Don't know if it's currently supported (it at least has Py3K support).

Skip

On 29/08/2014 20:12, Milind Khadilkar wrote:
It strikes me that these "problems" have existed for 23 years and somehow people have survived. Could it be a classic example of "a bad workman always blames his tools"? See Skip Montanaro's reply for an explanation as to why.

--
My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language.
Mark Lawrence

On Aug 29, 2014, at 12:12, Milind Khadilkar <zedobject@gmail.com> wrote:
Hi! I do appreciate that indentation in Python code shows intention, and I am all for it, but problems do arise if code is copied from one place to another, even from specialized Python editors.... and there is little you can do to recover intentions/indentations lost in the process.
I copy and paste code between editors all the time--e.g., from emacs to the graphical editors on sites like Github or StackOverflow--and I can't remember the last time I've had this problem, except when copying someone else's code that mixed spaces and tabs (and even in that case, it more often helps me find existing invisible errors than it causes errors).
Besides, spaces become difficult to count and distinguish from tabs, adding to the problem. (Why allow tabs at all?)
That one I could get behind, but it's been suggested and rejected enough times to have a PEP assigned to it.
The first problem is trying to come up with a syntax that isn't ambiguous to the parser or to a human. You haven't succeeded there:

.2

Is that the int literal `2` indented, or the float literal `.2` in the left column? Even the pipe character had this problem:

|2

That pipe could mean the or operator if the previous line ended with a continuation character. But even discounting that, the tokenizer and any human experienced with Python will read it as the or operator and then have to follow some rule saying "an or operator at the start of a line that isn't part of a valid expression counts as a space". And that means guessing that it wasn't intended as an or operator, possibly turning what should have been identified as a simple SyntaxError into something baffling.

I suppose you could suggest bringing back ` for this purpose, but I doubt anyone will like that.
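
The ambiguity described above can be checked against today's tokenizer. This is a minimal sketch, assuming Python 3 and the standard `tokenize` module; the helper name `token_summary` is made up for illustration and is not part of any proposal in this thread.

```python
import io
import tokenize


def token_summary(source):
    """Return (token type name, token string) pairs for a source string."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [(tokenize.tok_name[tok.type], tok.string) for tok in tokens]


# A line that starts with ".2" is already a complete float literal,
# and a line that starts with "|" already begins with an operator token.
print(token_summary(".2\n"))
print(token_summary("|2\n"))
```

Because both characters already carry meaning at the start of a line, giving them a second meaning as indentation would require the tokenizer to guess between the two readings.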

Milind Khadilkar writes:
Besides, spaces become difficult to count and distinguish from tabs, adding to the problem. (Why allow tabs at all?)
Working code should not be broken gratuitously. Sure, it was an unfortunate decision in the first place, but we can't change that original decision now.
Why can't Python be made to accept the following indentation of code (in *addition* to the current schemes)?
Because we *already have it*: the character is ' '. If *you* follow that rule, you won't have problems copy/pasting your own code into well-behaved code. If you're working with somebody else's code which mixes tabs and spaces, *using an alternative character doesn't help*, whether the "bad" code is source or target.
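
For what it is worth, Python 3 refuses code whose indentation depends on how wide a tab is, so inconsistent mixing of tabs and spaces tends to surface as an error rather than as silently wrong nesting. A small sketch, assuming Python 3; the function name `indentation_error` and the sample strings are illustrative only.

```python
def indentation_error(source):
    """Compile source; return the indentation error class name, or None."""
    try:
        compile(source, "<pasted>", "exec")
    except IndentationError as exc:  # TabError is a subclass of this
        return type(exc).__name__
    return None


# First body line is indented with a tab, the second with eight spaces.
mixed = "if True:\n\tx = 1\n        y = 2\n"
# The same block indented consistently with spaces.
spaces_only = "if True:\n    x = 1\n    y = 2\n"

print(indentation_error(mixed))
print(indentation_error(spaces_only))
```

A check like this can be run on pasted code before it is merged, regardless of which editor produced it.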

Thanks.

The problem is mostly when code from different people using different tools needs to be combined, or sample code from the internet (not from the well-behaved ones like github and stackoverflow) has to be tested and understood. Of course it is fun to discover the lost intentions of the original programmer, but it is not practical in a day job.

QUOTE
Because we *already have it*: the character is ' '.
UNQUOTE

The problem with ' ' is that a sequence of ' 's can't easily be counted, especially on feature rich editors. A sequence of '.' or some visible character can be counted. Of course, with the previous line being continued on the current line, '.' could present problems, as Andrew B. points out, and they can be messy.

Could the solution lie in providing a "transfer" or "import-export" mode in editors where the indentation is given in terms of a chosen special character which is converted to spaces/tabs?

Thanks again
MK-Zedobject

On Sat, Aug 30, 2014 at 11:02 AM, Stephen J. Turnbull <stephen@xemacs.org> wrote:
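
The "transfer" mode asked about above needs no language change; it can be sketched as a pair of text filters outside the interpreter. This is only an illustration under assumptions: the function names, the default marker '.', and the four-space width are all made up here, and the sample uses Python 3 `print()` calls rather than the Python 2 statements of the original post.

```python
def dots_to_spaces(text, marker=".", width=4):
    """Import direction: turn each leading marker into `width` spaces."""
    out = []
    for line in text.splitlines():
        body = line.lstrip(marker)
        depth = len(line) - len(body)  # number of leading markers
        out.append(" " * (width * depth) + body)
    return "\n".join(out)


def spaces_to_dots(text, marker=".", width=4):
    """Export direction: turn each `width` leading spaces into one marker."""
    out = []
    for line in text.splitlines():
        body = line.lstrip(" ")
        depth = (len(line) - len(body)) // width
        out.append(marker * depth + body)
    return "\n".join(out)


exported = (
    "for i in range(10):\n"
    ".if i > 5:\n"
    "..print(10 - i)\n"
    ".else:\n"
    "..print(i)\n"
    "print('!')"
)
print(dots_to_spaces(exported))
```

The filter inherits the ambiguity Andrew raised: a statement that legitimately begins with '.', such as a bare `.2`, would be misread as an indented `2`, so the marker would have to be a character that can never start a line of code.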

Milind Khadilkar <zedobject@gmail.com> writes:
The problem with ' ' is that a sequence of ' 's can't easily be counted, especially on feature rich editors.
You have a “feature rich” editor which is unable to count a sequence of characters? I don't think that qualifies as “feature rich” for purposes of programming.

Also, your proposal then only seems to be valid if you have an editor which:

* is capable of the highly specific task of “convert Python code between ‘ ’ for indentation and ‘.’ for indentation”; but

* is so feature-poor that it *is not* capable of the generally-useful task “count a sequence of characters”.

I think that's a text editor so bizarre we can discount it as irrelevant until someone is demonstrated to be using it seriously and can't be convinced to use a better editor.

--
“If you ever catch on fire, try to avoid seeing yourself in the mirror, because I bet that's what REALLY throws you into a panic.” —Jack Handey
Ben Finney

On Sat, Aug 30, 2014 at 11:49:41AM +0530, Milind Khadilkar wrote:
Are you suggesting that indentation is the *only* reason to test and understand arbitrary code you copy and paste from the Internet? I don't think that's a reasonable argument to make. Whether it is your day job, or just a hobby, you should have some understanding of what the code is supposed to do before pasting it into your own work. And there is so much more than just indentation that you have to be concerned about: the imports, the variables it uses, which functions it calls, whether the code can be trusted. Indentation is the least of these.

I accept that there are occasional situations where we might want to copy code from a website or email, and the indentation has been lost. But that ought to be rare. In my personal experience, I've needed to reconstruct the indentation from scratch perhaps three or four times in the last ten years. Even if I'm off by a factor of ten, let's call it three or four times a year. That's still not very important. Worrying that this is "not practical" in a day job seems to be worrying over nothing.
I don't think that's a reasonable argument. I trust you're not using Notepad, are you? In a bare-bones editor, it's hard to count *any* long sequence of characters. Quick, how many dots before the X?

..................X

If you are a professional programmer, or even just a serious amateur, you should be using professional tools. With professional tools, you rarely need to *count the spaces*. You just need to notice changes to the indent level:

one level
    another level
        different level

You don't need to care precisely how many spaces (or tabs) are there, you just need to ensure things line up. And any decent programming editor will give you the tools to increase or decrease indentation over a block of lines, without caring about the specific number of spaces or counting exactly how many spaces are needed. That includes a wide range of IDEs and editors of all sorts of power, on many different platforms: Notepad++ or geany on Windows, kate or kwrite on KDE, vim and emacs for pretty much any Unix or Linux system, to mention only a few.

Which brings us to another problem with your suggestion. Professional editors already know about indentation with tabs, and spaces, but I don't know any editor which gives you the ability to indent with arbitrary characters. (Although I daresay somebody would soon write a macro for Emacs to do that.) Which means you are swapping from a system where professional-quality programming tools can do the counting for you, to a situation where you actually do need to count the dots yourself. And that's a big step backwards.

I don't think this change is practical, or useful. And it's ugly.

--
Steven

On 08/30/2014 01:19 AM, Milind Khadilkar wrote:
Unfortunately there probably isn't much you can do about uses of tabs or spaces outside your immediate team. If they are all part of your team, probably the best approach is to agree to set your editors to highlight tabs in an obvious way, and also set them to use spaces in place of tabs. Over time you should see fewer uses of tabs and it won't be as much of an issue.

As for cutting and pasting lines, you will still probably need to reindent anyway. For that I use the indent/dedent capability of the editor if it has one, or a different editor that has it if it doesn't. It's one of my must-have requirements.

I don't think an explicit indent character would help much, but possibly a relative indent marker might be useful for putting multi-line code on a single line. Currently we have the ";", which only keeps the current indentation. It probably wouldn't be that hard to add ";+" and ";-" symbols (or something equivalent), but I think it would need its own PEP and discussion if anyone cares enough about it. Probably the biggest issue is "is it needed?", so it would need some convincing examples of how it would be useful.

-Ron
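
Where an editor lacks the indent/dedent capability mentioned above, the same operation can be approximated with the standard library. A minimal sketch, assuming Python 3.3 or later for `textwrap.indent`; the helper name `reindent` and its `levels`/`width` parameters are illustrative, not an existing API.

```python
import textwrap


def reindent(block, levels, width=4):
    """Strip the common leading whitespace, then indent by `levels` levels."""
    flat = textwrap.dedent(block)
    return textwrap.indent(flat, " " * (width * levels))


# A fragment pasted from a deeper nesting level than its destination.
pasted = "        if i > 5:\n            print(10 - i)\n"
print(reindent(pasted, 1))
```

The relative structure inside the block is preserved; only the shared left margin changes, which is the "make things line up" step that does not require counting individual spaces.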

On 8/30/2014 11:36 AM, Ron Adam wrote:
Unfortunately there probably isn't much you can do about uses of tabs or spaces outside your immediate team.
Except to convert to local standard.
Hmmm. Idle does not have an option to do that, but it would be useful. -- Terry Jan Reedy
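
Converting to a local standard can also be done outside the editor. The following is a rough sketch, assuming the local standard is "spaces only" and that only leading whitespace should be touched; the function name `leading_tabs_to_spaces` and the default tab size of 8 are assumptions made for illustration.

```python
def leading_tabs_to_spaces(text, tabsize=8):
    """Expand tabs in each line's leading whitespace, leaving the rest alone."""
    out = []
    for line in text.splitlines():
        body = line.lstrip(" \t")
        lead = line[: len(line) - len(body)]
        # str.expandtabs pads each tab out to the next multiple of tabsize.
        out.append(lead.expandtabs(tabsize) + body)
    return "\n".join(out)


source = "if x:\n\ty = 1\n"
print(leading_tabs_to_spaces(source))
```

Tabs that appear after the first non-blank character, for example inside string literals, are deliberately left untouched, since changing them could alter the program's data rather than its layout.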

participants (9)
- Andrew Barnert
- Ben Finney
- Mark Lawrence
- Milind Khadilkar
- Ron Adam
- Skip Montanaro
- Stephen J. Turnbull
- Steven D'Aprano
- Terry Reedy