Re: [Python-ideas] 80 character line width vs. something wider
On Tue, May 19, 2009 at 10:51 AM, Mike Meyer <mwm@mired.org> wrote:
On May 19, 2009, at 12:43, Aaron Rubin <aaron.rubin@4dtechnology.com> wrote:
I realize that this is a religious debate which has gone on for many centuries. I appeal to the scientific aspects, with a distinct avoidance of preference and emotion. Preference might be easily explained by "right brain" vs "left brain" preference, but either way, it is merely a preference and I want to stick to facts. Here is a list I have compiled of facts which support a wider than 80 character line width standard (in Python, specifically). Please add to them, subtract from them, or add to the other side of the debate, but please avoid the usage of the word "readable" (which implies preference), unless you are referring to a scientific study of readability, which would be great.
Most of your points don't stand up under investigation. Of course, negating them doesn't support an 80 character limit by itself.
1) Python is three things which the standard was not designed for: One: Object Oriented. Two: Not Hungarian notation. Three: Mandatorily uses *whitespace* as its definition for code blocks. Let me explain each one in a bit more detail: Object Oriented: Because it is not functional-style programming, but instead OO, you have to give definition as to what object type you are using before using it. This makes definitions and usage longer than in functional programming (when 80 character widths were invented). PhazeMonkey.Hardware.FrameSource.Abstract.Framegrabber is an example (and not an extreme one) of a class (55 characters already) in a rather large code base.
This type of reference is considered by some to be bad style. See the "Law of Demeter" for more information.
The Law of Demeter applies to objects referenced second-hand. The class name given is an example of a hierarchy of modules, not one class reaching through a second class to get at the class members it uses.
Not Hungarian: Not only is Python not Hungarian (in general), but the PEP-8 specifically tells us to use longer, more descriptive variable names. hasInstrumentControllerPhaseDither is an example. Many variables are 15-20 characters and oftentimes longer.
This appears to be false. A quick check of the standard library finds between 1 and 2 percent of variable references to have fewer than 15 characters, rising to 8 percent of unique names. This hardly qualified as many.
do you mean greater than 15 characters? If not, then I don't see your point. At any rate, 8 percent of unique names seems statistically relevant. 8 percent of how many? If the number is 100,000, then I would say that 8,000 variable names qualifies as "many".
Whitespace: Python is very unique in that it *uses* whitespace for code blocking. It turns out to be very useful, since it visually cues the reader where code blocks begin and end by mandate. This creates many situations where code *starts* at the 10th indentation (40 characters in our standard, 80 characters in some Python standards).
This also appears to be false - the standard library has fewer than 200 lines out of over 80,000 that start that deep. "Rarely" would seem to be more accurate than "many".
200 lines might qualify as many. Regardless, there are quite a number of lines which do. Let's not argue over the meaning of "many". As long as "some" exist, the point remains.
Even in normal "great design" mode (which takes more time again), you can't help it....your code starts at the 6th indentation level often. (28 characters, more than 30% of 80 characters already gone. Now how many variables or class names can you fit?) Whitespace (2): Because Python uses whitespace as its sole method of code blocking and because this is the visual cue for code blocks, wrapping lines obfuscates this and makes the reader think about whether this whitespace was there for a code block, or a line-wrap. Thinking about intention of code slows us down.
2) Many of the libraries that are widely used do not adhere to the 80 character width line standard. wxPython, NumPy and Boa Constructor are a few, but I'm sure there are many, many more.
Which just goes to show that you don't have to be constrained by the PEP if you don't want to. Not that doing so is a good idea.
3) Writing new code in 80 character line widths takes more time. If I have to worry about hitting this width, I have to de-concentrate my efforts of writing logical code and concentrate instead on how exactly to format this line of code (where to break it, etc....there are a number of rules attached to wrapping lines of code). Then I have to re-concentrate on the actual task at hand. Alternatively, I can code it up without worrying, then when convenient, take some time to reformat to 80 character width. Either way, more time.
On the other side, the Oulipo school of writing believes that writing with apparently arbitrary constraints improves the results. I find that if I'm running into the 80 character limit with any frequency, it's because my code is poorly structured and in need of a reworking.
That can definitely be a symptom of bad code. Doesn't mean it's the only reason for it, however.
4) Code searching. IDEs have powerful searching features. They list all the lines of a file (or files) which match the string you are searching for. If things are in one line, this search is meaningful and you can read it like you can code. If a line of code actually spans two (or more) lines of code, the search is no longer contextually useful and you have to click on each item to see what's actually going on. This feature is used heavily in many environments (especially large code bases) to save time, but time is either lost finding the actual context of a found string, or the search tool is avoided altogether because it does not provide meaningful results (i.e. a predictive waste of time)
In that case, you need a better IDE; it should either show more context, allow multiline searches, or both. Both of these things will help even if you never wrap your lines.
5) Monitors are getting bigger, wider, cheaper. This allows us to have two files, side-by-side on a screen that are *both* 120 character width (or even wider), without changing font sizes.
Printers aren't.
6) Tools are cheap. Time isn't. Get a second monitor, get a more powerful editor, do whatever it takes to save time. If viewing more information at one time is important, then we should try to make that possible with technology, not time.
This point seems to be at best neutral to where or if you wrap lines. A good IDE will wrap the display and indicate it did so.
7) Python is designed to be written more like English than other programming languages. English is written horizontally, not vertically. In furtherance of an attempt to make "readability" an objective argument, here is a scientific study which finds that greater character width lines improve readability: <http://psychology.wichita.edu/surl/usabilitynews/72/LineLength.asp>. To summarize, the study found that of the choices of 35, 55, 75 and 95 character lengths, 95 was able to be read the fastest.
Please note that the study *started* by pointing out that other studies existed which found the best line length to be anywhere from 35 to 85 characters, and gave no reason for trusting their results rather than the earlier studies. I would claim that all studies that looked at written languages - as opposed to programs - were inapplicable. They almost certainly used variable width fonts and a left justification (if not both), either of which has more effect on readability than line length.
Thanks for the time spent reading this long-ish post. Thanks for your feedback if you provide it.
You really need better justification - studies of program comprehension, not English reading speed, code bases that support your claims about written code, etc.
I would love to see these studies done, but at this time I cannot find them. The closest I could come (and I disclaimed that it was only a loose connection) was the study I referenced.
<mike
On Tue, 19 May 2009 15:36:35 -0700 Aaron Rubin <aaron.rubin@4dtechnology.com> wrote:
On Tue, May 19, 2009 at 10:51 AM, Mike Meyer <mwm@mired.org> wrote:
On May 19, 2009, at 12:43, Aaron Rubin <aaron.rubin@4dtechnology.com> wrote:
I realize that this is a religious debate which has gone on for many centuries. I appeal to the scientific aspects, with a distinct avoidance of preference and emotion. Preference might be easily explained by "right brain" vs "left brain" preference, but either way, it is merely a preference and I want to stick to facts. Here is a list I have compiled of facts which support a wider than 80 character line width standard (in Python, specifically). Please add to them, subtract from them, or add to the other side of the debate, but please avoid the usage of the word "readable" (which implies preference), unless you are referring to a scientific study of readability, which would be great.
Most of your points don't stand up under investigation. Of course, negating them doesn't support an 80 character limit by itself.
1) Python is three things which the standard was not designed for: One: Object Oriented. Two: Not Hungarian notation. Three: Mandatorily uses *whitespace* as its definition for code blocks. Let me explain each one in a bit more detail: Object Oriented: Because it is not functional-style programming, but instead OO, you have to give definition as to what object type you are using before using it. This makes definitions and usage longer than in functional programming (when 80 character widths were invented). PhazeMonkey.Hardware.FrameSource.Abstract.Framegrabber is an example (and not an extreme one) of a class (55 characters already) in a rather large code base.
This type of reference is considered by some to be bad style. See the "Law of Demeter" for more information.
The Law of Demeter applies to objects referenced second-hand. The class name given is an example of a hierarchy of modules, not one class reaching through a second class to get at the class members it uses.
In Python, modules *are* objects, so this is still a case of one object reaching through a second object to get at members of that object. More to the point, Python allows you to use "from <module> import <names>", "import <name> as ..." and even "from <module> import <name> as ...", all of which provide much saner ways of dealing with deep trees of modules.
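A minimal sketch of the three import forms mentioned above. The PhazeMonkey hierarchy from the thread is not available, so standard-library modules stand in for it, and the local names (`ElementTree`, `ET`, `path_join`) are only illustrative choices:

```python
# Three ways to shorten a deep dotted path, shown with stdlib modules.
# The stand-in modules and local names are assumptions for illustration.

# 1. from <module> import <names>
from xml.etree import ElementTree

# 2. import <name> as ...
import xml.etree.ElementTree as ET

# 3. from <module> import <name> as ...
from os.path import join as path_join

# Every spelling is bound to the same object as the fully dotted form.
same_module = ElementTree is ET
root = ET.fromstring("<frames><frame id='1'/></frames>")
config_path = path_join("hardware", "framegrabber.cfg")
print(same_module, root.tag, config_path)
```

Each short name is just another binding for the same module or function, so the saving is purely in line width at the point of use.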
Not Hungarian: Not only is Python not Hungarian (in general), but the PEP-8 specifically tells us to use longer, more descriptive variable names. hasInstrumentControllerPhaseDither is an example. Many variables are 15-20 characters and oftentimes longer.
This appears to be false. A quick check of the standard library finds between 1 and 2 percent of variable references to have fewer than 15 characters, rising to 8 percent of unique names. This hardly qualified as many.
do you mean greater than 15 characters? If not, then I don't see your point. At any rate, 8 percent of unique names seems statistically relevant. 8 percent of how many? If the number is 100,000, then I would say that 8,000 variable names qualifies as "many".
Yes, I meant greater. And if you make your program small enough, 8,000 will be many. Then again, nearly any number qualifies as "many" if you're innumerate. The point is that if you want to use existing code to provide a reason for changing things, you should be making measurements of an existing code base rather than making vague claims about how common such things are. Since PEP 8 is really only enforced for the standard library, that's a good place to start.
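One way such a measurement could be made, as a rough sketch rather than the method used in the thread (those figures came from grep and wc). It counts every non-keyword name token, so attribute and function names are included; the helper name `name_length_stats` and the sample source are hypothetical:

```python
import io
import keyword
import tokenize


def name_length_stats(source, threshold=15):
    """Return the share of name references, and of unique names,
    that are longer than `threshold` characters."""
    names = [
        tok.string
        for tok in tokenize.generate_tokens(io.StringIO(source).readline)
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string)
    ]
    if not names:
        return 0.0, 0.0
    unique = set(names)
    long_refs = sum(len(name) > threshold for name in names)
    long_unique = sum(len(name) > threshold for name in unique)
    return long_refs / len(names), long_unique / len(unique)


# Tiny hypothetical sample; a real check would read files from a code base.
sample = (
    "hasInstrumentControllerPhaseDither = True\n"
    "x = 1\n"
    "if hasInstrumentControllerPhaseDither:\n"
    "    x = x + 1\n"
)
refs_share, unique_share = name_length_stats(sample)
print(refs_share, unique_share)
```

Reporting both numbers matters because, as the exchange above shows, the share of references and the share of unique names can differ considerably.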
Whitespace: Python is very unique in that it *uses* whitespace for code blocking. It turns out to be very useful, since it visually cues the reader where code blocks begin and end by mandate. This creates many situations where code *starts* at the 10th indentation (40 characters in our standard, 80 characters in some Python standards).
This also appears to be false - the standard library has fewer than 200 lines out of over 80,000 that start that deep. "Rarely" would seem to be more accurate than "many".
200 lines might qualify as many. Regardless, there are quite a number of lines which do. Let's not argue over the meaning of "many". As long as "some" exist, the point remains.
Yes, but the quantity matters. Changing things will result in pain for some users - that's part of why they don't change. If you want to use a problem with the way things are that causes pain to justify making a change, you need to show that it occurs frequently enough that the pain it's causing outweighs the pain that would be caused by the alternative and adopting the change.
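A back-of-the-envelope sketch of how the frequency could be quantified for a given source text. The function name `line_stats` and its defaults are assumptions: 4 spaces per level and a 10th-level cutoff follow the figures quoted in the thread. It counts physical lines, so continuation lines and lines inside multi-line strings are counted too:

```python
def line_stats(source, indent_width=4, deep_level=10, max_width=80):
    """Return (non-blank lines, lines starting at or beyond deep_level,
    lines wider than max_width) for a source string."""
    total = deep = wide = 0
    for line in source.splitlines():
        expanded = line.expandtabs(8)
        stripped = expanded.lstrip(" ")
        if not stripped:
            continue  # blank lines are not counted
        total += 1
        leading = len(expanded) - len(stripped)
        if leading // indent_width >= deep_level:
            deep += 1
        if len(expanded) > max_width:
            wide += 1
    return total, deep, wide


# Hypothetical sample: one shallow line, one line starting at column 40,
# one line of 96 characters, and one blank line.
sample = (
    "a = 1\n"
    + " " * 40 + "b = 2\n"
    + "c = '" + "x" * 90 + "'\n"
    + "\n"
)
stats = line_stats(sample)
print(stats)
```

To apply it to the standard library, one could read each `.py` file under `sysconfig.get_paths()["stdlib"]` and sum the three counts; the resulting ratios are what the argument above asks for.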
3) Writing new code in 80 character line widths takes more time. If I have to worry about hitting this width, I have to de-concentrate my efforts of writing logical code and concentrate instead on how exactly to format this line of code (where to break it, etc....there are a number of rules attached to wrapping lines of code). Then I have to re-concentrate on the actual task at hand. Alternatively, I can code it up without worrying, then when convenient, take some time to reformat to 80 character width. Either way, more time.
On the other side, the Oulipo school of writing believes that writing with apparently arbitrary constraints improves the results. I find that if I'm running into the 80 character limit with any frequency, it's because my code is poorly structured and in need of a reworking.
That can definitely be a symptom of bad code. Doesn't mean it's the only reason for it, however.
The number of reasons is irrelevant: you claim that writing to an 80-character limit slows you down, others claim that it causes them to write better code, so this matter also calls for more investigation. Are there studies that deal with either otherwise apocryphal claim? In particular, how many people are affected by each.
You really need better justification - studies of program comprehension, not English reading speed, code bases that support your claims about written code, etc.
I would love to see these studies done, but at this time I cannot find them.
That's sort of the point. You're making claims with few or no quantitative values to attach to them. I.e. "many variables use long names, and you can't get a lot of those on a line." A few minutes with grep & wc on the standard library suggest to me that this isn't the case. My "study" was admittedly unscientific and inaccurate - back of the envelope type stuff, but it's still more rigorous than what you started with. If you want to put your arguments on a scientific basis, you'll do some analysis of a real code base to provide quantities to back things up.
One final addition - while it's true that desktop monitors are getting wider and cheaper, it's also true that we're moving to a world where people regularly work on things that aren't desktops. Laptops are getting cheaper and *smaller* as well as larger, and there are good reasons for not wanting a 17" - or even a 15" - laptop, which means you're back to screens the size of the old glass TTYs (which I fondly remember as light blue on dark blue, at least in the last iteration). Netbooks take us even smaller - and people have been putting development systems on PDAs for over a decade. So while developers now may have access to systems with multiple windows that are 120 or more characters wide, they also now use devices that have to stretch to their limits to display 80 columns across.
Thanks,
<mike
--
Mike Meyer <mwm@mired.org> http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.
O< ascii ribbon campaign - stop html mail - www.asciiribbon.org
Mike Meyer wrote:
One final addition - while it's true that desktop monitors are getting wider and cheaper, it's also true that we're moving to a world where people regularly work on things that aren't desktops. Laptops are getting cheaper and *smaller* as well as larger, and there are good reasons for not wanting a 17" - or even a 15" - laptop, which means you're back to screens the size of the old glass TTYs (which I fondly remember as light blue on dark blue, at least in the last iteration). Netbooks take us even smaller - and people have been putting development systems on PDAs for over a decade. So while developers now may have access to systems with multiple windows that are 120 or more characters wide, they also now use devices that have to stretch to their limits to display 80 columns across.
With screens in glasses and built-in mini-projectors you'll be able to have wide displays again even on PDAs. :-)
On the other side, the Oulipo school of writing believes that writing with apparently arbitrary constraints improves the results.
???
How can you assert such a non-pertinent (and wrong) statement (in regard to the point being discussed)? Oulipo games are about helping *creativity*. A set of strict constraints helps the poet be creative -- or rather they let whatever unconsciously creates "popping" good matter into conscious poetic minds.
Denis
spir <denis.spir@free.fr> wrote:
On the other side, the Oulipo school of writing believes that writing with apparently arbitrary constraints improves the results.
???
How can you assert such a non-pertinent (and wrong) statement (in regard to the point being discussed)? Oulipo games are about helping *creativity*. A set of strict constraints helps the poet be creative -- or rather they let whatever unconsciously creates "popping" good matter into conscious poetic minds.
Do you think that writing code is an act that does not involve creativity? Writing with constraints to improve creativity applies to any kind of writing, not just poetry. I heard about the idea in my creative writing class; I've never heard of Oulipo before. It also has already been invoked in this thread with respect to writing programs: constrained to 80 characters, you are encouraged to refactor the code, thereby producing superior code. --David
spir wrote:
On the other side, the Oulipo school of writing believes that writing with apparently arbitrary constraints improves the results.
Oulipo games are about helping *creativity*.
The best explanation I've seen for this phenomenon is that it works by forcing you to avoid cliches. For example, if you have to fit your words into a fixed meter, you can't just use the first phrasing that comes into your head. You have to hunt around for alternative words that fit the pattern, and in the process you most likely come up with something original and surprising. I don't think this applies in the same way when you're writing a program. The goal there is *not* to be original and surprising -- if anything it's the opposite! You want to convey the meaning of the code to the reader as clearly as possible, and if it uses an idiom that the reader has seen before and can instantly recognise, then so much the better. -- Greg
Greg Ewing wrote:
spir wrote:
On the other side, the Oulipo school of writing believes that writing with apparently arbitrary constraints improves the results.
Oulipo games are about helping *creativity*.
I don't think this applies in the same way when you're writing a program. The goal there is *not* to be original and surprising -- if anything it's the opposite! You want to convey the meaning of the code to the reader as clearly as possible, and if it uses an idiom that the reader has seen before and can instantly recognise, then so much the better.
And therein, you have defeated your own argument. Given the large body of code that already follows PEP8 (and other style guides for other languages that commonly use an 80-character boundary), it is a common constraint which yields many common idioms (such as placing list items on separate lines with similar indention).
The readers wished hard
For the thread to die quickly
Sadly it goes on
--
Scott Dial
scott@scottdial.com
scodial@cs.indiana.edu
Scott Dial wrote:
Given the large body of code that already follows PEP8 (and other style guides for other languages that commonly use an 80-character boundary), it is a common constraint which yields many common idioms (such as placing list items on separate lines with similar indention).
I still don't think it's the same thing. Limiting lines to 80 chars or thereabouts is not an *arbitrary* constraint. There are intrinsic merits to it, e.g. people find very long lines hard to read, you can fit multiple windows side by side, etc. On the other hand, there's nothing inherently virtuous about writing in iambic pentameter or avoiding the use of the letter "e". The only merit of such constraints is that they push you *away* from a very small area of badness (i.e. cliches) and out into a much bigger area of non-badness (any non-cliched way of saying the same thing). I don't think you can do the same thing with programming. You can't get a good program just by avoiding bad things, you have to actively aim for the good things. -- Greg
Greg Ewing writes:
I don't think you can do the same thing with programming. You can't get a good program just by avoiding bad things, you have to actively aim for the good things.
You think you can get a good poem simply by avoiding bad things? Surely not! There are still an infinite number of ways to write a bad haiku, despite the extreme style constraint, and great haiku writers remain rare. I think the analogy of programming to poetry is quite strong. And in a discussion of Python, there should be no question that an arbitrary line-length limitation aims at the good things: "flat is better than nested". The question here is, how accurate is that aim? Does it too often hit our feet instead?
From the examples presented here (both actual and contrived!) by the *opponents* of line-length limitation, it seems to me that being strict about an 80-character limit is not "a foolish consistency", but rather a quite cheap and accurate way to identify flat-is-better-than-nested violations. The time spent *in implementing Python and the stdlib* on avoiding and fixing those is time very well spent, IMO. Other projects *should* make their own judgments.
There are two specific usages that I think may deserve exceptions, because they don't involve flat-is-better-than-nested violations. The first is line-terminating comments. These are *good* style IMHO in cases where a particular statement requires glossing. I don't see how to reformat those to meet line-length limitations except by moving them above the glossed statement, which is ambiguous (how many statements are in the "scope" of the comment?) and, worse, interrupts the flow of control when reading the code. The second is strings for presentation in a line-oriented terminal context, such as error messages and docstrings. I like these to be about 65-70 characters wide, but there are some cases (e.g., first lines of docstrings) where it is regularly desirable to exceed that width. Here in practice I use Ben Finney's "indent one extra level" and Jim Jewett's "escape newline and start string at left margin" techniques, but I'm not 100% satisfied with them.
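For readers who have not seen the two string-wrapping techniques named above, here is one possible reading of them as a sketch. The original posts by Ben Finney and Jim Jewett are not quoted in this thread, so the details are an assumption, and `check_width` and `usage` are hypothetical helpers:

```python
def check_width(width, limit=80):
    """Return width, or raise ValueError if it exceeds limit."""
    if width > limit:
        # "Indent one extra level": the continuation lines sit one level
        # deeper than a nested block would, so they do not read as a block.
        raise ValueError(
                "line is %d characters wide, which is over the "
                "configured limit of %d" % (width, limit))
    return width


def usage():
    # "Escape newline and start string at left margin": the backslash
    # right after the opening quote swallows the newline, so the text
    # restarts at column 0 and the string contains no leading newline.
    return "\
usage: check_width WIDTH [LIMIT]"


print(usage())
print(check_width(72))
```

The first form keeps the message inside the indentation of the surrounding code at the cost of splitting it; the second gives the string the full line width but breaks the visual indentation of the block, which may be why neither is fully satisfying.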
participants (9)
- Aaron Rubin
- Greg Ewing
- Mike Meyer
- MRAB
- R. David Murray
- Raymond Hettinger
- Scott Dial
- spir
- Stephen J. Turnbull