
On Tue, 19 May 2009 15:36:35 -0700 Aaron Rubin <aaron.rubin@4dtechnology.com> wrote:
On Tue, May 19, 2009 at 10:51 AM, Mike Meyer <mwm@mired.org> wrote:
On May 19, 2009, at 12:43, Aaron Rubin <aaron.rubin@4dtechnology.com> wrote:
I realize that this is a religious debate which has gone on for many centuries. I appeal to the scientific aspects, with a distinct avoidance of preference and emotion. Preference might be easily explained by "right brain" vs "left brain" preference, but either way, it is merely a preference and I want to stick to facts. Here is a list I have compiled of facts which support a wider than 80 character line width standard (in Python, specifically). Please add to them, subtract from them, or add to the other side of the debate, but please avoid the usage of the word "readable" (which implies preference), unless you are referring to a scientific study of readability, which would be great.
Most of your points don't stand up under investigation. Of course, negating them doesn't support an 80 character limit by itself.
1) Python is three things which the standard was not designed for: One: Object Oriented. Two: Not Hungarian notation. Three: Mandatorily uses *whitespace* as its definition for code blocks. Let me explain each one in a bit more detail: Object Oriented: Because it is not functional-style programming, but instead OO, you have to give definition as to what object type you are using before using it. This makes definitions and usage longer than in functional programming (when 80 character widths were invented). PhazeMonkey.Hardware.FrameSource.Abstract.Framegrabber is an example (and not an extreme one) of a class (55 characters already) in a rather large code base.
This type of reference is considered by some to be bad style. See the "Law of Demeter" for more information.
The Law of Demeter applies to objects referenced second-hand. The class name given is an example of a hierarchy of modules, not one class reaching through a second class to get at the class members it uses.
In Python, modules *are* objects, so this is still a case of one object reaching through a second object to get at members of that object. More to the point, Python allows you to use "from <module> import <names>", "import <name> as ..." and even "from <module> import <name> as ...", all of which provide much saner ways of dealing with deep trees of modules.
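As a rough illustration of those three import forms, here is a minimal sketch. The PhazeMonkey package from the example is not available, so the standard-library modules `os.path` and `xml.etree.ElementTree` stand in for a deep project hierarchy; the variable names are made up for the illustration.

```python
# Three ways to shorten a deep module path, shown on standard-library
# modules that stand in for a deep project hierarchy (an assumption;
# the package named in the thread is not importable here).

# 1. "from <module> import <names>": bind only the names you need.
from os.path import join, basename

# 2. "import <name> as ...": give a long dotted path a short alias.
import xml.etree.ElementTree as ET

# 3. "from <module> import <name> as ...": import one name under an alias.
from xml.etree.ElementTree import fromstring as parse_xml

# Each call site now uses a short name instead of the full dotted path.
root = parse_xml("<frame><source>grabber</source></frame>")
source_text = root.find("source").text

path = join("hardware", "framesource")
leaf = basename(path)

element = ET.Element("framegrabber")
```

In each case the full dotted path appears once, at the import, rather than on every line that uses the name.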
Not Hungarian: Not only is Python not Hungarian (in general), but the PEP-8 specifically tells us to use longer, more descriptive variable names. hasInstrumentControllerPhaseDither is an example. Many variables are 15-20 characters and oftentimes longer.
This appears to be false. A quick check of the standard library finds between 1 and 2 percent of variable references to have fewer than 15 characters, rising to 8 percent of unique names. This hardly qualifies as many.
Do you mean greater than 15 characters? If not, then I don't see your point. At any rate, 8 percent of unique names seems statistically relevant. 8 percent of how many? If the number is 100,000, then I would say that 8,000 variable names qualifies as "many".
Yes, I meant greater. And if you make your program small enough, 8,000 will be many. Then again, nearly any number qualifies as "many" if you're innumerate. The point is that if you want to use existing code to provide a reason for changing things, you should be making measurements of an existing code base rather than making vague claims about how common such things are. Since PEP 8 is really only enforced for the standard library, that's a good place to start.
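One way to make such a measurement, sketched here under some assumptions: the function name `name_length_stats` and the 15-character threshold are illustrative, and counting every non-keyword NAME token (which includes attribute, function, and builtin names) is only a rough stand-in for "variable references", in the same back-of-the-envelope spirit as grep and wc.

```python
import io
import keyword
import tokenize


def name_length_stats(source, threshold=15):
    """Return two shares of names longer than `threshold` characters:
    (share of all name references, share of unique names)."""
    names = [
        tok.string
        for tok in tokenize.generate_tokens(io.StringIO(source).readline)
        # NAME tokens cover identifiers and keywords; drop the keywords.
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string)
    ]
    if not names:
        return 0.0, 0.0
    unique = set(names)
    long_refs = sum(1 for name in names if len(name) > threshold)
    long_unique = sum(1 for name in unique if len(name) > threshold)
    return long_refs / len(names), long_unique / len(unique)


# A tiny hypothetical sample; a real measurement would feed in each
# source file of the code base under study.
sample = (
    "hasInstrumentControllerPhaseDither = True\n"
    "x = 1\n"
    "if hasInstrumentControllerPhaseDither:\n"
    "    x = x + 1\n"
)
ref_share, unique_share = name_length_stats(sample)
```

Reporting both shares matters because a long name used once and a long name used on every line weigh very differently on line length.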
Whitespace: Python is very unique in that it *uses* whitespace for code blocking. It turns out to be very useful, since it visually cues the reader where code blocks begin and end by mandate. This creates many situations where code *starts* at the 10th indentation (40 characters in our standard, 80 characters in some Python standards).
This also appears to be false - the standard library has fewer than 200 lines out of over 80,000 that start that deep. "Rarely" would seem to be more accurate than "many".
200 lines might qualify as many. Regardless, there are quite a number of lines which do. Let's not argue over the meaning of "many". As long as "some" exist, the point remains.
Yes, but the quantity matters. Changing things will result in pain for some users - that's part of why they don't change. If you want to use a problem with the way things are that causes pain to justify making a change, you need to show that it occurs frequently enough that the pain it's causing outweighs the pain that would be caused by the alternative and adopting the change.
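The frequency of deeply indented lines can be estimated in the same rough way. This is a sketch, not the method used for the figures quoted above: the function name `deep_line_counts` is hypothetical, 40 columns corresponds to ten levels of 4-space indentation, and the count is over physical lines, so continuation lines and docstring lines are included.

```python
def deep_line_counts(source, min_indent=40, tab_width=8):
    """Return (deep, total): how many non-blank lines start at or beyond
    `min_indent` columns of leading whitespace, and how many non-blank
    lines there are in all."""
    deep = 0
    total = 0
    for line in source.splitlines():
        expanded = line.expandtabs(tab_width)  # treat tabs as columns
        if not expanded.strip():
            continue  # skip blank lines
        total += 1
        indent = len(expanded) - len(expanded.lstrip(" "))
        if indent >= min_indent:
            deep += 1
    return deep, total


# Hypothetical sample: one line at 40 columns, one at 36, one at 0.
sample = "a = 1\n" + " " * 40 + "b = 2\n" + "\n" + " " * 36 + "c = 3\n"
deep, total = deep_line_counts(sample)
```

Summing the pairs over every `.py` file in a code base (for example, files found with `pathlib.Path.rglob("*.py")`) would give the kind of ratio that the argument about "many" versus "rarely" depends on.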
3) Writing new code in 80 character line widths takes more time. If I have to worry about hitting this width, I have to de-concentrate my efforts of writing logical code and concentrate instead on how exactly to format this line of code (where to break it, etc....there are a number of rules attached to wrapping lines of code). Then I have to re-concentrate on the actual task at hand. Alternatively, I can code it up without worrying, then when convenient, take some time to reformat to 80 character width. Either way, more time.
On the other side, the Oulipo school of writing believes that writing with apparently arbitrary constraints improves the results. I find that if I'm running into the 80 character limit with any frequency, it's because my code is poorly structured and in need of a reworking.
That can definitely be a symptom of bad code. Doesn't mean it's the only reason for it, however.
The number of reasons is irrelevant: you claim that writing to an 80-character limit slows you down, others claim that it causes them to write better code, so this matter also calls for more investigation. Are there studies that deal with either otherwise apocryphal claim? In particular, how many people are affected by each?
You really need better justification - studies of program comprehension, not English reading speed, code bases that support your claims about written code, etc.
I would love to see these studies done, but at this time I cannot find them.
That's sort of the point. You're making claims with few or no quantitative values to attach to them. I.e. "many variables use long names, and you can't get a lot of those on a line." A few minutes with grep & wc on the standard library suggest to me that this isn't the case. My "study" was admittedly unscientific and inaccurate - back of the envelope type stuff, but it's still more rigorous than what you started with. If you want to put your arguments on a scientific basis, you'll do some analysis of a real code base to provide quantities to back things up.
One final addition - while it's true that desktop monitors are getting wider and cheaper, it's also true that we're moving to a world where people regularly work on things that aren't desktops. Laptops are getting cheaper and *smaller* as well as larger, and there are good reasons for not wanting a 17" - or even a 15" - laptop, which means you're back to screens the size of the old glass TTYs (which I fondly remember as light blue on dark blue, at least in the last iteration). Netbooks take us even smaller - and people have been putting development systems on PDAs for over a decade. So while developers now may have access to systems with multiple windows that are 120 or more characters wide, they also now use devices that have to stretch to their limits to display 80 columns across.
Thanks,
<mike
--
Mike Meyer <mwm@mired.org> http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.
O< ascii ribbon campaign - stop html mail - www.asciiribbon.org