[Tutor] Long Lines techniques

Thu Dec 13 23:07:59 EST 2018

<[{SYNOPSIS: Many good answers. I am satisfied and we can move on.}]>

Steven,

I appreciate the many useful suggestions.

Many of them are what I already do. Some are in tension with other
considerations. Yes, it can be shorter and more efficient to not keep saying
module.this.that.something versus something like:

from module.this.that import something as myname

Of course, you do that with care, since pulling too many names into one
namespace invites collisions. Longer, more descriptive names are encouraged.
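
As a small illustration of the aliasing idea, here is a sketch using the
standard library's os.path. The alias name path_join is made up for the
example and is not from the discussion.

```python
import os.path

# Fully dotted form: the module path is spelled out at every call site.
dotted = os.path.join("project", "data", "input.txt")

# Aliased form: import the one name needed under a short, descriptive alias.
# `path_join` is a hypothetical alias chosen for this sketch.
from os.path import join as path_join

aliased = path_join("project", "data", "input.txt")
```

Both calls reach the same function; only the spelling at the call site
changes.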

Based on reading quite a bit of code lately, I do see how common it is to
try to shorten names while not polluting the namespace as in the nearly
universal:

import numpy as np, pandas as pd

The places I like to wrap lines tend to be, in reality, the places python
tolerates it. If you use a function that lets you set many options, it is
nice to see the options one per line. Since the entire argument list is in
parentheses, that works. Ditto for creating lists, sets and dictionaries
with MANY items at once. 
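
A minimal sketch of the one-option-per-line layout, using json.dumps from
the standard library as a stand-in for any function with many options. The
settings dictionary is invented for the example.

```python
import json

# Hypothetical data; a dict literal can also span lines inside its braces.
settings = {
    "name": "demo",
    "tags": ["a", "b"],
    "depth": 2,
}

# One keyword argument per line; the parentheses hold the call together.
text = json.dumps(
    settings,
    indent=2,
    sort_keys=True,
    ensure_ascii=False,
)
```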

There are cases where it may make sense to have a long line connected by AND
or OR, given how python does short-circuiting while returning the last operand
it evaluated instead of an actual True/False. For example, you may want
to take the first available queue that is not empty with something like
this:

Using = A or B or C or ... or Z
Handling = Using.pop()

Sure, that could be rewritten into multiple lines. 
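
For what it is worth, a runnable sketch of that idea might look like the
following. The three lists are hypothetical stand-ins for the queues A
through Z, and the chain is wrapped in parentheses so it can span several
lines.

```python
# Hypothetical queues; an empty list is falsy, a non-empty one is truthy.
queue_a = []
queue_b = ["job-1", "job-2"]
queue_c = ["job-3"]

# `or` returns the first truthy operand itself, not True or False.
using = (
    queue_a
    or queue_b
    or queue_c
)

# pop() removes and returns the last item of the chosen queue.
handling = using.pop()
```

One caveat: if every queue is empty, the chain evaluates to the last (empty)
operand and pop() raises IndexError, so real code would need to guard
against that case.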

I won't get sucked into a PERL discussion except to say that some people
love to write something so obscure they won't recognize it even a day later.
PERL makes that very easy. I have done that myself a few times as I was an
early user. Python may claim to be straightforward but I can easily see ways
to fool people in python too with dunder methods or function closures or
decorators or ...

All in all, I think my question has been answered. I will add one more
concept.

I recently wrote some code and ran into error messages on lines I was trying
to keep short:

A = "text"
A += "more text"
A += object
A += ...

At one point, I decided to use a formatted string instead:

A = f"...{...}...{...}..."

Between curly braces I could insert variables holding various strings. As
long as those names were not long, and with some overhead, the line of code
was of reasonable size even if it expanded to much more.
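
A small sketch of the contrast follows. The exact error is not stated above;
the assumption here is that it was the TypeError raised when += is used to
append a non-string to a string. The names count and name are invented for
the example.

```python
count = 3          # hypothetical non-string value
name = "widget"    # hypothetical string value

# Piecewise building fails when one of the pieces is not a str.
a = "Found "
try:
    a += count
except TypeError as exc:
    error = str(exc)

# An f-string formats each value as it goes, all on one short line.
a = f"Found {count} items named {name}"
```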

-----Original Message-----
From: Tutor <tutor-bounces+avigross=verizon.net at python.org> On Behalf Of
Steven D'Aprano
Sent: Thursday, December 13, 2018 7:27 PM
To: tutor at python.org
Subject: Re: [Tutor] Long Lines techniques

On Thu, Dec 13, 2018 at 12:36:27PM -0500, Avi Gross wrote:

> Simple question:
> 
> When lines get long, what points does splitting them make sense and 
> what methods are preferred?

Good question!

First, some background:

Long lines are a potential code smell: a possible sign of excessively terse
code. A long line may be a sign that you're doing too much in one line.

https://martinfowler.com/bliki/CodeSmell.html
http://wiki.c2.com/?CodeSmell
https://blog.codinghorror.com/code-smells/

Related: 
https://www.joelonsoftware.com/2005/05/11/making-wrong-code-look-wrong/

Note that merely splitting a logical line over two or more physical lines
may still be a code-smell. Sure, your eyes don't get as tired reading
fifteen lines of 50 characters each, compared to a single 750 character
line, but there's just as much processing going on in what is essentially a
single operation.

Long lines are harder to read: your eyes have to scan across a long line,
and beyond 60 or 70 characters, it becomes physically more difficult to scan
across the line, and the error rate increases. 
[Citation required.]

But short lines don't include enough information, so the traditional
compromise is 80 characters, the character width of the old-school
green-screen terminals. The Python standard library uses 79 characters. 
(The odd number is to allow for scripts which count the newline at the end
of the line as one of the 80.)

https://www.python.org/dev/peps/pep-0008/

Okay, so we have a style-guide that sets a maximum line length, whether it
is 72 or 79 or 90 or 100 characters. What do you do when a line exceeds that
length?

The only firm rule is that you must treat each case on its own merits. 
There is no one right or wrong answer. Every long line of code is different,
and the solution will depend on the line itself. There is no getting away
from human judgement.

(1) Long names. Do you really need to call the variable
"number_of_characters" when "numchars" or even "n" would do?

The same applies to long function names: "get_data_from_database" is
probably redundant, "get_data" will probably do.

Especially watch out for long dotted names that you use over and over again.
Unlike static languages like Java, each dot represents a runtime lookup.
Long names like:

    package.subpackage.module.object.method

require four lookups. Look for opportunities to make an alias for a long
name and avoid long chains of dots:

    for item in sequence:
        do_something_with(package.subpackage.module.object.method(arg, item))

can be refactored to:

    method = package.subpackage.module.object.method
    for item in sequence:
        do_something_with(method(arg, item))

and is both easier to read and more efficient. A double win!
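
To make the refactoring concrete, here is a self-contained sketch. The nested
namespace built from types.SimpleNamespace is only a hypothetical stand-in
for package.subpackage.module.object, and the method is a made-up
multiplication.

```python
from types import SimpleNamespace

# Hypothetical nested namespace imitating a long dotted path.
package = SimpleNamespace(
    subpackage=SimpleNamespace(
        module=SimpleNamespace(
            object=SimpleNamespace(method=lambda arg, item: arg * item)
        )
    )
)
sequence = [1, 2, 3]

# Dotted form: four attribute lookups are repeated on every iteration.
dotted = []
for item in sequence:
    dotted.append(package.subpackage.module.object.method(10, item))

# Aliased form: the lookups happen once, before the loop starts.
method = package.subpackage.module.object.method
aliased = []
for item in sequence:
    aliased.append(method(10, item))
```

The two loops produce the same results; how much time the alias saves
depends on the code and is best measured (for example with timeit) rather
than assumed.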

(2) Temporary constants: sometimes it is good enough to just introduce a
simple named constant used once. The cognitive load is low if it is defined
immediately before it is used. Instead of the long line:

    raise ValueError("expected a list, string, dict or None, but instead got '%s'" % type(value).__name__)

I write:

    errmsg = "expected a list, string, dict or None, but instead got '%s'"
    raise ValueError(errmsg % type(value).__name__)

(3) Code refactoring. Maybe that long line is a sign that you need to add a
method or function? Especially if you are using that line, or similar, in
multiple places. But refactoring is justified even if you use the line
*once* if it is complicated enough.

Likewise, sometimes it is helpful to factor out separate sub-expressions
onto their own lines, using their own variables, rather than doing
everything in a single, complicated, expression.

Psychologists, educators and linguists call this "chunking", and it is often
very helpful for simplifying complicated ideas, sentences and expressions.

The lack of chunks is why long Perl one-liners are so impenetrable.
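
As a rough sketch of factoring out a sub-expression, consider one root of a
quadratic equation. The coefficients and variable names are invented for the
illustration.

```python
import math

# Hypothetical coefficients of x**2 - 3x + 2 = 0.
a, b, c = 1.0, -3.0, 2.0

# Everything in a single expression.
root = (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)

# The same calculation with one sub-expression given its own name.
discriminant = b * b - 4 * a * c
root_chunked = (-b + math.sqrt(discriminant)) / (2 * a)
```

The named intermediate value gives the reader a chunk to hold on to, and it
is also a convenient place to check for a negative discriminant.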

(4) Split the long logical line over multiple physical lines. This does
nothing to reduce the inherent complexity of the line, but if that's fairly
low to start with, it is often helpful.

Python gives us two ways to split a logical line over multiple physical
lines: a backslash at the end of the line, and brackets of any sort.

The preferred way is to use round brackets for grouping:

    result = (some very long expression
              which can be split over
              many lines)

This is especially useful with function calls:

    result = function(first_argument, second_argument,
                      third_argument, fourth_argument)

If you are building a list or dict literal, there is no need for the
parentheses, as square and curly brackets have the same effect. That's
especially useful with two-dimensional nested lists:

    data = [[row, one, with, many, items],
            [row, two, with, many, items],
            [row, three, with, many, items]]

For long strings, I like to use *implicit string concatenation*. String
literals which are separated by nothing except whitespace are concatenated
at compile-time. So I can write a long string like this:

    long_string = ("this is a very long string which doesn't"
                   " fit on a single line but isn't appropriate"
                   " for a triple-quoted string")

Notice that I split the string at word breaks, and move the space to the
beginning of the physical line rather than the end. I find that I'm less
likely to forget the space if I put it at the start of the line rather than
the end.
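
A short sketch of what goes wrong when the space is forgotten, reusing the
string above. The name broken is made up for the example.

```python
# Adjacent literals are joined exactly as written; nothing is inserted.
long_string = ("this is a very long string which doesn't"
               " fit on a single line but isn't appropriate"
               " for a triple-quoted string")

# Forgetting the leading space silently glues two words together.
broken = ("this is a very long string which doesn't"
          "fit on a single line")
```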

Not preferred, but allowed for backwards compatibility and still very
occasionally useful, is to end the line with a bare backslash. I find it
helpful in conjunction with triple quoted strings:

    text = """\
    body of the string
    is aligned
    including the first line
    """

but otherwise the backslash is problematic and error-prone. It must be
*immediately* followed by a newline; if you accidentally add a space after
the backslash it won't work.
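
The effect of the backslash can be seen in a small sketch like this one. The
variable names are invented for the illustration.

```python
# With the backslash, the newline right after the opening quotes is swallowed.
with_backslash = """\
first line
second line
"""

# Without it, the string begins with an empty line.
without_backslash = """
first line
second line
"""

# Outside strings, a trailing backslash joins two physical lines.
total = 1 + \
        2
```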

And finally:

(5) It's just a style guide, not a law of physics. As Douglas Bader once
said, "Rules are for the guidance of the wise and the obedience of fools."
See also Raymond Hettinger's talk "Beyond PEP 8":

https://twitter.com/raymondh/status/589849947408703488

https://medium.com/@drb/pep-8-beautiful-code-and-the-tyranny-of-guidelines-f96499f5ac17

Better to go two or three characters beyond the maximum length than to make
the code ugly.

[...]
> There are places you can break lines as in a comprehension such as this
> set comprehension:
> 
>     letter_set = { letter
>                    for word in (left_list + right_list)
>                    for letter in word }
> 
> The above is an example where I know I can break because the {} is holding
> it together. I know I can break at each "for" or "if" but can I break at
> random places?

Not quite random, you can't break in the middle of a word, but you 
can break between words.
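
As a deliberately exaggerated sketch, the same comprehension still works
when broken between almost every pair of words. The two input lists are
invented for the example, and this layout is not a style recommendation.

```python
# Hypothetical inputs.
left_list = ["ab", "bc"]
right_list = ["cd"]

# Inside the braces, line breaks are allowed between any two tokens.
letter_set = {
    letter
    for word
    in (
        left_list
        + right_list
    )
    for letter
    in word
}
```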

[...]
> I will stop here with saying that unlike many languages, parentheses must
> be used with care in python as they may create a tuple or even generator
> expression.

But not by accident. You can't create a generator expression by accident 
by wrapping an arbitrary expression in round brackets, or turn an 
expression into a tuple. 

Remember, it isn't the parentheses which make tuples, it's the commas. 
Except for the empty tuple special case, (), the parens are ALWAYS 
just there to either group the tuple so as to avoid ambiguity, or to 
visually emphasize that it is a tuple even if the interpreter doesn't 
need the hint.
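
A few lines are enough to check this at the interpreter. The variable names
are made up for the sketch.

```python
import types

grouped = (1 + 2)      # parentheses only group: this is the int 3
one_tuple = 1 + 2,     # the trailing comma makes a one-element tuple
pair = 1, 2            # commas alone build a tuple, no parentheses needed
empty = ()             # the one special case: empty parentheses

# A generator expression needs the explicit `for` syntax inside the brackets.
squares = (n * n for n in range(3))
```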

-- 
Steve