PEP 8 and extraneous whitespace

Chris Angelico rosuav at gmail.com
Fri Jul 29 17:18:53 EDT 2011


On Sat, Jul 30, 2011 at 4:45 AM, OKB (not okblacke)
<brenNOSPAMbarn at nobrenspambarn.net> wrote:
> Chris Angelico wrote:
>> That mandates that formatting NOT be a part of the language. I could
>> take C code and reformat it in various ways with a script, and easily
>> guarantee that the script won't affect the code by simply having it
>> only ever adjust whitespace. This concept simply won't work in Python.
>
>        "Formatting" is too general a term.  Obviously "formatting" in
> the lay sense is part of any language, since things like commas and
> parentheses matter.  What Python does that's unusual is pay attention to
> INDENTATION, which is a very specific part of formatting involving
> whitespace at the beginning of a line.  This isn't incompatible with
> what I outlined above.

My notion of "formatting" of code is entirely whitespace, but it
includes such things as:

-----
int
declspec(DLLEXPORT) foofunc(
    char *str,
    int num
) {
  printf("%s: %d",
    str,num);
  return 1;
}
-----
int declspec(DLLEXPORT) foofunc(char *str,int num)
{
    printf("%s: %d", str, num);
    return 1;
}
-----

There's a lot of difference between these two, in terms of
readability, but it wouldn't be hard to write a script that could
format the code into your preferred form. It would be far stricter
than I would like to use, but if you have a gigantic mess of code and
you want to start comprehending it (think IOCCC entries), these sorts
of code reformatters are excellent.

>        I don't think I understand what you're getting at there.  What
> would you mean "handling" those string literals?

What I mean is that these string literals are broken across multiple
lines in a way that is meaningful to the human. Doing them with
triple-quoted strings and actual newlines is potentially restrictive;
if you're trying to build up a string of strings (like for a DOS
environment block - each NAME=VALUE pair is followed by a \0 and the
last one by another \0) that could contain newlines, it's better not
to have actual newlines in the string. Of course, there are other
options (make a list of strings, then "\0".join(lst) to make your
final string), but there are times when it's cleanest to simply have a
string literal that consists of multiple lines of source code,
irrespective of newlines in the string.
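
A minimal sketch of both options, assuming some made-up NAME=VALUE
pairs (the names and values here are illustrative only, not from any
real environment):

```python
# Hypothetical NAME=VALUE pairs, for illustration only.
pairs = ["PATH=C:\\DOS", "TEMP=C:\\TEMP", "PROMPT=$P$G"]

# Option 1: build the block from a list. Each pair is followed by a NUL,
# and one more NUL terminates the block.
env_from_list = "\0".join(pairs) + "\0\0"

# Option 2: one literal spread over several source lines. Adjacent
# literals are concatenated at compile time, so the line breaks in the
# source never become part of the string.
env_literal = (
    "PATH=C:\\DOS\0"
    "TEMP=C:\\TEMP\0"
    "PROMPT=$P$G\0"
    "\0"
)
```

Both forms should produce the same string; the second keeps one pair
per source line without any join step.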

>        So I guess I was a bit unclear when I said "the editor should
> present the code nicely".  I just mean that the editor should present
> the code in a way that reflects the semantic indentation present, thus
> relieving users of the need to insert redundant linebreaks and
> whitespace to make things "line up".  I don't mean that it should try to
> "prettify" things in the way you seem to be suggesting.

Okay, so you're talking about something a lot narrower than I thought
you were. That's fine then; indentation is much simpler and easier to
manage!

>        As an aside, I don't think string literals are a great example case
> for these issues.  ALL whitespace (not just indentation), and all
> of everything else, always matters in string literals, so you do indeed
> have to put in explicit whitespace if you want explicit whitespace in
> the string.

That's precisely why abutting is supported for literal concatenation:
>>> "foo" "bar"
'foobar'

>>> ("foo"
... "bar")
'foobar'

Suppose you want to have a lengthy piece of text embedded in your
source code, which you will then pass to a GUI widget for display. You
want the widget to handle word wrap, so you don't want internal line
breaks getting in the way, but you want the source code to be wrapped.
What's the best way to do this? I've never been a fan of string
literals needing run-time adjustments (like stripping indentation from
triple-quoted strings - it usually depends on the indent on the second
line, and what if you want that second line to be differently indented
from the others?), so to my mind the cleanest way would be abuttal:

    "Lorem ipsum dolor sit amet, consectetur "
    "adipiscing elit. Fusce fermentum posuere "
    "mi eget molestie. Nulla facilisi. Curabitur "
    "et ultrices massa."

The code's indented four spaces, but I don't have to strip them from
the string. (Depending on context, I might need to put parens around
that to force it to be parsed as a single string. Small potatoes.)
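
As a rough sketch of the parenthesised form in an assignment (the
variable name `text` is just a placeholder):

```python
# Parentheses let the adjacent literals span several source lines and
# still form a single string; the source indentation stays out of it.
text = (
    "Lorem ipsum dolor sit amet, consectetur "
    "adipiscing elit. Fusce fermentum posuere "
    "mi eget molestie. Nulla facilisi. Curabitur "
    "et ultrices massa."
)

# No newlines and no leading spaces end up in the value, so a widget
# is free to do its own word wrap.
has_newline = "\n" in text
```

Without the parentheses, the statement would end at the first line
break.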

> In code, though, explicit whitespace is only needed to
> indicate semantically meaningful stuff, so you should only use it for
> that, and the editor should insert "visual space" (without actual
> whitespace characters in the file) to make things line up at the
> semantic indentation levels.

Interesting concept. This suggests that your source file does not
actually contain the indentation, but the display does. That could be
a neat way of dealing with triple-quoted strings:

    foo = """This is line 1.
This is line 2.
Line 3."""
    print(foo)

If that displayed indented, maybe with a faint vertical line showing
the string's left margin, that would greatly improve readability. Any
editor that doesn't support this feature would simply show it flush
left, slightly ugly but workable. Very nice idea... but I don't feel
like implementing it. :)
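
For comparison, a small sketch of the flush-left form next to the
run-time alternative, assuming textwrap.dedent from the standard
library is acceptable (the function name `show` is only a placeholder):

```python
import textwrap

def show():
    # Flush-left continuation lines: the value is exactly what you see.
    foo = """This is line 1.
This is line 2.
Line 3."""

    # Indented in the source, then fixed up at run time. The backslash
    # after the opening quotes suppresses the initial newline.
    bar = textwrap.dedent("""\
        This is line 1.
        This is line 2.
        Line 3.""")
    return foo, bar

foo, bar = show()
```

Both names should end up bound to the same three-line string; the
difference is only in how the source looks and whether a call is
needed at run time.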

ChrisA


