[Tutor] question about strip() and list comprehension

Steven D'Aprano steve at pearwood.info
Wed Apr 9 00:50:07 CEST 2014

On Tue, Apr 08, 2014 at 02:38:13PM -0600, Jared Nielsen wrote:
> Hello,
> Could someone explain why and how this list comprehension with strip()
> works?
> f = open('file.txt')
> t = [t for t in f.readlines() if t.strip()]
> f.close()
> print "".join(t)
> I had a very long file of strings filled with blank lines I wanted to
> remove. I did some Googling and found the above code snippet, but no clear
> explanation as to why it works. I'm particularly confused by how "if
> t.strip()" is removing the blank lines. 

It isn't. Rather, what it is doing is *preserving* the non-blank lines.

The call to strip() removes any leading and trailing whitespace, so if 
the line is blank or contains nothing but whitespace, it reduces down to 
the empty string:

py> '    '.strip()
''

Like other empty sequences and containers, the empty string is 
considered to be "like False", falsey:

py> bool('')
False
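
To see how that truth test plays out on whole lines, here is a small 
sketch. The sample strings are made up for illustration and are not from 
your file; the code is written so it should run unchanged on Python 2 or 3.

```python
# Hypothetical sample lines, invented for this illustration.
samples = ['hello\n', '\n', '   \t\n', '', '  indented\n']

# Pair each line with the truth value of whatever strip() leaves behind.
truthiness = [(s, bool(s.strip())) for s in samples]

for s, flag in truthiness:
    # repr() makes the whitespace and newlines visible.
    print('%r -> %s' % (s, flag))
```

Only the lines with something other than whitespace in them come out as 
True, and those are the ones an "if line.strip()" test lets through.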

So your list comprehension (re-written to use a more meaningful name) 
which looks like this:

    [line for line in f.readlines() if line.strip()]

iterates over each line in the file, tests if there is anything left 
over after stripping the leading/trailing whitespace, and only 
accumulates the lines that are non-blank. It is equivalent to this loop:

    accumulator = []
    for line in f.readlines():
        if line.strip():  # like "if bool(line.strip())"
            accumulator.append(line)
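
If you want to convince yourself that the two forms really do the same 
job, a quick check along these lines may help. This is only a sketch: 
raw_lines is a made-up stand-in for what f.readlines() would return, so no 
file is needed.

```python
# Made-up stand-in for f.readlines(); these are not lines from a real file.
raw_lines = ['first\n', '\n', '   \n', '  second\n', 'third\n']

# The list comprehension form.
kept_by_comprehension = [line for line in raw_lines if line.strip()]

# The explicit loop form: build the list one append at a time.
kept_by_loop = []
for line in raw_lines:
    if line.strip():
        kept_by_loop.append(line)

# Both approaches keep the same lines, in the same order.
print(kept_by_comprehension == kept_by_loop)
```

Notice that the kept lines are unchanged: strip() is only used for the 
test, so '  second\n' keeps its leading spaces.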

> I also don't fully understand the 'print "".join(t)'.

I presume you understand what print does :-) so it's only the "".join(t) 
that has you confused. This is where the interactive interpreter is 
brilliant: you can try things out for yourself and see what they do. Do 
you know how to start the interactive interpreter?

(If not, ask and we'll tell you.)

py> t = ['Is', 'this', 'the', 'right', 'place', 'for', 'an', 'argument?']
py> ''.join(t)
'Isthistherightplaceforanargument?'
py> ' '.join(t)
'Is this the right place for an argument?'
py> '--+--'.join(t)
'Is--+--this--+--the--+--right--+--place--+--for--+--an--+--argument?'

In your case, you have a series of lines, so each line will end with a 
newline:

py> t = ['line 1\n', 'line 2\n', 'line 3\n']
py> ''.join(t)
'line 1\nline 2\nline 3\n'
py> print ''.join(t)
line 1
line 2
line 3

> The above didn't remove the leading white space on several lines, so I made
> the following addition:
> f = open('file.txt')
> t = [t for t in f.readlines() if t.strip()]
> f.close()
> s = [x.lstrip() for x in t]
> print "".join(s)

You can combine those two list comps into a single one:

f = open('file.txt')
lines = [line.lstrip() for line in f.readlines() if line.strip()]
f.close()
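
As a possible variation, and not something from your original snippet, 
you could let a with statement close the file for you and iterate over the 
file object directly instead of calling readlines(). The sketch below 
writes a throwaway temporary file with invented contents so that it runs 
on its own; in your case you would just open 'file.txt'.

```python
import os
import tempfile

# Create a throwaway file with made-up contents so the example is
# self-contained. The path is whatever tempfile hands back.
fd, path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as f:
    f.write('line 1\n\n    line 2\n   \n\tline 3\n')

# A file object can be iterated line by line, and the with block
# closes the file when it ends, even if an error occurs inside it.
with open(path) as f:
    lines = [line.lstrip() for line in f if line.strip()]

os.remove(path)  # clean up the temporary file
print(''.join(lines))
```

The blank and whitespace-only lines are dropped by the "if" test, and 
lstrip() removes the leading spaces and tabs from the lines that remain.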

