[Tutor] deleting one line in multiple files

bhaaluu bhaaluu at gmail.com
Fri Sep 14 15:09:15 CEST 2007


Okay, I've had a chance to experiment with both solutions, and
both of them work as advertised. For those just tuning in, what
I wanted to do was delete, from each file in a directory, the line
that contained a specific pattern. There were over two hundred
files in the directory, and I quickly discovered that deleting the
line from each file by hand was going to take quite a while. I'm
learning Python, so I set out to find a Python solution. I stumbled
upon the 'fileinput' module, specifically a snippet that took a
filename and deleted the line 'in place' (no need for a temporary
third file). A blank line was left where the deleted line was, but
that was not a problem in this case.
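
For anyone who has not met 'fileinput' before, here is a minimal sketch
of the in-place idea, written for Python 3. While inplace=True is active,
anything printed goes into the file being read instead of to the screen,
so simply not printing a line deletes it. The function name
delete_matching_lines and the throwaway demo file are made up for
illustration; unlike the scripts in this post, this version does not
strip whitespace from the lines it keeps.

```python
import fileinput
import os
import tempfile


def delete_matching_lines(paths, pattern):
    """Rewrite each file in place, dropping every line that contains pattern."""
    # The with-block closes the files and restores sys.stdout on exit.
    with fileinput.input(files=paths, inplace=True) as stream:
        for line in stream:
            if pattern not in line:
                # line still ends with its own newline, so suppress print's
                print(line, end="")


# Throwaway demonstration file (hypothetical content).
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "demo.html")
with open(demo_path, "w") as f:
    f.write("<html>\n  <script type='text/javascript'></script>\n  <p>hi</p>\n</html>\n")

delete_matching_lines([demo_path], "<script type")

with open(demo_path) as f:
    result = f.read()
```

After the call, result holds the file's contents without the script line,
and the indentation of the remaining lines is untouched.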

The most flexible script is the one that reads a list of the filenames
in the directory that need to be worked on. In this case, it doesn't matter
whether the files are incrementally numbered or not. I like this solution
the best because it is the most versatile.

1. Create the list of files that need work (listing only the .html files
   keeps file.list and the script itself out of the list):
    $ ls *.html > file.list

2. Run the Python script:
    $ python delete2.py

That's it! =)

Here's the script:

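# delete2.py
# 2007-09-14
# the lines with '<script type' are deleted.
import fileinput

# one filename per line, as written by 'ls'
filenames = open("file.list").read().splitlines()
# with inplace=True, whatever is printed replaces the file's contents
for line in fileinput.input(filenames, inplace=True):
    line = line.strip()  # also drops leading/trailing whitespace
    if '<script type' not in line:
        print(line)
# end delete2.py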
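
The file.list step could also be done from inside Python. What follows is
only a possible sketch (Python 3), assuming the files to fix can be picked
out with a wildcard such as *.html. The function name and the demo files
are assumptions made for illustration, not part of the scripts posted to
the list.

```python
import fileinput
import glob
import os
import tempfile


def delete_lines_in_matching_files(glob_pattern, text):
    """Drop every line containing text from each file matching glob_pattern."""
    filenames = sorted(glob.glob(glob_pattern))
    if not filenames:
        # An empty list would make fileinput fall back to reading stdin.
        return []
    with fileinput.input(files=filenames, inplace=True) as stream:
        for line in stream:
            if text not in line:
                print(line, end="")  # line already carries its newline
    return filenames


# Demonstration on throwaway files (hypothetical names and content).
work_dir = tempfile.mkdtemp()
for name in ("a.html", "b.html", "notes.txt"):
    with open(os.path.join(work_dir, name), "w") as f:
        f.write("keep me\n<script type='x'>\n")

changed = delete_lines_in_matching_files(
    os.path.join(work_dir, "*.html"), "<script type"
)
```

Only the two .html files are rewritten; notes.txt is left alone because it
does not match the wildcard.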

I've already posted the other solution, which iterates through incrementally
numbered files, but for the purposes of this summary, here it is again:

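# delete1.py
# 2007-09-14
# the lines with '<script type' are deleted.
import fileinput

# range(1, 176) covers filename0001.html through filename0175.html
for i in range(1, 176):
    filename = 'filename%04d.html' % i
    # with inplace=True, whatever is printed replaces the file's contents
    for line in fileinput.input(filename, inplace=True):
        line = line.strip()  # also drops leading/trailing whitespace
        if '<script type' not in line:
            print(line)
# end delete1.py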

In this case, a list of files is not needed, since all the files share the
same base name plus an incrementing number, and the same extension. The
files were named filename0001.html to filename0207.html, and by changing
the range I was able to work on a subset of them. Also useful! =)
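
To make the subset idea concrete, here is a small sketch of how the names
are generated. The helper numbered_filenames is hypothetical; it only
builds the list of names and touches no files. Note that range() stops one
short of its second argument, which is why delete1.py uses 176 to reach
file 0175.

```python
import os


def numbered_filenames(first, last, template="filename%04d.html"):
    """Return the names for first..last inclusive, e.g. filename0001.html."""
    return [template % i for i in range(first, last + 1)]


names = numbered_filenames(1, 175)

# fileinput raises an error when a file is missing, so it can help to
# keep only the names that actually exist before processing them.
existing = [name for name in names if os.path.exists(name)]
```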

Here is the original snippet (I wish I'd noted where I found it!):

# delete0.py
# 2007-09-14
# open 00.test and the lines with '<script type' are deleted.
import fileinput

# with inplace=True, whatever is printed replaces the file's contents
for line in fileinput.input("00.test", inplace=True):
    line = line.strip()  # also drops leading/trailing whitespace
    if '<script type' not in line:
        print(line)
# end delete0.py

This takes one file (00.test) and deletes the lines with '<script type' in them.

Finally, here is the Python documentation reference for 'fileinput'

I've already placed these little scripts in my utility directory, as I'm sure
to use them again.

Many thanks to all the Tutors (esp. wormwood_3 and Kent).

Happy Programming!
bhaaluu at gmail dot com
