[Tutor] deleting one line in multiple files

bhaaluu bhaaluu at gmail.com
Fri Sep 14 15:09:15 CEST 2007


Greetings,

Okay, I've had a chance to experiment with both solutions, and
both of them work as advertised. For those just tuning in, what
I wanted to do was delete a line in each file in a directory that
had a specific pattern in the line. There were over two hundred
files in the directory, and I quickly discovered that deleting the
line in each file, manually, was going to take quite a while. I'm
learning Python, so I set out to find a Python solution. I stumbled
upon the 'fileinput' module, specifically, a snippet that took a
filename and deleted the line 'in place' (no need for a temporary
third file). A blank line was left where the deleted line was, but
that was not a problem in this case.

The most flexible script is the one that reads a list of the filenames
that need to be worked on in the directory. In this case, it doesn't matter
whether the files are incrementally numbered or not. I like this solution
best because it is the most versatile.

1. Create the list of files that need work:
    $ ls > file.list

2. Run the Python script:
    $ python delete2.py

That's it! =)

Here's the script:

# delete2.py
# 2007-09-14
# the lines with '<script type' are deleted.
import fileinput

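# file.list holds one filename per line
filenames = open("file.list").read().splitlines()
# with inplace=1, whatever is printed is written back into the current file
for line in fileinput.input(filenames, inplace=1):
    line = line.strip()
    # lines containing the pattern are never printed, so they are dropped
    if '<script type' not in line:
        print line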
# end delete2.py
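
For anyone reading this on Python 3, where print is a function, here is a
minimal sketch of the same list-driven idea. The helper names
read_file_list and delete_matching_lines are made up for illustration and
are not part of fileinput. Unlike delete2.py, this sketch does not call
strip(), so indentation is left alone, and it skips the list file itself,
since 'ls > file.list' also lists file.list.

```python
import fileinput


def read_file_list(list_path, exclude=()):
    """Return the filenames in list_path, one per line (hypothetical helper).

    Blank entries, the list file itself, and any names in exclude are skipped.
    """
    with open(list_path) as handle:
        names = [name.strip() for name in handle]
    skip = set(exclude) | {list_path}
    return [name for name in names if name and name not in skip]


def delete_matching_lines(filenames, pattern):
    """Rewrite each file in place, dropping every line that contains pattern."""
    filenames = list(filenames)
    if not filenames:
        return  # an empty list would make fileinput fall back to stdin
    with fileinput.input(files=filenames, inplace=True) as stream:
        for line in stream:
            if pattern not in line:
                # line still ends with its newline, so suppress print's own
                print(line, end="")
```

It could be called as
delete_matching_lines(read_file_list("file.list", exclude=["delete2.py"]), "<script type"),
assuming the script sits in the same directory as the files.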

I've already posted the other solution, which iterates through incrementally
numbered files, but for the sake of a summary, here it is again:

# delete1.py
# 2007-09-14
# the lines with '<script type' are deleted.
import fileinput

# range(1, 176) covers filename0001.html through filename0175.html
for i in range(1, 176):
    filename = 'filename%04d.html' % i
    for line in fileinput.input(filename, inplace=1):
        line = line.strip()
        if '<script type' not in line:
            print line
# end delete1.py

In this case, a list of files is not used since all the files have the same
name + an incremented number, and the same extension. All the
files were named: filename0001.html to filename0207.html. I was able
to work on a subset of the files within the stated range. Also useful! =)
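
If some numbers in the range might be missing, fileinput will stop with an
error when it tries to open a file that does not exist. A rough sketch
(Python 3) of building the list of names first and keeping only those that
are really there; the function name numbered_filenames and its parameters
are made up for illustration:

```python
import os


def numbered_filenames(first, last, template="filename%04d.html", directory="."):
    """Return existing paths for template % i, with i from first to last inclusive."""
    names = []
    for i in range(first, last + 1):  # range() excludes its end, hence last + 1
        path = os.path.join(directory, template % i)
        if os.path.isfile(path):
            names.append(path)
    return names
```

The resulting list can be handed to fileinput.input() in place of a single
filename, so one loop handles the whole subset.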

Finally, here is the original snippet (I wish I'd noted where I found it!):

# delete0.py
# 2007-09-14
# open 00.test and the lines with '<script type' are deleted.
import fileinput
for line in fileinput.input("00.test", inplace=1):
    line = line.strip()
    if '<script type' not in line:
        print line
# end delete0.py

This takes one file (00.test) and deletes the lines with '<script type' in them.
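
Since the edit happens in place, a wrong pattern can destroy lines with no
way back. fileinput.input() accepts a backup argument that keeps a copy of
the original under the given extension. A small sketch in Python 3; the
function name delete_lines_with_backup is made up for illustration:

```python
import fileinput


def delete_lines_with_backup(filename, pattern, backup_ext=".bak"):
    """Drop lines containing pattern from filename, keeping filename + backup_ext."""
    with fileinput.input(files=[filename], inplace=True, backup=backup_ext) as stream:
        for line in stream:
            if pattern not in line:
                # the newline is already part of line
                print(line, end="")
```

After delete_lines_with_backup("00.test", "<script type"), the untouched
original should be sitting next to the edited file as 00.test.bak.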

For reference, here is the Python documentation for 'fileinput':
http://docs.python.org/lib/module-fileinput.html

I've already placed these little scripts in my utility directory, as I'm sure
to use them again.

Many thanks to all the Tutors (esp. wormwood_3 and Kent).

Happy Programming!
-- 
bhaaluu at gmail dot com

