[Tutor] deleting one line in multiple files
bhaaluu
bhaaluu at gmail.com
Fri Sep 14 15:09:15 CEST 2007
Greetings,
Okay, I've had a chance to experiment with both solutions, and
both of them work as advertised. For those just tuning in, what
I wanted to do was delete a line in each file in a directory that
had a specific pattern in the line. There were over two hundred
files in the directory, and I quickly discovered that deleting the
line in each file, manually, was going to take quite a while. I'm
learning Python, so I set out to find a Python solution. I stumbled
upon the 'fileinput' module, specifically, a snippet that took a
filename and deleted the line 'in place' (no need for a temporary
third file). A blank line was left where the deleted line was, but
that was not a problem in this case.
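For anyone wondering how the 'in place' part works: while fileinput is iterating with inplace turned on, standard output is redirected into the file being read, so whatever gets printed becomes the new contents and anything not printed is dropped. Here is a minimal sketch of that idea in Python 3 syntax. The function name delete_matching_lines is my own invention for illustration, and unlike the scripts in this post it does not strip whitespace from the lines it keeps.

```python
import fileinput

def delete_matching_lines(paths, pattern):
    """Rewrite each file in paths, dropping every line that contains pattern."""
    paths = list(paths)
    if not paths:
        # With an empty list, fileinput would fall back to sys.argv or stdin.
        return
    with fileinput.input(files=paths, inplace=True) as stream:
        for line in stream:
            if pattern not in line:
                # Inside this loop, print() writes to the file being rewritten.
                # end='' because line still carries its own newline.
                print(line, end='')
```

Calling delete_matching_lines(['00.test'], '<script type') would rewrite that one file and leave the indentation of the surviving lines untouched.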
The most flexible script is the one that works from a list of the filenames
in the directory that need editing. In this case, it doesn't matter whether
the files are incrementally numbered or not. I like this solution best
because it is the most versatile.
1. Create the list of files that need work:
$ ls > file.list
2. Run the Python script:
$ python delete2.py
That's it! =)
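One thing to watch with step 1: 'ls' lists everything in the directory, which will usually include file.list itself (and the script, if it lives there too), so those would get processed as well. A possible alternative is to build the list inside Python. This is only a sketch in Python 3 syntax. It assumes the files to edit all end in .html, and the function name list_html_files is made up for the example.

```python
import glob
import os

def list_html_files(directory="."):
    """Return the sorted paths of the *.html files in directory."""
    return sorted(glob.glob(os.path.join(directory, "*.html")))
```

The returned list can be handed to fileinput.input() in place of the names read from file.list.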
Here's the script:
# delete2.py
# 2007-09-14
# the lines with '<script type' are deleted.
import fileinput
filenames = open("file.list").read().splitlines()
for line in fileinput.input(filenames, inplace=1):
    line = line.strip()
    if '<script type' not in line:
        print line
# end delete2.py
I've already posted the other solution, which iterates through incrementally
numbered files, but for the sake of a summary, here it is again:
# delete1.py
# 2007-09-14
# the lines with '<script type' are deleted.
import fileinput
for i in range(1, 176):
    filename = 'filename%04d.html' % i
    for line in fileinput.input(filename, inplace=1):
        line = line.strip()
        if '<script type' not in line:
            print line
# end delete1.py
In this case, a list of files is not needed, since all the files have the same
name + an incremented number, and the same extension. All the
files were named: filename0001.html to filename0207.html. By adjusting the
range, I was able to work on a subset of the files: range(1, 176) covers
filename0001.html through filename0175.html. Also useful! =)
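Since range() stops one short of its second argument, it is easy to be off by one when picking the subset. Here is a small sketch, in Python 3 syntax, of a helper that takes the first and last file numbers inclusively. The names numbered_names and existing_only are hypothetical; only the 'filename%04d.html' pattern comes from the script above.

```python
import os

def numbered_names(first, last, template="filename%04d.html"):
    """Return the names numbered first through last, both ends included."""
    return [template % i for i in range(first, last + 1)]

def existing_only(names):
    """Keep only the names that are actual files, so gaps are skipped."""
    return [name for name in names if os.path.isfile(name)]
```

For example, numbered_names(1, 207) would cover the whole set, and existing_only() guards against missing numbers in the sequence.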
Lastly, here is the original snippet (I wish I'd noted where I found it!):
# delete0.py
# 2007-09-14
# open 00.test and the lines with '<script type' are deleted.
import fileinput
for line in fileinput.input("00.test", inplace=1):
    line = line.strip()
    if '<script type' not in line:
        print line
# end delete0.py
This takes one file (00.test) and deletes the lines with '<script type' in them.
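Because the file is rewritten in place, a mistake in the pattern can lose lines for good. fileinput accepts a backup argument that keeps a copy of the original under the given extension. The following is a sketch of that option in Python 3 syntax. The function name delete_with_backup and the '.bak' extension are just choices made for this example.

```python
import fileinput

def delete_with_backup(path, pattern, backup_ext=".bak"):
    """Drop lines containing pattern from path, keeping path + backup_ext."""
    with fileinput.input(files=[path], inplace=True, backup=backup_ext) as stream:
        for line in stream:
            if pattern not in line:
                # Printed lines go into the rewritten file.
                print(line, end='')
```

After delete_with_backup('00.test', '<script type'), the untouched original should be available as 00.test.bak.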
For reference, here is the Python documentation for 'fileinput':
http://docs.python.org/lib/module-fileinput.html
I've already placed these little scripts in my utility directory, as I'm sure
to use them again.
Many thanks to all the Tutors (esp. wormwood_3 and Kent).
Happy Programming!
--
bhaaluu at gmail dot com