[Tutor] deleting one line in multiple files

wormwood_3 wormwood_3 at yahoo.com
Fri Sep 14 05:33:48 CEST 2007


I think the problem is that the script you borrowed iterates over the lines of whatever file you pass to fileinput.input() and drops the lines that match your pattern. When you pass it file.list, it treats file.list as the file to edit rather than as a list of file names, and since none of the names contains '<script type', nothing appears to happen. What you actually want is to iterate over the lines of your list file and, for each line (which names a file), open *that* file, check for your pattern, and delete the matching line.

Hope I am not completely off:-)

If I am right so far, you want to do something like:

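import fileinput
import sys

# Read the names from the list file first (no inplace here, or file.list
# itself would be overwritten).
names = [name.strip() for name in open("file.list") if name.strip()]

# fileinput accepts a list of file names; with inplace=1, anything written
# to stdout replaces the contents of the file currently being read.
for line in fileinput.input(names, inplace=1):
    if '<script type' not in line:
        sys.stdout.write(line)  # unlike strip() + print, keeps the line as-is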
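
For comparison, here is a rough sketch of the same loop written out by hand, without fileinput, so each step is visible: read the names, open each named file, keep the lines that do not match, and write them back. The function name remove_matching_lines is made up for illustration, the code assumes Python 3, and it assumes the names in the list are relative to the folder holding the list file.

```python
import os


def remove_matching_lines(list_path, pattern):
    """Delete every line containing `pattern` from each file named in `list_path`.

    Hypothetical helper: names are assumed to be relative to the folder
    that holds the list file.
    """
    base = os.path.dirname(os.path.abspath(list_path))

    # Read all names up front, so the list file is closed before any file
    # is rewritten (`ls > file.list` also lists file.list itself).
    with open(list_path) as list_file:
        names = [name.strip() for name in list_file if name.strip()]

    for name in names:
        path = os.path.join(base, name)
        with open(path) as src:
            kept = [line for line in src if pattern not in line]
        with open(path, "w") as dst:
            dst.writelines(kept)  # lines keep their original whitespace
```

Calling remove_matching_lines("file.list", "<script type") would then process every file named in the list. It holds one file in memory at a time, which should be fine for HTML pages of ordinary size.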

BUT, fileinput was made (if I understand the documentation) to avoid having to do this. This is where the sys.argv[1:] values come in: called with no arguments, fileinput.input() reads the files named on the command line. The example on this page (look under "Processing Each Line of One or More Files: The fileinput Module") helped clarify it for me: http://www.oreilly.com/catalog/lpython/chapter/ch09.html. If you do:

% python myscript.py "<script type" `ls`

The shell replaces `ls` with the name of every item in the folder you run this in, so they are all passed to the script as arguments after the search string "<script type" (be sure the folder only contains the files you want to edit!). Continuing with the O'Reilly example:

import fileinput, sys, string
# take the first argument out of sys.argv and assign it to searchterm
searchterm, sys.argv[1:] = sys.argv[1], sys.argv[2:]
for line in fileinput.input():
   num_matches = string.count(line, searchterm)
   if num_matches:                     # a nonzero count means there was a match
       print "found '%s' %d times in %s on line %d." % (searchterm, num_matches, 
           fileinput.filename(), fileinput.filelineno())

To test this, I put the above code block in "mygrep.py", then made a file "test.txt" in the same folder, with some trash lines, and 1 line with the string you said you want to match on. Then I did:

sam@B74kb0x:~$ python mygrep.py "<script type" test.txt
found '<script type' 1 times in test.txt on line 3.

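So you could start from the block above. As written it only reports matches; to delete them instead, pass inplace=1 to fileinput.input() and write back every line that does not contain the search term. While inplace is on, print goes into the file being edited, so send any confirmation message to sys.stderr to check that it did what you expect.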
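
One way to turn the reporting script into one that edits the files might look like the sketch below. It is untested against your real files, so try it on copies first. The function name delete_matching_lines is made up for illustration, and the code assumes Python 3 (the with-statement form of fileinput.input() and print(line, end="")). Because standard output is redirected into the file while inplace is on, the sketch collects what it deleted and returns it instead of printing inside the loop.

```python
import fileinput


def delete_matching_lines(searchterm, filenames):
    """Rewrite each file in place, dropping lines that contain searchterm.

    Hypothetical helper. Returns a list of (filename, line_number) pairs,
    one for each deleted line.
    """
    deleted = []
    if not filenames:
        # With no file names, fileinput would fall back to standard input.
        return deleted

    with fileinput.input(files=filenames, inplace=True) as lines:
        for line in lines:
            if searchterm in line:
                # Remember the match; not printing it is what deletes it.
                deleted.append((lines.filename(), lines.filelineno()))
            else:
                # stdout is redirected, so this writes the line back to the file.
                print(line, end="")
    return deleted
```

From a script you could call it as delete_matching_lines(sys.argv[1], sys.argv[2:]) and then loop over the returned pairs to print a "deleted line N of FILE" message for each one, once the editing is finished.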

Hope this helps!
-Sam

_____________________________________
I have a directory of files, and I've created a file list
of the files I want to work on:

$ ls > file.list

Each file in file.list needs to have a line removed,
leaving the rest of the file intact.

I found this snippet on the Net, and it works fine for one file:

# the lines with '<script type' are deleted.
import fileinput

for line in fileinput.input("file0001.html", inplace=1):
    line = line.strip()
    if not '<script type'in line:
        print line

The docs say:
This iterates over the lines of all files listed in sys.argv[1:]...
I'm not sure how to implement the argv stuff.

However, the documentation also states:
To specify an alternative list of filenames,
pass it as the first argument to input().
A single file name is also allowed.

So, when I replace file0001.html with file.list (the alternative list
of filenames), nothing happens.

# the lines with '<script type' are deleted.
import fileinput

for line in fileinput.input("file.list", inplace=1):
    line = line.strip()
    if not '<script type'in line:
        print line

file.list has one filename on each line, ending with a newline.
file0001.html
file0002.html
:::
:::
file0175.html

Have I interpreted the documentation wrong?
The goal is to delete the line that has '<script type' in it.
I can supply more information if needed.
TIA.
-- 
bhaaluu at gmail dot com