combining the path and fileinput modules SOLVED
wo_shi_big_stomach
wo_shi_big_stomach at mac.com
Sat Nov 25 23:29:32 EST 2006
Dennis Lee Bieber wrote:
> On Sat, 25 Nov 2006 07:58:26 -0800, wo_shi_big_stomach
> <wo_shi_big_stomach at mac.com> declaimed the following in
> comp.lang.python:
>
>> File "p2.py", line 23, in ?
>> for line in fileinput.input(f,):
>> File
>> "/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/fileinput.py",
>> line 231, in next
>> line = self.readline()
>> File
>> "/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/fileinput.py",
>> line 320, in readline
>> self._file = open(self._filename, "r")
>>
>> This looks similar to before -- fileinput.input() still isn't operating
>> on the input.
>>
> And where is the actual exception message line -- the one with the
> error code/description.
>
>
>> dir = path('/home/wsbs/Maildir')
>> #for f in dir.walkfiles('*'):
>> for f in filter(os.path.isfile, dir.walkfiles('*')):
>
> If I understand the documentation of fileinput, you shouldn't even
> need this output loop; fileinput is designed to expect a list of files
> (that it works with a single file seems an afterthought)
Yes, thanks. This is the key point.
Feeding fileinput.input() a list rather than a single file (or whatever
it's called in Python) got my program working. Thanks!
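
To illustrate the point, here is a minimal sketch in present-day Python 3 syntax (not the 2.3 used in this thread). It hands fileinput.input() a whole list and uses isfirstline() to spot where each file starts. The helper name first_lines and the scratch files are made up for the example.

```python
import fileinput
import os
import tempfile


def first_lines(filenames):
    """Collect (filename, first line) pairs in a single pass over all files."""
    firsts = []
    # One call, one list: fileinput chains the files into a single line stream.
    with fileinput.input(files=filenames) as stream:
        for line in stream:
            # isfirstline() is true again each time a new file begins.
            if stream.isfirstline():
                firsts.append((stream.filename(), line.rstrip('\n')))
    return firsts


# Hypothetical demo data: two small files in a scratch directory.
demo_dir = tempfile.mkdtemp()
demo_files = []
for name, text in [('one.txt', 'From a\nbody\n'), ('two.txt', 'Subject: b\nbody\n')]:
    full = os.path.join(demo_dir, name)
    with open(full, 'w') as handle:
        handle.write(text)
    demo_files.append(full)

for filename, first in first_lines(demo_files):
    print(os.path.basename(filename), '->', first)
```

No outer loop over the files is needed; fileinput moves from one file to the next on its own.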
>
>> for line in fileinput.input(f,):
> for line in fileinput.input(filter(os.path.isfile,
> dir.walkfiles("*")),
> inplace=1):
>
> should handle all the files...
Indeed it does -- too many times.
Sorry, but this (and the program you provided) iterates over the entire
list N times, where N is the number of files, rather than making one
pass over each file.
For instance, using your program with inplace editing and a ".bak" file
extension for the originals, I ended up with filenames like
name.bak.bak.bak.bak.bak in a directory with five files in it.
> I don't have this third party path
> module, so the directory tree walking isn't active, but...
The path module:
http://www.jorendorff.com/articles/python/path/
is a *lot* cleaner than os.path; see the examples at that URL.
Thanks for the great tip about fileinput.input(), and thanks to all who
answered my query. I've pasted the working code below.
/wsbs
import fileinput
import os
import re
import string
import sys
from path import path

# p2.py - fix broken SMTP headers in email files
#
# recurses from dir and searches all subdirs
# for each file, evaluates whether 1st line starts with "From "
# for each match, program deletes line

# recurse dirs
dir = path('/home/wsbs/Maildir')
g = dir.walkfiles('*')
for line in fileinput.input(g, inplace=1, backup='.bak'):
    # just print 2nd and subsequent lines
    if not fileinput.isfirstline():
        print line.rstrip('\n')
    # check first line only
    elif fileinput.isfirstline():
        if not re.search('^From ',line):
            print line.rstrip('\n')
fileinput.close()
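
For anyone on a newer Python, the same idea might look roughly like the sketch below. It is not the program above. It is written for Python 3, it swaps the third-party path module for the standard library's os.walk, and the function names collect_files and drop_leading_from_line are invented for illustration. It also assumes the files can be decoded with the platform's default text encoding, which real mail files may not satisfy.

```python
import fileinput
import os


def collect_files(root):
    """Return every file under root as a list, built before any editing starts."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            found.append(os.path.join(dirpath, name))
    return found


def drop_leading_from_line(filenames, backup='.bak'):
    """Rewrite each file in place, deleting its first line if it starts with 'From '."""
    if not filenames:
        return  # with an empty list, fileinput would fall back to reading stdin
    with fileinput.input(files=filenames, inplace=True, backup=backup) as stream:
        for line in stream:
            if stream.isfirstline() and line.startswith('From '):
                continue  # drop the broken header line
            # With inplace=True, standard output is redirected into the file.
            print(line, end='')


if __name__ == '__main__':
    # Same directory as in the post; adjust to taste.
    drop_leading_from_line(collect_files('/home/wsbs/Maildir'))
```

Because the file list is built once before editing begins, backups created during a run are not themselves processed in that run. A second run over the same tree would pick up the earlier .bak files, so they would need to be moved or filtered out first.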