combining the path and fileinput modules SOLVED

wo_shi_big_stomach wo_shi_big_stomach at mac.com
Sun Nov 26 05:29:32 CET 2006


Dennis Lee Bieber wrote:
> On Sat, 25 Nov 2006 07:58:26 -0800, wo_shi_big_stomach
> <wo_shi_big_stomach at mac.com> declaimed the following in
> comp.lang.python:
> 
>>   File "p2.py", line 23, in ?
>>     for line in fileinput.input(f,):
>>   File
>> "/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/fileinput.py",
>> line 231, in next
>>     line = self.readline()
>>   File
>> "/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/fileinput.py",
>> line 320, in readline
>>     self._file = open(self._filename, "r")
>>
>> This looks similar to before -- fileinput.input() still isn't operating
>> on the input.
>>
> 	And where is the actual exception message line -- the one with the
> error code/description?
>  
> 
>> dir = path('/home/wsbs/Maildir')
>> #for f in dir.walkfiles('*'):
>> for f in filter(os.path.isfile, dir.walkfiles('*')):
> 
> 	If I understand the documentation of fileinput, you shouldn't even
> need this outer loop; fileinput is designed to expect a list of files
> (that it works with a single file seems an afterthought).

Yes, thanks. This is the key point.

Feeding fileinput.input() the whole list of filenames, rather than one
filename at a time from my own loop, got my program working. Thanks!
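
For anyone finding this thread later, here is a minimal sketch of that point: a single fileinput.input() call, given a list of filenames, reads every file in turn, and isfirstline() and filename() report where you are. The function name first_lines is made up for illustration, and the sketch is written for Python 3 rather than the 2.3 used in this thread.

```python
import fileinput


def first_lines(filenames):
    """Map each filename to its first line, using one fileinput.input() call."""
    result = {}
    try:
        # One call walks through all files; no outer "for f in files" loop.
        for line in fileinput.input(filenames):
            if fileinput.isfirstline():
                result[fileinput.filename()] = line.rstrip("\n")
    finally:
        # Reset the module-level state so input() can be called again later.
        fileinput.close()
    return result
```

Called as first_lines(["a.txt", "b.txt"]), it opens each file once and returns a dict keyed by filename.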

> 
>> 	for line in fileinput.input(f,):
> 	for line in fileinput.input(filter(os.path.isfile, 
> 							dir.walkfiles("*")),
> 							inplace=1):
> 
> should handle all the files... 

Indeed it does -- too many times.

Sorry, but this (and the program you provided) iterates over the entire
list N times, where N is the number of files, rather than making one
pass over each file.

For instance, using your program with inplace editing and a ".bak" file
extension for the originals, I ended up with filenames like
name.bak.bak.bak.bak.bak in a directory with five files in it.
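
The stacked extensions appear to come from each pass re-walking the tree and picking up the backups left by the previous pass, so name.bak gets its own name.bak.bak, and so on. One way to guard against that, sketched here with the standard library's os.walk instead of the path module, is to leave out anything that already ends in the backup extension. The function name files_without_backups and its backup_ext parameter are hypothetical, and the sketch targets Python 3.

```python
import os


def files_without_backups(root, backup_ext=".bak"):
    """Return file paths under root, skipping names ending in backup_ext."""
    selected = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            # Skip backups from earlier runs so they are never edited
            # (and never given a second ".bak" of their own).
            if not name.endswith(backup_ext):
                selected.append(os.path.join(dirpath, name))
    return selected
```

The resulting list can be handed straight to fileinput.input(), so a second run of the script leaves the existing backups alone.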

> I don't have this third party path
> module, so the directory tree walking isn't active, but...

The path module:

http://www.jorendorff.com/articles/python/path/

is a *lot* cleaner than os.path; see the examples at that URL.

Thanks for the great tip about fileinput.input(), and thanks to all who
answered my query. I've pasted the working code below.

/wsbs

import fileinput
import re
from path import path

# p2.py - fix broken SMTP headers in email files
#
# recurses from dir and searches all subdirs
# for each file, evaluates whether 1st line starts with "From "
# for each match, program deletes line

# recurse dirs
dir = path('/home/wsbs/Maildir')
g = dir.walkfiles('*')
for line in fileinput.input(g, inplace=1, backup='.bak'):
	# print the 2nd and subsequent lines unchanged
	if not fileinput.isfirstline():
		print line.rstrip('\n')
	# first line: keep it only if it does not start with "From "
	elif not re.search('^From ', line):
		print line.rstrip('\n')
fileinput.close()
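
For readers without the path module, the same idea can be sketched with only the standard library. This is a rough Python 3 equivalent, not the script I actually ran: os.walk stands in for walkfiles(), the function name strip_leading_from is invented here, and it assumes the mail files can be decoded with the platform's default text encoding.

```python
import fileinput
import os


def strip_leading_from(root, backup=".bak"):
    """Drop the first line of each file under root if it starts with 'From '."""
    files = [
        os.path.join(dirpath, name)
        for dirpath, _dirnames, filenames in os.walk(root)
        for name in filenames
    ]
    if not files:
        # An empty list would make fileinput fall back to reading stdin.
        return files
    try:
        # inplace=True redirects print() into the file being rewritten;
        # the original is kept alongside it with the backup extension.
        for line in fileinput.input(files, inplace=True, backup=backup):
            if fileinput.isfirstline() and line.startswith("From "):
                continue
            print(line, end="")
    finally:
        fileinput.close()
    return files
```

The file list is built before any editing starts, so backups created during the run are not walked in the same pass.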


