[Tutor] about counting files
Abdirizak abdi
a_abdi406@yahoo.com
Tue Apr 29 09:40:25 2003
Hi,

Can anyone help me with this program? I am reading multiple files and I want to count each of these files. I tried different things but I couldn't get it right.

Here is the program:

import glob, getopt
import fileinput,re,shelve,linecache,sys
#from TextSplitter import TextSplitter

aword =re.compile (r'<[^<>]*>|\b[\w-]+\b') #using xml as well.
index={}

# Generate an index in file indexFileName
def genIndex(indexFileName, extension):

    fname='*.'+extension

    for line in fileinput.input(glob.glob(fname)):
        location = fileinput.filename(), fileinput.filelineno()
        for word in aword.findall(line.lower()):
            if word[0] != '<':
                index.setdefault(word,[]).append(location)

    print index # for testing

    shelf = shelve.open(indexFileName,'n')
    for word in index:
        shelf[word] = index[word]
    shelf.close()

if __name__ == '__main__':
    import sys
    for arg in sys.argv[1:]:
        genIndex(arg,'txt')

Thanks in advance
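
The post does not say whether "count" means the number of files read or a tally for each file, so the following is only a minimal sketch that covers both readings. It is written for Python 3 (the program above uses the Python 2 print statement), and the function name count_files and its return shape are hypothetical, not part of the original program.

```python
import fileinput
import glob


def count_files(pattern):
    """Return (number of matching files, {filename: line count}).

    Hypothetical helper: 'pattern' is a glob pattern such as '*.txt'.
    """
    filenames = sorted(glob.glob(pattern))
    # Start every file at zero so that empty files are still counted.
    line_counts = dict.fromkeys(filenames, 0)
    if not filenames:
        # With an empty list, fileinput falls back to sys.argv[1:] or
        # standard input, so return early instead of calling it.
        return 0, line_counts
    with fileinput.input(files=filenames) as stream:
        for _line in stream:
            # filename() names the file the current line was read from.
            line_counts[stream.filename()] += 1
    return len(filenames), line_counts
```

Calling count_files('*.txt') in the directory being indexed would return the number of matching files along with a per-file line count. The same dictionary pattern could hold a per-file word count instead by adding len(aword.findall(line.lower())) for each line.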