[Tutor] a quick Q: how to use for loop to read a series of files with .doc end

Alan Gauld alan.gauld at btinternet.com
Fri Oct 7 12:06:11 CEST 2011

On 07/10/11 09:08, lina wrote:

> INFILEEXT=".xpm"
> def dofiles(topdirectory):
>      for filename in os.listdir(topdirectory):
>          processfile(filename)
> def processfile(infilename):
>      results={}
>      base, ext =os.path.splitext(infilename)
>      if ext == INFILEEXT:

>          text = fetchonefiledata(infilename)
>          numcolumns=len(text[0])
>          for ch in TOKENS:
>              results[ch] = [0]*numcolumns
>          for line in text:
>              line = line.strip()
>          for col, ch in enumerate(line):
>              if ch in TOKENS:
>                  results[ch][col]+=1

It would be easier to read (and debug) if you put
that chunk into a function. Using the naming style below,
it could be called processOneFileData(), for example...

Make it return the results dictionary.
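
A rough sketch of what such a function might look like. TOKENS is not
shown in your post, so the value here is a guess. I have also assumed
the column loop is meant to run for every line (as quoted, it sits at
the same level as the line loop and so would only see the last line).

```python
# Sketch only: the TOKENS value is assumed for illustration.
TOKENS = "BE"

def processOneFileData(text, tokens=TOKENS):
    """Count, per column, how often each token occurs in the lines of text."""
    if not text:
        return {}
    # Column count taken from the first line, without its trailing newline.
    numcolumns = len(text[0].strip())
    results = {ch: [0] * numcolumns for ch in tokens}
    for line in text:
        line = line.strip()
        # Nested so that every line is counted; longer lines are truncated.
        for col, ch in enumerate(line[:numcolumns]):
            if ch in tokens:
                results[ch][col] += 1
    return results
```

processfile() would then call it as
results = processOneFileData(text) inside the extension check.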

>      for k,v in results.items():
>          print(results)

This prints the same thing (results) once for every item
in results. I'm pretty sure you don't want that.
Printing results once should be sufficient.
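
For illustration, here are both options: print the dictionary once, or,
if the loop was meant to show each token separately, print k and v
rather than results. format_counts is a made-up helper name and the
sample data is invented.

```python
def format_counts(results):
    """Return one 'token: counts' string per key, in sorted key order."""
    return ["%s: %s" % (k, v) for k, v in sorted(results.items())]

results = {"E": [1, 2, 0], "B": [1, 0, 2]}  # made-up sample data

print(results)                      # the whole dictionary, once
for line in format_counts(results):
    print(line)                     # or one line per token
```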

>      summary=[]
>      for a,b in zip(results['E'],results['B']):
>          summary.append(a+b)

I don't know why this gives a key error on 'E' (which basically means 
that there is no key 'E') since the code above should guarantee that it 
exists. Odd. I'm also not sure why the error occurs after it prints 
summary. Are you sure the output is in the sequence you showed in your 

>      print(summary)
>      writeonefiledata(base+OUTFILEEXT,summary)
> def fetchonefiledata(inname):
>      infile = open(inname)
>      text = infile.readlines()
>      return text[LINESTOSKIP:]
> def writeonefiledata(outname,summary):
>      outfile = open(outname,"w")
>      for elem in summary:
>          outfile.write(str(summary))
> if __name__=="__main__":
>      dofiles(".")
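
One possible cause of the KeyError on 'E' mentioned above, going only by
the indentation as quoted: the summary code sits outside the
"if ext == INFILEEXT:" block, so it also runs for files that are not
.xpm, where results is still empty. os.listdir(".") returns every entry
in the directory, the script itself included. That would also fit the
error appearing after a summary has already been printed. This is a
guess, not something I have run. A sketch of a guard, with is_wanted
and summarise as made-up names:

```python
import os

INFILEEXT = ".xpm"  # same value as in the quoted code

def is_wanted(filename, ext=INFILEEXT):
    """Return True only for names ending in the wanted extension."""
    return os.path.splitext(filename)[1] == ext

def summarise(results):
    """Add the 'E' and 'B' column counts; return [] if either key is missing."""
    if "E" not in results or "B" not in results:
        return []
    return [a + b for a, b in zip(results["E"], results["B"])]
```

Checking is_wanted(filename) in dofiles() before calling processfile()
would mean the summary code never sees an empty dictionary.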


Alan G
Author of the Learn to Program web site
