parsing directory for certain filetypes
Tim Chase
python.list at tim.thechases.com
Mon Mar 10 11:03:59 EDT 2008
> i wrote a function to parse a given directory and make a sorted list
> of files with .txt,.doc extensions .it works,but i want to know if it
> is too bloated..can this be rewritten in more efficient manner?
>
> here it is...
>
> from string import split
> from os.path import isdir,join,normpath
> from os import listdir
>
> def parsefolder(dirname):
>     filenms=[]
>     folder=dirname
>     isadr=isdir(folder)
>     if (isadr):
>         dirlist=listdir(folder)
>         filenm=""
>         for x in dirlist:
>             filenm=x
>             if(filenm.endswith(("txt","doc"))):
>                 nmparts=[]
>                 nmparts=split(filenm,'.' )
>                 if((nmparts[1]=='txt') or (nmparts[1]=='doc')):
>                     filenms.append(filenm)
>         filenms.sort()
>         filenameslist=[]
>         filenameslist=[normpath(join(folder,y)) for y in filenms]
>         numifiles=len(filenameslist)
>         print filenameslist
>         return filenameslist
>
>
> folder='F:/mysys/code/tstfolder'
> parsefolder(folder)

It seems to me that this is awfully baroque, with many superfluous
variables. Is this not the same functionality (minus prints, unused
result-counting, NOPs, and belt-and-suspenders extension-checking)
as

  from os import listdir
  from os.path import isdir, join, normpath

  def parsefolder(dirname):
      if not isdir(dirname): return
      return sorted([
          normpath(join(dirname, fname))
          for fname in listdir(dirname)
          if fname.lower().endswith('.txt')
          or fname.lower().endswith('.doc')
          ])

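As an aside on the extension check: testing the dotted suffix is enough on its own, whereas taking the second piece of split(filenm, '.') gives 'v2' for a name like 'notes.v2.txt' and raises IndexError for a name with no dot at all. If you would rather pull the extension out explicitly, os.path.splitext splits at the last dot. A minimal sketch (the helper name has_wanted_ext and the sample names are made up for illustration):

```python
from os.path import splitext

def has_wanted_ext(fname, types=('.txt', '.doc')):
    # splitext splits at the last dot: 'notes.v2.txt' -> ('notes.v2', '.txt')
    return splitext(fname)[1].lower() in types

# Hypothetical sample names, not taken from any real directory
names = ['report.txt', 'notes.v2.txt', 'mytxt', 'README.DOC']
matches = [n for n in names if has_wanted_ext(n)]
```

Here 'mytxt' is skipped because it has no extension, and 'README.DOC' is kept because the comparison is done on the lowercased extension.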
In Python 2.5 (or 2.4 if you implement the any() function, ripped
from the docs[1]), parsefolder() could be rewritten to be a little
more flexible...something like this (untested):

  def parsefolder(dirname, types=['.doc', '.txt']):
      if not isdir(dirname): return
      return sorted([
          normpath(join(dirname, fname))
          for fname in listdir(dirname)
          if any(
              fname.lower().endswith(s)
              for s in types)
          ])

which would allow you to do both

  parsefolder('/path/to/wherever/')

and

  parsefolder('/path/to/wherever/', ['.xls', '.ppt', '.htm'])

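For Python 2.4, the any() stand-in is only a few lines. This follows the equivalent given in the built-in functions documentation[1]; define it only where the builtin is missing, since it shadows the real one otherwise:

```python
def any(iterable):
    # Return True as soon as one element is true.
    # An empty iterable (or one with no true elements) gives False.
    for element in iterable:
        if element:
            return True
    return False
```

Because it returns at the first true element, it stops early on a generator expression just like the builtin does.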
In both cases, you don't define what should happen when
isdir(dirname) fails; as written, the function just returns None.
Caveat Implementor.

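One way to pin that case down, as a sketch only: return an empty list when the path is not a directory, so callers can always iterate over the result. Whether an empty list, None, or an exception is right depends on your callers, so treat that choice as my assumption. This version also passes a tuple to endswith(), which needs Python 2.5 or later, in place of any():

```python
from os import listdir
from os.path import isdir, join, normpath

def parsefolder(dirname, types=('.doc', '.txt')):
    # Assumption: a missing or non-directory path yields an empty list
    # rather than None, so the result is always iterable.
    if not isdir(dirname):
        return []
    # endswith() accepts a tuple of suffixes (Python 2.5+)
    suffixes = tuple(t.lower() for t in types)
    return sorted(
        normpath(join(dirname, fname))
        for fname in listdir(dirname)
        if fname.lower().endswith(suffixes)
    )
```

If a bad path should be treated as an error instead, raising ValueError in place of the early return would be the other obvious choice.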
-tkc
[1] http://docs.python.org/lib/built-in-funcs.html