walk directory & ignore all files/directories begin with '.'
MRAB
python at mrabarnett.plus.com
Thu May 13 15:10:52 EDT 2010
albert kao wrote:
> I want to walk a directory and ignore all the files or directories
> whose names begin with '.' (e.g. '.svn').
> Then I will process all the files.
> My test program walknodot.py does not do the job yet.
> Python version is 3.1 on windows XP.
> Please help.
>
> [code]
> #!c:/Python31/python.exe -u
> import os
> import re
>
> path = "C:\\test\\com.comp.hw.prod.proj.war\\bin"
> for dirpath, dirs, files in os.walk(path):
>     print ("dirpath " + dirpath)
>     p = re.compile('\\\.(\w)+$')
>     if p.match(dirpath):
>         continue
>     print ("dirpath " + dirpath)
>     for dir in dirs:
>         print ("dir " + dir)
>         if dir.startswith('.'):
>             continue
>
>     print (files)
>     for filename in files:
>         print ("filename " + filename)
>         if filename.startswith('.'):
>             continue
>         print ("dirpath filename " + dirpath + "\\" + filename)
>         # process the files here
> [/code]
>
> C:\python>walknodot.py
> dirpath C:\test\com.comp.hw.prod.proj.war\bin
> dirpath C:\test\com.comp.hw.prod.proj.war\bin
> dir .svn
> dir com
> []
> dirpath C:\test\com.comp.hw.prod.proj.war\bin\.svn
> dirpath C:\test\com.comp.hw.prod.proj.war\bin\.svn
> ...
>
> I do not expect C:\test\com.comp.hw.prod.proj.war\bin\.svn to appear
> twice.
> Please help.
The problem is with your use of the 'match' method, which will look for
a match only at the start of the string. You need to use the 'search'
method instead.
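To make the difference concrete, here is a small sketch. The sample path is
made up for illustration (it is not taken from your output), and the pattern
is simply one that matches a trailing dot-directory:

```python
import re

# Hypothetical sample path, modelled on the one in the question.
dirpath = r"C:\test\bin\.svn"

# Pattern: a backslash, a literal dot, one or more word characters, end of string.
pattern = re.compile(r'\\\.\w+$')

# match() only tries at position 0; the string starts with "C:", so it fails.
matched = pattern.match(dirpath)

# search() scans along the string and finds the trailing "\.svn".
found = pattern.search(dirpath)

print(matched)        # None
print(found.group())  # \.svn
```

Because 'match' never gets past the drive letter, the 'continue' in the
loop is never reached, which is consistent with each dirpath being printed
twice.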
The regular expression is also incorrect. The string literal:
'\\\.(\w)+$'
passes the characters:
\\.(\w)+$
to the re module as the regular expression, which will match a
backslash, then any character, then one or more word characters, then
the end of the string.
What you want is:
\\\.\w+$
(you don't need the parentheses) which is best expressed as the 'raw'
string literal:
r'\\\.\w+$'
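As a rough check of the two expressions, the sketch below compares them on
two made-up paths (the paths are assumptions, not from your run). The first
pattern is written as a raw string containing exactly the characters that
the original literal hands to the re module:

```python
import re

# The characters the original literal passes to re: backslash, ANY char, word chars, end.
original = re.compile(r'\\.(\w)+$')

# The corrected expression: backslash, LITERAL dot, word chars, end.
corrected = re.compile(r'\\\.\w+$')

# Hypothetical sample paths for illustration.
plain_dir = r"C:\test\bin\com"
dot_dir = r"C:\test\bin\.svn"

# The original expression also hits ordinary directories, because the
# unescaped '.' accepts the 'c' in "\com".
print(original.search(plain_dir) is not None)   # True
print(corrected.search(plain_dir) is not None)  # False

# Both find the dot-directory at the end of the path.
print(original.search(dot_dir).group())   # \.svn
print(corrected.search(dot_dir).group())  # \.svn
```

So with 'search' and the original expression you would skip nearly every
directory, not just the ones whose names begin with '.'.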