Python glob and raw string
Neil Cerutti
neilc at norwich.edu
Thu Jan 16 13:14:30 EST 2014
On 2014-01-16, Xaxa Urtiz <urtizvereaxaxa at gmail.com> wrote:
> Hello everybody, i've got a little problem, i've made a script
> which look after some files in some directory, typically my
> folder are organized like this :
>
> [share]
> folder1
> ->20131201
> -->file1.xml
> -->file2.txt
> ->20131202
> -->file9696009.tmp
> -->file421378932.xml
> etc....
> so basically in the share i've got some folder
> (=folder1,folder2.....) and inside these folder i've got these
> folder whose name is the date (20131201,20131202,20131203
> etc...) and inside them i want to find all the xml files.
> So, what i've done is to iterate over all the folder1/2/3 that
> i want and look, for each one, the xml file with that:
>
> for f in glob.glob(dir +r"\20140115\*.xml"):
> ->yield f
>
> dir is the folder1/2/3 everything is ok but i want to do
> something like that :
>
> for i in range(10,16):
> ->for f in glob.glob(dir +r"\201401{0}\*.xml".format(i)):
> -->yield f
>
> but the glob does not find any file.... (and of course there is
> some xml and the old way found them...)
> Any help would be appreciate :)
I've done this two different ways. The simple way is very similar
to what you are now doing. It sucks because I have to manually
maintain the list of subdirectories to traverse every time I
create a new subdir.
Here's the other way, using glob and isdir from os.path, adapted
from actual production code.
class Miner:
def __init__(self, archive):
# setup goes here; prepare to acquire the data
self.descend(os.path.join(archive, '*'))
def descend(self, path):
for fname in glob.glob(os.path.join(path, '*')):
if os.path.isdir(fname):
self.descend(fname)
else:
self.process(fname)
def process(self, path):
# Do what I want done with an actual file path.
# This is where I add to the data.
In your case you might not want to process unless the path also
looks like an xml file.
mine = Miner('myxmldir')
Hmmm... I might be doing too much in __init__. ;)
--
Neil Cerutti
More information about the Python-list
mailing list