Matching Directory Names and Grouping Them
Neil Cerutti
horpner at yahoo.com
Fri Jan 12 11:02:28 EST 2007
On 2007-01-11, J <wilder.usenet at gmail.com> wrote:
> Steve-
>
> Thanks for the reply. I think what I'm trying to say by similar
> is pattern matching. Essentially, walking through a directory
> tree starting at a specified root folder, and returning a list
> of all folders that matches a pattern, in this case, a folder
> name containing a four digit number representing year and a
> subdirectory name containing a two digit number representing a
> month. The matches are grouped together and written into a text
> file. I hope this helps.
Here's a solution using itertools.groupby, just because this is
the first programming problem I've seen that seemed to call for
it. Hooray!
from itertools import groupby
def print_by_date(dirs):
r""" Group a directory list according to date codes.
>>> data = [
... "<root>/Input2/2002/03/",
... "<root>/Input1/2001/01/",
... "<root>/Input3/2005/05/",
... "<root>/Input3/2001/01/",
... "<root>/Input1/2002/03/",
... "<root>/Input3/2005/12/",
... "<root>/Input2/2001/01/",
... "<root>/Input3/2002/03/",
... "<root>/Input2/2005/05/",
... "<root>/Input1/2005/12/"]
>>> print_by_date(data)
<root>/Input1/2001/01/
<root>/Input2/2001/01/
<root>/Input3/2001/01/
<BLANKLINE>
<root>/Input1/2002/03/
<root>/Input2/2002/03/
<root>/Input3/2002/03/
<BLANKLINE>
<root>/Input2/2005/05/
<root>/Input3/2005/05/
<BLANKLINE>
<root>/Input1/2005/12/
<root>/Input3/2005/12/
<BLANKLINE>
"""
def date_key(path):
return path[-7:]
groups = [list(g) for _,g in groupby(sorted(dirs, key=date_key), date_key)]
for g in groups:
print '\n'.join(path for path in sorted(g))
print
if __name__ == "__main__":
import doctest
doctest.testmod()
I really wanted nested join calls for the output, to suppress
that trailing blank line, but I kept getting confused and
couldn't sort it out.
It would better to use the os.path module, but I couldn't find
the function in there lets me pull out path tails.
I didn't filter out stuff that didn't match the date path
convention you used.
--
Neil Cerutti
More information about the Python-list
mailing list