Sorting directory contents

Jussi Salmela tiedon_jano at hotmail.com
Wed Feb 21 07:45:53 EST 2007


Larry Bates wrote:
> Wolfgang Draxinger wrote:
>> Jussi Salmela wrote:
>>
>>> I'm not claiming the following to be more elegant, but I would
>>> do it like this (not tested!):
>>>
>>> src_file_paths = dict()
>>> prefix = sourcedir + os.sep
>>> for fname in os.listdir(sourcedir):
>>>      if match_fname_pattern(fname):
>>>          fpath = prefix + fname
>>>          src_file_paths[os.stat(fpath).st_mtime] = fpath
>>> for ftime in src_file_paths.keys().sort():
>>>          read_and_concatenate(src_file_paths[ftime])
>> Well, neither version, mine or yours, will work as written:
>> both neglect the fact that different files can have the same
>> st_mtime, and that <listtype>.sort() doesn't return a sorted
>> list (it sorts in place and returns None).
>>
>> However this code works (tested) and behaves just like listdir,
>> only that it sorts files chronologically, then alphabetically.
>>
>> def listdir_chrono(dirpath):
>>         import os
>>         files_dict = dict()
>>         for fname in os.listdir(dirpath):
>>                 mtime = os.stat(dirpath+os.sep+fname).st_mtime
>>                 if not mtime in files_dict:
>>                         files_dict[mtime] = list()
>>                 files_dict[mtime].append(fname)
>>         
>>         mtimes = sorted(files_dict.keys())
>>         filenames = list()
>>         for mtime in mtimes:
>>                 fnames = files_dict[mtime]
>>                 fnames.sort()
>>                 for fname in fnames:
>>                         filenames.append(fname)
>>         return filenames
>>
>> Wolfgang Draxinger
> 
> Four suggestions:
> 
> 1) You might want to use os.path.join(dirpath, fname) instead of
>    dirpath+os.sep+fname.
> 
> 2) You may be able to use glob.glob(<pattern>) to filter the files
>    more easily.
> 
> 3) You didn't handle the possibility that there is a subdirectory
>    in the current directory.  You need to check to make sure it is
>    a file you are processing as os.listdir() returns files AND
>    directories.
> 
> 4) If you just put a tuple containing (mtime, filename) in a list
>    each time through the loop, you can just sort that list at the
>    end; it will be sorted by mtime and then alphabetically.
> 
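
On suggestion 2, here is a rough sketch of what the glob.glob filtering
could look like. The function name list_files and the mask parameter are
just placeholders of mine (Larry mentions an optional mask in his
comments); none of it comes from the code posted so far. Note that
glob.glob also returns directories, so the check from suggestion 3 is
still needed.

```python
import glob
import os


def list_files(dirpath, mask="*"):
    """Return full paths of the non-directory entries in dirpath matching mask.

    list_files and mask are hypothetical names used for illustration only.
    """
    pattern = os.path.join(dirpath, mask)
    # glob.glob() matches directories as well as files, so drop them here.
    # A leading "*" in the mask does not match names starting with a dot.
    return [p for p in glob.glob(pattern) if not os.path.isdir(p)]
```

Called as list_files(sourcedir, "*.txt") it would replace both the
os.listdir loop and the separate filename pattern test.
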
> Example (not tested):
> 
> def listdir_chrono(dirpath):
>     import os
>     #
>     # Get a list of full pathnames for all the files in dirpath
>     # and exclude all the subdirectories.  Note: This might be
>     # able to be replaced by glob.glob() to simplify.  I would then
>     # add a second optional parameter: mask="" that would allow me
>     # to pass in a mask.
>     #
>     # List comprehensions are our friend when we are processing
>     # lists of things.
>     #
>     files=[os.path.join(dirpath, x) for x in os.listdir(dirpath)
>            if not os.path.isdir(os.path.join(dirpath, x))]
> 
>     #
>     # Get a list of tuples that contain (mtime, filename) that
>     # I can sort.
>     #
>     flist=[(os.stat(x).st_mtime, x) for x in files]
> 
>     #
>     # Sort them.  Sort will sort on mtime, then on filename
>     #
>     flist.sort()
>     #
>     # Extract a list of the filenames only and return it
>     #
>     return [x[1] for x in flist]
>     #
>     # or if you only want the basenames of the files
>     #
>     #return [os.path.basename(x[1]) for x in flist]
> 
> 
> 
> -Larry Bates
> 

As in Peter Otten's glob.glob variation, Larry's example shortens
considerably if you sort files in place with a key function instead of
building the separate list flist:

	files.sort(key=lambda x:(os.stat(x).st_mtime, x))

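Put together, the whole function might look something like the sketch
below. It keeps Larry's function name and his directory filter; I am
assuming full paths are what you want back, and I have only tried it on
a small scratch directory. The key argument to sort needs Python 2.4 or
later.

```python
import os


def listdir_chrono(dirpath):
    """Return full paths of the non-directory entries in dirpath.

    Entries are ordered by modification time, oldest first; entries with
    the same st_mtime are ordered by path.
    """
    files = [os.path.join(dirpath, x) for x in os.listdir(dirpath)
             if not os.path.isdir(os.path.join(dirpath, x))]
    # The key is the same (mtime, path) tuple that flist held, so ties
    # on st_mtime fall back to comparing the paths.
    files.sort(key=lambda x: (os.stat(x).st_mtime, x))
    return files
```

The key function is called once per element, so each file is still
stat'ed only once, just as with the explicit list of tuples.
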
Cheers,
Jussi
