Sorting directory contents

Larry Bates lbates at websafe.com
Tue Feb 20 10:00:56 EST 2007


Wolfgang Draxinger wrote:
> Jussi Salmela wrote:
> 
>> I'm not claiming the following to be more elegant, but I would
>> do it like this (not tested!):
>>
>> src_file_paths = dict()
>> prefix = sourcedir + os.sep
>> for fname in os.listdir(sourcedir):
>>      if match_fname_pattern(fname):
>>          fpath = prefix + fname
>>          src_file_paths[os.stat(fpath).st_mtime] = fpath
>> for ftime in src_file_paths.keys().sort():
>>          read_and_concatenate(src_file_paths[ftime])
> 
> Well, both versions, mine and yours, won't work as written,
> as they neglect the fact that different files can have
> the same st_mtime and that <listtype>.sort() doesn't return a
> sorted list.
> 
> However this code works (tested) and behaves just like listdir,
> only that it sorts files chronologically, then alphabetically.
> 
> def listdir_chrono(dirpath):
>         import os
>         files_dict = dict()
>         for fname in os.listdir(dirpath):
>                 mtime = os.stat(dirpath+os.sep+fname).st_mtime
>                 if not mtime in files_dict:
>                         files_dict[mtime] = list()
>                 files_dict[mtime].append(fname)
>         
>         mtimes = files_dict.keys()
>         mtimes.sort()
>         filenames = list()
>         for mtime in mtimes:
>                 fnames = files_dict[mtime]
>                 fnames.sort()
>                 for fname in fnames:
>                         filenames.append(fname)
>         return filenames
> 
> Wolfgang Draxinger

Four suggestions:

1) You might want to use os.path.join(dirpath, fname) instead of
   dirpath+os.sep+fname.

2) You may be able to use glob.glob(<pattern>) to filter the files
   more easily.

3) You didn't handle the possibility that there is a subdirectory
   in the current directory.  You need to check to make sure it is
   a file you are processing as os.listdir() returns files AND
   directories.

4) If you just put a tuple containing (mtime, filename) in a list
   each time through the loop, you can just sort that list at the
   end; it will be sorted by mtime and then alphabetically.
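
Point 4 relies on how Python compares tuples: element by element, left
to right. A minimal sketch of just that behaviour, using made-up mtime
values and filenames (none of them come from the thread):

```python
# Hypothetical (mtime, filename) pairs; the values are invented for illustration.
entries = [
    (1171983656.0, "b.txt"),
    (1171980000.0, "z.txt"),
    (1171983656.0, "a.txt"),
]

# list.sort() sorts in place and returns None, so call it as a statement.
# Tuples compare on the first element, then the second, so files that
# share an mtime end up in alphabetical order.
entries.sort()

names = [name for mtime, name in entries]
print(names)  # ['z.txt', 'a.txt', 'b.txt']
```

The two files that share an mtime come out as a.txt, then b.txt, which
is the "chronologically, then alphabetically" ordering asked for.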

Example (not tested):

def listdir_chrono(dirpath):
    import os
    #
    # Get a list of full pathnames for all the files in dirpath
    # and exclude all the subdirectories.  Note: This might be
    # able to be replaced by glob.glob() to simplify.  I would then
    # add a second optional parameter: mask="" that would allow me
    # to pass in a mask.
    #
    # List comprehensions are our friend when we are processing
    # lists of things.
    #
    files=[os.path.join(dirpath, x) for x in os.listdir(dirpath)
           if not os.path.isdir(os.path.join(dirpath, x))]

    #
    # Get a list of tuples that contain (mtime, filename) that
    # I can sort.
    #
    flist=[(os.stat(x).st_mtime, x) for x in files]

    #
    # Sort them.  Sort will sort on mtime, then on filename
    #
    flist.sort()
    #
    # Extract a list of the filenames only and return it
    #
    return [x[1] for x in flist]
    #
    # or if you only want the basenames of the files
    #
    #return [os.path.basename(x[1]) for x in flist]
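
The comments in the function mention possibly using glob.glob() with an
optional mask parameter. One way that might look is sketched below. The
name listdir_chrono_glob and the default mask of "*" are my own
assumptions, not something from the thread. It also keeps only regular
files via os.path.isfile(), which is slightly stricter than "not a
directory".

```python
import glob
import os


def listdir_chrono_glob(dirpath, mask="*"):
    """Return paths of regular files in dirpath matching mask, oldest first.

    Hypothetical variant of listdir_chrono; name and default mask are
    assumptions made for this sketch.
    """
    # glob.glob() returns paths that already include dirpath, so no
    # separate os.path.join() per filename is needed afterwards.
    paths = [p for p in glob.glob(os.path.join(dirpath, mask))
             if os.path.isfile(p)]
    # Sort on the tuple (mtime, path): files with equal mtime fall back
    # to alphabetical order of their path.
    return sorted(paths, key=lambda p: (os.stat(p).st_mtime, p))
```

A call such as listdir_chrono_glob(sourcedir, "*.txt") would then do the
filtering and the sorting in one step. Note that the "*" pattern does
not match names starting with a dot, so hidden files are skipped, unlike
with os.listdir().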



-Larry Bates



