Sorting directory contents
Jussi Salmela
tiedon_jano at hotmail.com
Wed Feb 21 07:45:53 EST 2007
Larry Bates wrote:
> Wolfgang Draxinger wrote:
>> Jussi Salmela wrote:
>>
>>> I'm not claiming the following to be more elegant, but I would
>>> do it like this (not tested!):
>>>
>>> src_file_paths = dict()
>>> prefix = sourcedir + os.sep
>>> for fname in os.listdir(sourcedir):
>>>     if match_fname_pattern(fname):
>>>         fpath = prefix + fname
>>>         src_file_paths[os.stat(fpath).st_mtime] = fpath
>>> for ftime in src_file_paths.keys().sort():
>>>     read_and_concatenate(src_file_paths[ftime])
>> Well, neither version, mine nor yours, will work as written:
>> both neglect the fact that different files can have the same
>> st_mtime, and that <listtype>.sort() doesn't return a sorted
>> list (it sorts in place and returns None).
>>
>> However, this code works (tested) and behaves just like listdir,
>> except that it sorts files chronologically, then alphabetically.
>>
>> def listdir_chrono(dirpath):
>>     import os
>>     files_dict = dict()
>>     for fname in os.listdir(dirpath):
>>         mtime = os.stat(dirpath+os.sep+fname).st_mtime
>>         if not mtime in files_dict:
>>             files_dict[mtime] = list()
>>         files_dict[mtime].append(fname)
>>
>>     mtimes = list(files_dict.keys())
>>     mtimes.sort()
>>     filenames = list()
>>     for mtime in mtimes:
>>         fnames = files_dict[mtime]
>>         fnames.sort()
>>         for fname in fnames:
>>             filenames.append(fname)
>>     return filenames
>>
>> Wolfgang Draxinger
>
> Four suggestions:
>
> 1) You might want to use os.path.join(dirpath, fname) instead of
> dirpath+os.sep+fname.
>
> 2) You may be able to use glob.glob(<pattern>) to filter the files
> more easily.
>
> 3) You didn't handle the possibility that there is a subdirectory
> in the current directory. You need to check to make sure it is
> a file you are processing as os.listdir() returns files AND
> directories.
>
> 4) If you just put a tuple containing (mtime, filename) in a list
> each time through the loop, you can just sort that list at the
> end; it will be sorted by mtime and then alphabetically.
>
> Example (not tested):
>
> def listdir_chrono(dirpath):
>     import os
>     #
>     # Get a list of full pathnames for all the files in dirpath
>     # and exclude all the subdirectories. Note: This might be
>     # able to be replaced by glob.glob() to simplify. I would then
>     # add a second optional parameter: mask="" that would allow me
>     # to pass in a mask.
>     #
>     # List comprehensions are our friend when we are processing
>     # lists of things.
>     #
>     files=[os.path.join(dirpath, x) for x in os.listdir(dirpath)
>            if not os.path.isdir(os.path.join(dirpath, x))]
>
>     #
>     # Get a list of tuples that contain (mtime, filename) that
>     # I can sort.
>     #
>     flist=[(os.stat(x).st_mtime, x) for x in files]
>
>     #
>     # Sort them. Sort will sort on mtime, then on filename
>     #
>     flist.sort()
>     #
>     # Extract a list of the filenames only and return it
>     #
>     return [x[1] for x in flist]
>     #
>     # or if you only want the basenames of the files
>     #
>     #return [os.path.basename(x[1]) for x in flist]
>
>
>
> -Larry Bates
>
And as in Peter Otten's glob.glob variation, this shortens considerably
by using sort with key instead of a separate list flist:
files.sort(key=lambda x:(os.stat(x).st_mtime, x))
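
Spelled out as a complete function, that might look roughly like the
sketch below. It reuses the name listdir_chrono and the directory
filter from Larry's example; both are carried over as assumptions, and
the key argument needs Python 2.4 or later.

```python
import os

def listdir_chrono(dirpath):
    """Return full paths of the non-directory entries in dirpath,
    oldest first, with ties broken by path name."""
    files = [os.path.join(dirpath, name) for name in os.listdir(dirpath)]
    # os.listdir() returns files AND directories, so drop the latter.
    files = [path for path in files if not os.path.isdir(path)]
    # The key is an (mtime, path) tuple, so entries with equal mtimes
    # fall back to alphabetical order, just as with the tuple list.
    files.sort(key=lambda path: (os.stat(path).st_mtime, path))
    return files
```

Note that the sort happens in place and the function returns the list
itself, not the result of sort().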
Cheers,
Jussi