Help sorting a list by file extension

Bengt Richter bokr at oz.net
Thu Aug 11 21:07:59 EDT 2005


On Fri, 12 Aug 2005 00:06:17 GMT, Peter A. Schott <paschott at no.yahoo.spamm.com> wrote:

>Trying to operate on a list of files similar to this:
>
>test.1
>test.2
>test.3
>test.4
>test.10
>test.15
>test.20
>
>etc.
>
>I want to sort them in numeric order instead of string order.  I'm starting with
>this code:
>
>import os
>
>for filename in [filename for filename in os.listdir(os.getcwd())]:
>	print filename
>	#Write to file, but with filenames sorted by extension
>
>
>Desired result is a file containing something like:
>C:\MyFolder\test.1,test.001
>C:\MyFolder\test.2,test.002
>C:\MyFolder\test.3,test.003
>C:\MyFolder\test.4,test.004
>C:\MyFolder\test.10,test.010
>C:\MyFolder\test.15,test.015
>C:\MyFolder\test.20,test.020
>
>I need to order by that extension for the file output.
>
>I know I've got to be missing something pretty simple, but am not sure what.
>Does anyone have any ideas on what I'm missing?
>
>Thanks.
>
Decorate with the integer value, sort, undecorate. E.g.,

 >>> namelist = """\
 ... test.1
 ... test.2
 ... test.3
 ... test.4
 ... test.10
 ... test.15
 ... test.20
 ... """.splitlines()
 >>> namelist
 ['test.1', 'test.2', 'test.3', 'test.4', 'test.10', 'test.15', 'test.20']

Reverse the list first, just to show that the sort is actually doing something:
 >>> namelist.reverse()
 >>> namelist
 ['test.20', 'test.15', 'test.10', 'test.4', 'test.3', 'test.2', 'test.1']

The generator expression inside sorted() makes a sequence of tuples like (20, 'test.20'),
(15, 'test.15'), etc., and sorted() orders them by the integer. The list comprehension
around it then takes the name back out of each sorted (dec, name) tuple.

 >>> [name for dec,name in sorted((int(nm.rsplit('.',1)[1]),nm) for nm in namelist)]
 ['test.1', 'test.2', 'test.3', 'test.4', 'test.10', 'test.15', 'test.20']

The lexical sort, for comparison:
 >>> sorted(namelist)
 ['test.1', 'test.10', 'test.15', 'test.2', 'test.20', 'test.3', 'test.4']
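
The same numeric ordering can probably also be had without building the tuples by hand,
by passing a key function to sorted(). This is only a sketch: ext_number is a name made
up for illustration, and it assumes Python 2.4 or later (where sorted() accepts key=)
and that every name ends in '.' plus an integer.

```python
def ext_number(name):
    # Integer value of the text after the last '.', e.g. 'test.10' -> 10.
    # Hypothetical helper; raises ValueError/IndexError on names that don't fit.
    return int(name.rsplit('.', 1)[1])

namelist = ['test.20', 'test.15', 'test.10', 'test.4', 'test.3', 'test.2', 'test.1']

# sorted() calls ext_number once per name and orders the names by the results.
by_ext = sorted(namelist, key=ext_number)
print(by_ext)
```

One difference from the tuple version: when two names have the same number, the tuples
fall back to comparing the names, while the key version keeps them in their input order.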

This depends on every name ending in a '.' followed by an integer (the rsplit takes
whatever follows the last '.'), but that should be the case for you, I think, if you
make sure you eliminate directory names and file names that don't end that way. You can
look before you leap or catch the conversion exceptions, but to do that you'll need a
loop instead of a list comprehension.
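
A loop that catches the conversion exceptions might look roughly like the sketch below.
The function names (numeric_ext_sorted, padded_name, listing_lines) and the width of 3
are assumptions made for illustration; the 'path,padded name' line layout is guessed
from the desired output in the question.

```python
import os

def numeric_ext_sorted(names):
    """Return the names ending in '.<integer>', sorted by that integer."""
    decorated = []
    for name in names:
        parts = name.rsplit('.', 1)
        if len(parts) != 2:
            continue                    # no '.' at all, nothing to convert
        try:
            decorated.append((int(parts[1]), name))
        except ValueError:
            continue                    # extension is not an integer, e.g. 'notes.txt'
    decorated.sort()                    # sort on the integer, then undecorate
    return [name for dec, name in decorated]

def padded_name(name, width=3):
    """'test.10' -> 'test.010' (zero-pad the numeric extension)."""
    base, ext = name.rsplit('.', 1)
    return '%s.%0*d' % (base, width, int(ext))

def listing_lines(folder):
    """Build 'full_path,padded_name' lines for the plain files in folder."""
    files = [n for n in os.listdir(folder)
             if os.path.isfile(os.path.join(folder, n))]   # drop directory names
    return ['%s,%s' % (os.path.join(folder, n), padded_name(n))
            for n in numeric_ext_sorted(files)]

mixed = ['test.10', 'README', 'test.2', 'notes.txt', 'test.1']
print(numeric_ext_sorted(mixed))
print(padded_name('test.10'))
```

Calling listing_lines(os.getcwd()) and writing each returned line to the output file
should give something close to the listing asked for, with names that don't fit the
pattern silently skipped.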

Regards,
Bengt Richter


