[Tutor] Sorting numbers

Jeff Shannon jeff@ccvcorp.com
Tue Feb 4 14:58:02 2003


Adam Vardy wrote:

>If I'd like to sort a simple kinda list like:
>
>45K
>100K
>3.4Meg
>17K
>300K
>9.3Meg
>
>How do you suppose I can approach it?
>

Are your suffixes (K, Meg, etc.) standardized?  If so, you can use that
to separate your list into sublists, sort each sublist, and then present
the sublists in appropriate order.  When sorting sublists, you'll have
to be careful, though -- you're dealing with strings, and you want them
sorted in numeric order instead of alphabetical order.  (In alphabetical
order, '35' comes after '300', because strings are compared character
by character.)
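
To see that difference for yourself, here's a minimal sketch.  It assumes
a current Python 3 interpreter (sorted() and its key= argument are not
used anywhere else in this message), and the name sizes is made up for
illustration:

```python
# Plain numeric strings, no suffixes yet.
sizes = ['300', '35', '100', '17']

# The default sort compares the strings character by character.
alphabetical = sorted(sizes)

# Converting to float only for the comparison gives numeric order,
# while the result still holds the original strings.
numeric = sorted(sizes, key=float)

print(alphabetical)
print(numeric)
```

The first list comes out as '100', '17', '300', '35'; the second as
'17', '35', '100', '300'.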

I'd split each string into a numeric part and a suffix.  Use the suffix 
as a dictionary key, and add the numeric part to a list pointed to by 
that key.  Then, when you sort each list, convert each numeric string to 
a float for sorting comparisons -- but *only* use the float for sorting, 
and then display the original string value.

Let's start by writing a function that'll split the suffix from the
numeric part:

>>> def splitsuffix(item):
...     for suffix in ['K', 'Meg', 'Gig']:
...         if item.endswith(suffix):
...             trim = -( len(suffix) )
...             numpart = item[:trim]
...             return (numpart, suffix)
...     # If no suffix matches, we have a plain number
...     return (item, '')
...
>>>

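Here's a quick check of what splitsuffix() hands back, as a sketch in
script form for Python 3 (an assumption on my part).  The sample inputs
come from the list in the question, except '2Gig', which is a made-up
extra to exercise the third suffix:

```python
def splitsuffix(item):
    # Same logic as the interactive definition, written more compactly.
    for suffix in ['K', 'Meg', 'Gig']:
        if item.endswith(suffix):
            return (item[:-len(suffix)], suffix)
    # If no suffix matches, we have a plain number
    return (item, '')

samples = ['45K', '3.4Meg', '512', '2Gig']
pairs = [splitsuffix(s) for s in samples]
for sample, pair in zip(samples, pairs):
    print(sample, '->', pair)
```

Note that the numeric part is still a string at this point; nothing has
been converted yet.
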
Now, we'll take our raw data, and process that into a dictionary using
our splitsuffix() function:

>>> rawdata
['45K', '100K', '3.4Meg', '17K', '300K', '9.3Meg', '512', '23Meg']
>>> srt = {}
>>> for item in rawdata:
...     num, suffix = splitsuffix(item)
...     value = srt.get(suffix, [])
...     value.append(num)
...     srt[suffix] = value
...
>>> srt
{'': ['512'], 'K': ['45', '100', '17', '300'], 'Meg': ['3.4', '9.3', '23']}
>>>

Now, we're going to need to sort our strings in numeric order, so let's
define a quick comparison function that'll do that:

>>> def sortfunc(a, b):
...     return cmp( float(a), float(b) )
...
>>>

Now we're set to grab each sublist, sort it, and then display it.

>>> for key in ['', 'K', 'Meg', 'Gig']:
...     values = srt.get(key, [])
...     values.sort(sortfunc)
...     for item in values:
...         print '%5s%s' % (item, key)
...
  512
   17K
   45K
  100K
  300K
  3.4Meg
  9.3Meg
   23Meg
>>>

That looks like the order we want!
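
The session above relies on cmp(), a comparison function passed to
list.sort(), and the print statement, none of which exist in Python 3.
If you're on Python 3, a rough equivalent of the whole sublist approach
might look like the sketch below.  The use of setdefault(), sorted()
with key=float, and the name ordered are my own choices here, not part
of the steps above:

```python
rawdata = ['45K', '100K', '3.4Meg', '17K', '300K', '9.3Meg', '512', '23Meg']

def splitsuffix(item):
    for suffix in ['K', 'Meg', 'Gig']:
        if item.endswith(suffix):
            return (item[:-len(suffix)], suffix)
    # If no suffix matches, we have a plain number
    return (item, '')

# Group the numeric strings by suffix.
srt = {}
for item in rawdata:
    num, suffix = splitsuffix(item)
    srt.setdefault(suffix, []).append(num)

# Walk the suffixes from smallest to largest; within each group, compare
# as floats but keep the original strings for display.
ordered = []
for key in ['', 'K', 'Meg', 'Gig']:
    for num in sorted(srt.get(key, []), key=float):
        ordered.append(num + key)
        print('%5s%s' % (num, key))
```

The key=float argument plays the role of sortfunc(): the float is only
used for comparing, and the strings themselves are left alone.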

An alternative approach would be to write a function that converts, say,
'17K' to the numeric value 17000, and '3.4Meg' to 3,400,000.  Then you
could sort your raw data based on the results of that function.

A good way to convert these values would be to make a dictionary that 
links a given suffix to a multiplier.  Then you can separate the numeric 
part from the suffix (we already know how to do that), use the suffix to 
get the multiplier, do the math and return the result.  And once we have 
a function to expand these numbers, we can simply write a comparison 
function that uses the expanded numbers for sorting.

>>> rawdata
['45K', '100K', '3.4Meg', '17K', '300K', '9.3Meg', '512', '23Meg']
>>> suff = { '':1, 'K':1000, 'Meg':1000000, 'Gig':1000000000 }
>>> def expand(item, suffixes = suff):
...     numpart, suffixpart = splitsuffix(item)
...     multiplier = suffixes[suffixpart]
...     return float(numpart) * multiplier
...
>>> def sortfunc(a, b):
...     return cmp(expand(a), expand(b))
...
>>> rawdata.sort(sortfunc)
>>> rawdata
['512', '17K', '45K', '100K', '300K', '3.4Meg', '9.3Meg', '23Meg']
>>>

Here I've sorted the data in-place.  If you need to leave the original
data alone for whatever reason, you can simply make a copy of the list
(sortedlist = rawdata[:]) and then sort the copy.
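
For Python 3, where a comparison function can no longer be passed to
sort(), here's a hedged sketch of the same idea.  It assumes the
built-in sorted() with a key function; the names SUFFIXES and by_size
are mine, and expand() is folded together with the suffix split to keep
the example short:

```python
SUFFIXES = {'': 1, 'K': 1000, 'Meg': 1000000, 'Gig': 1000000000}

def expand(item):
    """Turn a string like '17K' into a float so sizes can be compared."""
    for suffix, multiplier in SUFFIXES.items():
        # Skip the empty suffix here; every string "ends with" ''.
        if suffix and item.endswith(suffix):
            return float(item[:-len(suffix)]) * multiplier
    # No suffix matched, so this is a plain number.
    return float(item)

rawdata = ['45K', '100K', '3.4Meg', '17K', '300K', '9.3Meg', '512', '23Meg']

# sorted() builds a new list and leaves rawdata as it was.
by_size = sorted(rawdata, key=expand)

print(by_size)
print(rawdata)
```

If you do want the in-place behaviour, rawdata.sort(key=expand) should
give the same ordering without making a copy.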

Jeff Shannon
Technician/Programmer
Credit International