[Tutor] Sorting numbers
Jeff Shannon
jeff@ccvcorp.com
Tue Feb 4 14:58:02 2003
Adam Vardy wrote:
>If I'd like to sort a simple kinda list like:
>
>45K
>100K
>3.4Meg
>17K
>300K
>9.3Meg
>
>How do you suppose I can approach it?
>
Are your suffixes (K, Meg, etc.) standardized? If so, you can use that
to separate your list into sublists, sort each sublist, and then present
the sublists in appropriate order. When sorting sublists, you'll have
to be careful, though -- you're dealing with strings, and you want them
sorted in numeric order instead of alphabetical order. (In alphabetical
order, '35' comes after '300'.)
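To see the difference concretely, here is a minimal sketch. It assumes Python 3, and the sample values are made up for illustration rather than taken from the data above:

```python
# Illustrative values only; not from the original data.
nums = ['35', '300', '4', '120']

# The default string sort compares character by character.
alphabetical = sorted(nums)

# Using float as the sort key gives numeric order, while the
# list itself still holds the original strings.
numeric = sorted(nums, key=float)

print(alphabetical)
print(numeric)
```

The second call only uses the float for comparison, so the strings come back unchanged.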
I'd split each string into a numeric part and a suffix. Use the suffix
as a dictionary key, and add the numeric part to a list pointed to by
that key. Then, when you sort each list, convert each numeric string to
a float for sorting comparisons -- but *only* use the float for sorting,
and then display the original string value.
Let's start by writing a function that'll split the suffix from the
numeric part:
>>> def splitsuffix(item):
...     for suffix in ['K', 'Meg', 'Gig']:
...         if item.endswith(suffix):
...             trim = -( len(suffix) )
...             numpart = item[:trim]
...             return (numpart, suffix)
...     # If no suffix matches, we have a plain number
...     return (item, '')
...
>>>
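One thing worth checking is what happens with a suffix that isn't in the list. The sketch below repeats the same logic as splitsuffix() in a form that also runs under Python 3; the 'Tb' suffix and the is_number() helper are made up for illustration.

```python
def splitsuffix(item):
    # Same logic as the function above, written compactly.
    for suffix in ['K', 'Meg', 'Gig']:
        if item.endswith(suffix):
            return (item[:-len(suffix)], suffix)
    # If no suffix matches, the whole string is treated as a plain number.
    return (item, '')

def is_number(text):
    # Hypothetical helper: True if float() accepts the string.
    try:
        float(text)
        return True
    except ValueError:
        return False

known = splitsuffix('3.4Meg')
plain = splitsuffix('512')
unknown = splitsuffix('5Tb')   # 'Tb' is not in the suffix list

print(known, plain, unknown)
print(is_number(unknown[0]))
```

An unrecognized suffix falls through as a "plain number", so a later float() call on it would raise ValueError. If your data might contain other suffixes, add them to the list.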
Now, we'll take our raw data, and process that into a dictionary using
our splitsuffix() function:
>>> rawdata
['45K', '100K', '3.4Meg', '17K', '300K', '9.3Meg', '512', '23Meg']
>>> srt = {}
>>> for item in rawdata:
...     num, suffix = splitsuffix(item)
...     value = srt.get(suffix, [])
...     value.append(num)
...     srt[suffix] = value
...
>>> srt
{'': ['512'], 'K': ['45', '100', '17', '300'], 'Meg': ['3.4', '9.3', '23']}
>>>
Next, we need to sort our strings in numeric order, so let's define a
quick comparison function that'll do that:
>>> def sortfunc(a, b):
...     return cmp( float(a), float(b) )
...
>>>
Now we're set to grab each sublist, sort it, and then display it.
>>> for key in ['', 'K', 'Meg', 'Gig']:
...     values = srt.get(key, [])
...     values.sort(sortfunc)
...     for item in values:
...         print '%5s%s' % (item, key)
...
  512
   17K
   45K
  100K
  300K
  3.4Meg
  9.3Meg
   23Meg
>>>
That looks like the order we want!
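The session above relies on cmp(), list.sort() with a comparison function, and the print statement, none of which exist in Python 3. As a rough sketch of the same grouping approach for Python 3 (my assumption about what a modern version might look like, using setdefault() and key=float in place of the original calls):

```python
def splitsuffix(item):
    for suffix in ['K', 'Meg', 'Gig']:
        if item.endswith(suffix):
            return (item[:-len(suffix)], suffix)
    return (item, '')

rawdata = ['45K', '100K', '3.4Meg', '17K', '300K', '9.3Meg', '512', '23Meg']

# Group the numeric strings by suffix.
groups = {}
for item in rawdata:
    num, suffix = splitsuffix(item)
    groups.setdefault(suffix, []).append(num)

# Walk the suffixes from smallest to largest, sorting each
# group numerically, and rebuild the original strings.
result = []
for key in ['', 'K', 'Meg', 'Gig']:
    for num in sorted(groups.get(key, []), key=float):
        result.append(num + key)

print(result)
```

The float is used only as the sort key, so the strings are displayed exactly as they were entered.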
An alternative approach would be to write a function that converts, say,
'17K' to the numeric value 17000, and '3.4Meg' to 3400000. Then you
could sort your raw data based on the results of that function.
A good way to convert these values would be to make a dictionary that
links a given suffix to a multiplier. Then you can separate the numeric
part from the suffix (we already know how to do that), use the suffix to
get the multiplier, do the math and return the result. And once we have
a function to expand these numbers, we can simply write a comparison
function that uses the expanded numbers for sorting.
>>> rawdata
['45K', '100K', '3.4Meg', '17K', '300K', '9.3Meg', '512', '23Meg']
>>> suff = { '':1, 'K':1000, 'Meg':1000000, 'Gig':1000000000 }
>>> def expand(item, suffixes = suff):
...     numpart, suffixpart = splitsuffix(item)
...     multiplier = suffixes[suffixpart]
...     return float(numpart) * multiplier
...
>>> def sortfunc(a, b):
...     return cmp(expand(a), expand(b))
...
>>> rawdata.sort(sortfunc)
>>> rawdata
['512', '17K', '45K', '100K', '300K', '3.4Meg', '9.3Meg', '23Meg']
>>>
Here I've sorted the data in-place. If you need to leave the original
data alone for whatever reason, you can simply make a copy of the list
(sortedlist = rawdata[:]) and then sort the copy.
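In later Python versions (2.4 and up, including Python 3) the built-in sorted() returns a new list and accepts a key function, so the copy-then-sort step and the cmp-style wrapper can be folded into one call. A minimal sketch under that assumption, with the suffix splitting inlined and the dictionary renamed SUFFIXES for this example:

```python
# Multiplier for each suffix, as in the suff dictionary above.
SUFFIXES = {'': 1, 'K': 1000, 'Meg': 1000000, 'Gig': 1000000000}

def expand(item):
    # Convert a string such as '17K' to its numeric value as a float.
    for suffix in ['K', 'Meg', 'Gig']:
        if item.endswith(suffix):
            return float(item[:-len(suffix)]) * SUFFIXES[suffix]
    # No suffix matched, so this is a plain number.
    return float(item)

rawdata = ['45K', '100K', '3.4Meg', '17K', '300K', '9.3Meg', '512', '23Meg']

# sorted() leaves rawdata untouched and returns a new, sorted list.
sortedlist = sorted(rawdata, key=expand)

print(sortedlist)
print(rawdata)
```

With a key function, expand() is called once per item rather than twice per comparison.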
Jeff Shannon
Technician/Programmer
Credit International