[Tutor] methods of sorting

Steven D'Aprano steve at pearwood.info
Wed Apr 23 03:36:51 CEST 2014

Hi Patti,

My answers below, interleaved between your questions.

On Tue, Apr 22, 2014 at 04:18:38PM -0700, Patti Scott wrote:

> I'm practicing with lists.  I was looking for documentation on sorting 
> with cmp() because it isn't immediately clear to me how comparing 
> items two at a time can sort the entire list.  Identify max or min 
> values, yes, but not sort the whole list.  So, the Sorting HOW TO 
> (Dalke, Hettinger)  posted on python.org goes into detail on using a 
> key parameter for sorted() and .sort(), and using operator module 
> functions. 

Think about how you might sort four items 42, 23, 57, 30. There are many 
different ways to sort, and this is one of the least efficient, but 
easiest to understand. We start by putting the unsorted items on the 
left, and the sorted items on the right, as if we were sorting a handful 
of playing cards:

[42, 23, 57, 30] []

Take the first item from the left, and find where it belongs on the 
right. Since the right is currently empty, that's easy:

[23, 57, 30] [42]

Now take the next item from the left, and find where it belongs on the 
right. How do you do that? By comparing it to each item already there. 
If it compares less than the item, insert it just before the item; 
otherwise keep going.

In this case, we compare 23 < 42, which returns True, so we insert 23 to 
the left of 42.

[57, 30] [23, 42]

Now repeat with the next item. In this case, 57 < 23 returns False, so 
we continue. 57 < 42 also returns False, and there are no more numbers 
to check so we put 57 at the end:

[30] [23, 42, 57]

Finally we compare 30 < 23, which returns False, then 30 < 42, which 
returns True, so we insert 30 just to the left of 42:

[] [23, 30, 42, 57]

Now that we know how to sort using "less than" < as the comparison 
function, we can use some other comparison function that works 
similarly. Instead of using < we can use the built-in function 
cmp(a, b), which returns -1 if a < b, 0 if a == b, and +1 if a > b.

Or instead of using the built-in cmp function, we can use any function 
that takes two arguments, the items to be compared, and returns one of 
-1, 0 or 1.

Even though the Python list.sort() method is a lot faster and more 
clever than what I show above, it too allows you to provide a custom 
comparison function to decide which comes earlier or later when sorting.
Here's an example with and without a comparison function:

py> sorted(['dog', 'aardvark', 'chicken', 'horse'])
['aardvark', 'chicken', 'dog', 'horse']

py> sorted(['dog', 'aardvark', 'chicken', 'horse'], 
...   lambda a, b: cmp(len(a), len(b)))
['dog', 'horse', 'chicken', 'aardvark']

In the second case, we sort by the length of the words, not the content 
of the word. So "dog" (three letters) compares less than "aardvark" 
(eight letters). A couple of other notes:

- Rather than define a comparison function using def, I use lambda as 
  a shortcut. lambda creates a function, but limited only to a single
  expression. So "lambda a, b: cmp(len(a), len(b))" is equivalent to:

  def function(a, b):
      return cmp(len(a), len(b))

- Notice that I use the built-in cmp function inside my comparison 
  function. That's just for convenience, you don't have to do that.

> How obsolete are the cmp() and the decorate-sort-undecorate methods?  
> To be understood but probably not used in new code?

Both are very obsolute, but for different reasons.

The problem with using a comparison function is that it is very 
inefficient and it can really slow down sorting of large lists by a lot. 
It is better to use the DSU idiom rather than call a comparison 
function. In fact, that is so much better, that recent versions of 
Python make the DSU idiom built-in: that's what the "key" argument to 
the sort() and sorted() functions is for. When you supply a key function 
to sort, it internally uses the DSU idiom. You almost never need to use 
it yourself.

Using the key function is so much better than using a comparison 
function that in Python 3 the comparison function was dropped 
altogether and using key is the only way to customize sorting.

> Python Programming, Zelle;  Python 2.7.3,  PowerShell, Notepad ++
> I tried several means of sorting for exercises, eg
[lots of code shown]

I'm sorry, did you have a question about the sorting code or were you 
just sharing it with us? If you're asking which should be preferred, I 
would prefer the version using the key=... argument to sort.


More information about the Tutor mailing list