# [Tutor] methods of sorting

Steven D'Aprano steve at pearwood.info
Wed Apr 23 03:36:51 CEST 2014

```Hi Patti,

On Tue, Apr 22, 2014 at 04:18:38PM -0700, Patti Scott wrote:

> I'm practicing with lists.  I was looking for documentation on sorting
> with cmp() because it isn't immediately clear to me how comparing
> items two at a time can sort the entire list.  Identify max or min
> values, yes, but not sort the whole list.  So, the Sorting HOW TO
> (Dalke, Hettinger)  posted on python.org goes into detail on using a
> key parameter for sorted() and .sort(), and using operator module
> functions.

Think about how you might sort four items 42, 23, 57, 30. There are many
different ways to sort, and this is one of the least efficient, but
easiest to understand. We start by putting the unsorted items on the
left, and the sorted items on the right, as if we were sorting a handful
of playing cards:

[42, 23, 57, 30] []

Take the first item from the left, and find where it belongs on the
right. Since the right is currently empty, that's easy:

[23, 57, 30] [42]

Now take the next item from the left, and find where it belongs on the
right. How do you do that? By comparing it to each item already there.
If it compares less than the item, insert it just before the item;
otherwise keep going.

In this case, we compare 23 < 42, which returns True, so we insert 23 to
the left of 42.

[57, 30] [23, 42]

Now repeat with the next item. In this case, 57 < 23 returns False, so
we continue. 57 < 42 also returns False, and there are no more numbers
to check so we put 57 at the end:

[30] [23, 42, 57]

Finally we compare 30 < 23, which returns False, then 30 < 42, which
returns True, so we insert 30 just to the left of 42:

[] [23, 30, 42, 57]

Now that we know how to sort using "less than" < as the comparison
function, we can use some other comparison function that works
similarly. Instead of using < we can use the built-in function
cmp(a, b), which returns -1 if a < b, 0 if a == b, and +1 if a > b.

Or instead of using the built-in cmp function, we can use any function
that takes two arguments, the items to be compared, and returns one of
-1, 0 or 1.

Even though the Python list.sort() method is a lot faster and more
clever than what I show above, it too allows you to provide a custom
comparison function to decide which comes earlier or later when sorting.
Here's an example with and without a comparison function:

py> sorted(['dog', 'aardvark', 'chicken', 'horse'])
['aardvark', 'chicken', 'dog', 'horse']

py> sorted(['dog', 'aardvark', 'chicken', 'horse'],
...   lambda a, b: cmp(len(a), len(b)))
['dog', 'horse', 'chicken', 'aardvark']

In the second case, we sort by the length of the words, not the content
of the word. So "dog" (three letters) compares less than "aardvark"
(eight letters). A couple of other notes:

- Rather than define a comparison function using def, I use lambda as
a shortcut. lambda creates a function, but limited only to a single
expression. So "lambda a, b: cmp(len(a), len(b))" is equivalent to:

def function(a, b):
return cmp(len(a), len(b))

- Notice that I use the built-in cmp function inside my comparison
function. That's just for convenience, you don't have to do that.

> How obsolete are the cmp() and the decorate-sort-undecorate methods?
> To be understood but probably not used in new code?

Both are very obsolute, but for different reasons.

The problem with using a comparison function is that it is very
inefficient and it can really slow down sorting of large lists by a lot.
It is better to use the DSU idiom rather than call a comparison
function. In fact, that is so much better, that recent versions of
Python make the DSU idiom built-in: that's what the "key" argument to
the sort() and sorted() functions is for. When you supply a key function
to sort, it internally uses the DSU idiom. You almost never need to use
it yourself.

Using the key function is so much better than using a comparison
function that in Python 3 the comparison function was dropped
altogether and using key is the only way to customize sorting.

> Python Programming, Zelle;  Python 2.7.3,  PowerShell, Notepad ++
>
> I tried several means of sorting for exercises, eg
[lots of code shown]

I'm sorry, did you have a question about the sorting code or were you
just sharing it with us? If you're asking which should be preferred, I
would prefer the version using the key=... argument to sort.

--
Steven
```