[Tutor] Sorting

Daniel Yoo dyoo@hkn.EECS.Berkeley.EDU
Wed, 26 Jan 2000 07:22:34 -0800 (PST)


On Tue, 25 Jan 2000, Ben Beuchler wrote:

> > lines.sort(lambda x,y: x[3] > y[3])
> > for x in lines: print rstrip(x)
>  
> I have to admit, the ability to pass extra arguments to x.sort() confuses
> me.  Can you clarify a little what is happening in that lambda expression
> and how it talks sort into dealing with the fourth column instead of the
> first?

The way that sort() works is that it compares pairs of elements to figure
out which element is smaller than another.  If it finds that the two
elements are misplaced, it'll swap them and reduce the disorder of the
list. If it does this enough times on different elements, eventually, the
whole list gets sorted.  More sophisticated sorting algorithms do more
work, but essentially, this is the essence of comparison sorting.

By default, since we're working with strings, it will compare the two
elements based on lexiographic order, I think.  However, we want to change
the notion of which elements should come first in a sorted list.  sort()
supports this by letting us change the comparison function.  From the
Python reference library:

"The sort() method takes an optional argument specifying a comparison
function of two arguments (list items) which should return -1, 0 or 1
depending on whether the first argument is considered smaller than, equal
to, or larger than the second argument. Note that this slows the sorting
process down considerably; e.g. to sort a list in reverse order it is much
faster to use calls to the methods sort() and reverse() than to use the
built-in function sort() with a comparison function that reverses the
ordering of the elements."

The lambda expression, strictly speaking, shouldn't work, since it only
returns either 0 or 1.  I was careless.  Sorry about that.  *grin* But if
we have a function like:

def comparethirdcolumn(line1, line2):
  if line1[2] < line2[2]: return -1
  elsif line1[2] > line2[2]: return 1
  else: return 0

then it should be ok to call lines.sort(comparethirdcolumn).  If you have
any more questions, please send it to the whole python list too, just in
case I blundered anywhere.