Regarding sort()

Mon May 25 04:05:59 EDT 2009

Dhananjay wrote:

> Hello All,
> 
> I have data set as follows:
> 
> 24       GLU    3       47      LYS     6       3.909233        1
> 42       PRO    5       785     VAL     74      4.145114     1
> 54       LYS    6       785     VAL     74      4.305017      1
> 55       LYS    6       785     VAL     74      4.291098      1
> 56       LYS    7       785     VAL     74      3.968647      1
> 58       LYS    7       772     MET     73      4.385121       1
> 58       LYS    7       778     MET     73      4.422980       1
> 58       LYS    7       779     MET     73      3.954990       1
> 58       LYS    7       785     VAL     74      3.420554       1
> 59       LYS    7       763     GLN     72      4.431955       1
> 59       LYS    7       767     GLN     72      3.844037       1
> 59       LYS    7       785     VAL     74      3.725048       1
> 
> 
> 
> 
> I want to sort the data by the 3rd column first, and then sort the
> result of that first step by the 6th column.
> 
> I tried the sort() function but could not work out how to use it.
> 
> I am new to programming, please tell me how I can sort.
> 
> Thanking you in advance ........

>>> data = """24       GLU    3       47      LYS     6       3.909233        1
... 42       PRO    5       785     VAL     74      4.145114     1
... 54       LYS    6       785     VAL     74      4.305017      1
... 55       LYS    6       785     VAL     74      4.291098      1
... 56       LYS    7       785     VAL     74      3.968647      1
... 58       LYS    7       772     MET     73      4.385121       1
... 58       LYS    7       778     MET     73      4.422980       1
... 58       LYS    7       779     MET     73      3.954990       1
... 58       LYS    7       785     VAL     74      3.420554       1
... 59       LYS    7       763     GLN     72      4.431955       1
... 59       LYS    7       767     GLN     72      3.844037       1
... 59       LYS    7       785     VAL     74      3.725048       1
... """
>>> rows = data.splitlines()
>>> rows.sort(key=lambda line: int(line.split()[5]))
>>> rows.sort(key=lambda line: int(line.split()[2]))
>>> print("\n".join(rows))
24       GLU    3       47      LYS     6       3.909233        1
42       PRO    5       785     VAL     74      4.145114     1
54       LYS    6       785     VAL     74      4.305017      1
55       LYS    6       785     VAL     74      4.291098      1
59       LYS    7       763     GLN     72      4.431955       1
59       LYS    7       767     GLN     72      3.844037       1
58       LYS    7       772     MET     73      4.385121       1
58       LYS    7       778     MET     73      4.422980       1
58       LYS    7       779     MET     73      3.954990       1
56       LYS    7       785     VAL     74      3.968647      1
58       LYS    7       785     VAL     74      3.420554       1
59       LYS    7       785     VAL     74      3.725048       1

Python's list.sort() is "stable": rows whose keys compare equal keep their 
relative order. Therefore you can sort by the 6th column first, and then by 
the third. Rows with the same value in the third column stay in the order the 
first sort gave them, that is, ordered by the sixth column.
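
As a rough sketch of an alternative, the same ordering can usually be had in 
a single pass by returning a tuple as the key. The helper name 
third_then_sixth and the shortened sample rows are made up for illustration:

```python
# A few rows from the data set, with the whitespace shortened for the sketch.
rows = [
    "58 LYS 7 785 VAL 74 3.420554 1",
    "59 LYS 7 763 GLN 72 4.431955 1",
    "24 GLU 3 47 LYS 6 3.909233 1",
    "58 LYS 7 772 MET 73 4.385121 1",
]

def third_then_sixth(row):
    # Hypothetical helper: third column is the primary key,
    # sixth column breaks ties. Tuples compare element by element.
    columns = row.split()
    return (int(columns[2]), int(columns[5]))

one_pass = sorted(rows, key=third_then_sixth)

# The two-step approach, relying on the stability of the sort.
two_pass = sorted(rows, key=lambda row: int(row.split()[5]))
two_pass.sort(key=lambda row: int(row.split()[2]))
```

Both lists should come out in the same order. The tuple key is handy when all 
columns sort in the same direction; two stable passes are easier when one 
column has to be sorted in reverse.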

To calculate the sort key for each row you can either use the lambda above 
or an equivalent named function:

def sixth_column(row):
    columns = row.split() # split by whitespace
    column = columns[5] # sixth column
    return int(column) # convert to integer
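
If you need the same thing for several columns, one possible shortcut is a 
small factory that builds the key function for a given column. The name 
column_key is an assumption of this sketch, not something built into Python:

```python
def column_key(index):
    """Return a key function that reads the integer in column `index` (0-based)."""
    def key(row):
        return int(row.split()[index])  # split by whitespace, convert to integer
    return key

# Three rows from the data set, with the whitespace shortened.
rows = [
    "58 LYS 7 779 MET 73 3.954990 1",
    "54 LYS 6 785 VAL 74 4.305017 1",
    "59 LYS 7 763 GLN 72 4.431955 1",
]

rows.sort(key=column_key(5))  # sixth column first
rows.sort(key=column_key(2))  # then third column; ties keep the order from above
```

Here column_key(5) behaves like sixth_column above.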

If you don't convert to an integer, the column values are compared as 
strings, so a row with "10" comes before a row with "2", which is probably 
not what you want.
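
A small illustration of the difference, using made-up column values:

```python
values = ["2", "10", "74", "6"]

# Compared as strings, character by character: "1" < "2", so "10" sorts first.
as_text = sorted(values)

# Compared as integers via the key function.
as_numbers = sorted(values, key=int)

print(as_text)     # ['10', '2', '6', '74']
print(as_numbers)  # ['2', '6', '10', '74']
```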

Peter