[Tutor] sorting question

Kent Johnson kent37 at tds.net
Wed Mar 7 02:17:38 CET 2007


Rob Andrews wrote:
> I'm trying to think of the best way to go about this one, as the files
> I have to sort are *big*.
> 
> They're ASCII files with each row consisting of a series of
> fixed-length fields, each of which has a corresponding format file.
> (To be specific, these files are FirstLogic compatible.)
> 
> I'm looking to sort files such that I can produce the 50,000 records
> with the highest "score" in a certain field.
> 
> A grossly over-simplified example is:
> 
> "JohnDoe   3.14123 Anywhere St."
> "MarySmith11.03One Jackson Pl. "
> 
> ------------------------------------------------------------
>>>> for x in people: # substituting 'people' for a file of records
> 	print x[9:14]
> 	
>  3.14
> 11.03
> ------------------------------------------------------------
> 
> With this in mind, I'm trying to sort the file by the value of the
> number in the field represented by x[9:14] in the example here.

If the files fit in memory, you can define a function that returns the
key value and pass it to the sort as the key argument.

If lines is a list of strings in the above format:

def myKey(line):
    # the score field occupies columns 9-13 of each record
    return float(line[9:14])

lines.sort(key=myKey)
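
To get from the sorted list to the 50,000 highest scores, one option is to
sort in descending order and slice off the front. This is only a sketch
using the two sample records from the question; the names score_key and
top_records are made up for illustration, and it assumes every line has a
valid number in columns 9-13.

```python
def score_key(line):
    # Hypothetical helper: the score field sits in columns 9-13 of each record.
    return float(line[9:14])


def top_records(lines, n):
    # Sort highest score first, then keep at most the first n records.
    return sorted(lines, key=score_key, reverse=True)[:n]


# The two sample records from the question stand in for a real file.
people = [
    "JohnDoe   3.14123 Anywhere St.",
    "MarySmith11.03One Jackson Pl. ",
]

top = top_records(people, 50000)
for record in top:
    print(record[9:14])
```

sorted() returns a new list and leaves the original untouched; if memory is
tight, lines.sort(key=score_key, reverse=True) followed by del lines[n:]
does the same job in place.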

Or you can use John's suggestion of splitting the lines, but that may not
be needed in this case.
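
If the files do not fit in memory, a possible alternative (not something
suggested earlier in the thread) is heapq.nlargest from the standard
library, which accepts a key function and only keeps the n best records
while it reads through the input. A rough sketch, where top_n and
score_key are hypothetical names and every line is assumed to carry a
valid score in columns 9-13:

```python
import heapq


def score_key(line):
    # Hypothetical helper: the score field sits in columns 9-13 of each record.
    return float(line[9:14])


def top_n(lines, n):
    # 'lines' can be any iterable of records, including an open file object,
    # so the whole file never has to be loaded as a list.
    # The result is ordered from highest score to lowest.
    return heapq.nlargest(n, lines, key=score_key)


# Small in-memory stand-in for a file of records.
sample = [
    "JohnDoe   3.14123 Anywhere St.",
    "MarySmith11.03One Jackson Pl. ",
]
best = top_n(sample, 1)
```

With a real file this would be called as top_n(open(filename), 50000);
the trailing newline on each line does not matter because only the slice
[9:14] is converted to a number.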

Kent

