[Tutor] sorting question
Kent Johnson
kent37 at tds.net
Wed Mar 7 02:17:38 CET 2007
Rob Andrews wrote:
> I'm trying to think of the best way to go about this one, as the files
> I have to sort are *big*.
>
> They're ASCII files with each row consisting of a series of
> fixed-length fields, each of which has a corresponding format file.
> (To be specific, these files are FirstLogic compatible.)
>
> I'm looking to sort files such that I can produce the 50,000 records
> with the highest "score" in a certain field.
>
> A grossly over-simplified example is:
>
> "JohnDoe   3.14123 Anywhere St."
> "MarySmith11.03One Jackson Pl. "
>
> ------------------------------------------------------------
> >>> for x in people: # substituting 'people' for a file of records
>         print x[9:14]
>
> 3.14
> 11.03
> ------------------------------------------------------------
>
> With this in mind, I'm trying to sort the file by the value of the
> number in the field represented by x[9:14] in the example here.
If the files fit in memory you can define a function that returns the
key value and use it for the sort.
If lines is a list of strings in the above format,
def myKey(line):
    return float(line[9:14])

lines.sort(key=myKey)
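To make that concrete, here is a small sketch using the two sample records from the question. I have assumed the name field is padded to nine characters so that the score really does sit at [9:14]; your real layout may differ. Sorting with reverse=True puts the highest scores first, and a slice keeps the top N. The helper name top_n is made up for illustration.

```python
def myKey(line):
    # The score occupies characters 9..13 of each fixed-width record.
    # float() tolerates the leading space in " 3.14".
    return float(line[9:14])

def top_n(lines, n):
    # sorted() returns a new list and leaves the caller's list untouched;
    # reverse=True orders the records from highest score to lowest.
    return sorted(lines, key=myKey, reverse=True)[:n]

lines = [
    "JohnDoe   3.14123 Anywhere St.",
    "MarySmith11.03One Jackson Pl. ",
]
best = top_n(lines, 1)
print(best[0][:9])
```

For the real job you would call top_n(lines, 50000) on the full list of records.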
Or you can use John's suggestion of splitting the lines but that may not
be needed in this case.
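If the files do not fit in memory, one alternative to a full sort is heapq.nlargest from the standard library, which accepts the same kind of key function and only has to keep about N records around while it reads through the input. This is a sketch under the same field-position assumption as above, not something I have run against real FirstLogic files; top_scores and the file name in the comment are placeholders.

```python
import heapq

def myKey(line):
    # Score field at characters 9..13 of each fixed-width record.
    return float(line[9:14])

def top_scores(records, n):
    # 'records' may be any iterable of lines, including an open file,
    # so the whole file never has to be loaded into a list.
    # The result comes back ordered from highest score to lowest.
    return heapq.nlargest(n, records, key=myKey)

sample = [
    "JohnDoe   3.14123 Anywhere St.",
    "MarySmith11.03One Jackson Pl. ",
]
print(top_scores(sample, 1))

# Hypothetical usage on a real file ("records.txt" is a placeholder name):
# with open("records.txt") as f:
#     best = top_scores(f, 50000)
```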
Kent