Sort Big File Help

MRAB python at mrabarnett.plus.com
Wed Mar 3 14:59:22 EST 2010


mk wrote:
> John Filben wrote:
>> I am new to Python but have used many other (mostly dead) languages in 
>> the past.  I want to be able to process *.txt and *.csv files.  I can 
>> now read them and then change them as needed – mostly just take a 
>> column and do some if-then to create a new variable.  My problem is 
>> sorting these files:
>>
>> 1.)    How do I sort file1.txt by position and write out 
>> file1_sorted.txt; for example, if all the records are 100 bytes long 
>> and there is a three digit id in the position 0-2; here would be some 
>> sample data:
>>
>> a.       001JohnFilben……
>>
>> b.      002Joe  Smith…..
> 
> Use a dictionary:
> 
> linedict = {}
> for line in f:
>     key = line[:3]
>     # or alternatively 'line' if you want to include key in the line anyway
>     linedict[key] = line[3:]
> 
> sortedlines = []
> # sorted() returns a new list; list.sort() sorts in place and returns None
> for key in sorted(linedict):
>     sortedlines.append(linedict[key])
> 
> (untested)
> 
> This is the simplest, though probably inefficient, approach. But it should 
> work.
> 
[snip]
Simpler would be:

     lines = f.readlines()
     lines.sort(key=lambda line: line[ : 3])

or even:

     lines = sorted(f.readlines(), key=lambda line: line[ : 3])
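
Putting the read, sort and write steps together, a minimal sketch might look like the following. It assumes one record per line and a file small enough to fit in memory; the function name sort_file_by_key and its key_width parameter are made up for illustration.

```python
def sort_file_by_key(src_path, dst_path, key_width=3):
    """Sort the lines of src_path by their first key_width characters
    and write them to dst_path. Hypothetical helper, for illustration only.
    """
    with open(src_path) as src:
        lines = src.readlines()

    # If the last record lacks a newline, add one so that it does not
    # run into the following record once the order changes.
    if lines and not lines[-1].endswith("\n"):
        lines[-1] += "\n"

    # list.sort() is stable: records with equal keys keep their
    # original relative order.
    lines.sort(key=lambda line: line[:key_width])

    with open(dst_path, "w") as dst:
        dst.writelines(lines)
    return lines
```

For the files in the question this would be called as sort_file_by_key("file1.txt", "file1_sorted.txt"). Unlike the dictionary approach, records that share an id are all kept rather than overwriting one another.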



