Sort Big File Help
MRAB
python at mrabarnett.plus.com
Wed Mar 3 14:59:22 EST 2010
mk wrote:
> John Filben wrote:
>> I am new to Python but have used many other (mostly dead) languages in
>> the past. I want to be able to process *.txt and *.csv files. I can
>> now read them and then change them as needed – mostly just take a
>> column and do some if-then to create a new variable. My problem is
>> sorting these files:
>>
>> 1.) How do I sort file1.txt by position and write out
>> file1_sorted.txt; for example, if all the records are 100 bytes long
>> and there is a three digit id in the position 0-2; here would be some
>> sample data:
>>
>> a. 001JohnFilben……
>>
>> b. 002Joe Smith…..
>
> Use a dictionary:
>
> linedict = {}
> for line in f:
>     key = line[:3]
>     # or alternatively 'line' if you want to include key in the line anyway
>     linedict[key] = line[3:]
>
> sortedlines = []
> for key in sorted(linedict):
>     sortedlines.append(linedict[key])
>
> (untested)
>
> This is the simplest, though probably inefficient, approach. But it should
> work.
>
[snip]
Simpler would be:

    lines = f.readlines()
    lines.sort(key=lambda line: line[:3])

or even:

    lines = sorted(f.readlines(), key=lambda line: line[:3])
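
For the file1.txt to file1_sorted.txt case in the question, the same idea could be wrapped up roughly as follows. This is only a sketch: the function name sort_file_by_id and the key_width parameter are made up for illustration, and it assumes one record per line and a file small enough to read into memory.

```python
def sort_file_by_id(in_path, out_path, key_width=3):
    """Sort the lines of in_path by their first key_width characters.

    Hypothetical helper: assumes one record per line and that the
    whole file fits in memory.
    """
    with open(in_path) as f:
        lines = f.readlines()

    # Make sure the last record ends with a newline, otherwise it would
    # run into the following record once the order changes.
    lines = [line if line.endswith("\n") else line + "\n" for line in lines]

    # sorted() is stable, so records sharing an id keep their original order.
    lines = sorted(lines, key=lambda line: line[:key_width])

    with open(out_path, "w") as f:
        f.writelines(lines)
    return lines
```

It would be called as sort_file_by_id("file1.txt", "file1_sorted.txt"). Unlike the dictionary approach, sorting the list keeps every record even when two of them share the same id.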