sorting 1172026 entries

Stefan Behnel stefan_ml at behnel.de
Sun May 6 13:15:07 EDT 2012


J. Mwebaze, 06.05.2012 18:29:
> Sorry, see the corrected code:
> 
> for filename in txtfiles:
>     temp = []
>     with open(filename) as f:
>         for line in f:
>             fields = line.split()
>             # (timestamp, value) tuples sort by timestamp first
>             temp.append((parser.parse(fields[0]), float(fields[1])))
>     temp.sort()
>     # filename[:-4] drops the '.txt' suffix; strip('.txt') removes
>     # any of the characters '.', 't', 'x' from both ends instead
>     with open(filename[:-4] + '.sorted', 'w') as p:
>         for i, j in temp:
>             p.write('%s %s\n' % (i, j))
> 
>> I have attached one of the files; please try sorting it by date and let me
>> know the results. Oops, I am told the file exceeds 25M.
>>
>> Below is the code:
>>
>> import glob
>> import dateutil.parser as parser
>>
>> txtfiles = glob.glob('*.txt')
>>
>> for filename in txtfiles:
>>     temp = []
>>     with open(filename) as f:
>>         for line in f:
>>             fields = line.split()
>>             # (timestamp, value) tuples sort by timestamp first
>>             temp.append((parser.parse(fields[0]), float(fields[1])))
>>     temp.sort()
>>     # filename[:-4] drops the '.txt' suffix; strip('.txt') removes
>>     # any of the characters '.', 't', 'x' from both ends instead
>>     with open(filename[:-4] + '.sorted', 'w') as p:
>>         for i, j in temp:
>>             p.write('%s %s\n' % (i, j))

How much memory do you have on your system? Does the list fit into memory
easily or is it swapping to disk while you are running the sort?
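
If you are not sure, a rough estimate may help. The following is only a
sketch: it measures a small, made-up sample of (datetime, float) tuples with
sys.getsizeof and scales the result up to 1172026 entries. The helper name
approx_size_bytes and the sample data are hypothetical, the sample uses
datetime directly instead of dateutil, and the figure ignores allocator
overhead, so treat it as a lower bound rather than a measurement.

```python
import sys
from datetime import datetime, timedelta


def approx_size_bytes(entries):
    """Rough lower bound: the list, each tuple, and the objects in each tuple."""
    total = sys.getsizeof(entries)
    for entry in entries:
        total += sys.getsizeof(entry)
        total += sum(sys.getsizeof(item) for item in entry)
    return total


# Hypothetical sample standing in for the parsed (datetime, float) records.
start = datetime(2012, 5, 6)
sample = [(start + timedelta(seconds=n), float(n)) for n in range(10000)]

# Scale the per-entry cost of the sample up to the full number of entries.
per_entry = approx_size_bytes(sample) / float(len(sample))
estimate_mb = per_entry * 1172026 / (1024.0 * 1024.0)
print('about %.0f bytes per entry, roughly %.0f MB for 1172026 entries'
      % (per_entry, estimate_mb))
```

Comparing that estimate with the free memory reported by your operating
system should indicate whether swapping is a plausible explanation.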

Stefan



