[Tutor] how to sort the file out
lina
lina.lastname at gmail.com
Wed Sep 7 12:04:30 CEST 2011
On Wed, Sep 7, 2011 at 4:52 PM, Peter Otten <__peter__ at web.de> wrote:
> lina wrote:
>
>> Hi, I have two files: one is a reference file, the other is the one
>> waiting to be adjusted.
>>
>> File 1:
>>
>> 1 C1
>> 2 O1
> [...]
>> 33 C19
>> 34 O5
>> 35 C21
>>
>> File 2:
>> 3 H16
>> 4 H5
> [...]
>> 39 H62
>> 40 O2
>> 41 H22
>>
>> I would like field 2 of file 2 arranged in the same sequence as field
>> 2 of file 1.
>>
>> Thanks for any suggestions,
>>
>> I have driven myself nuts already; three hours have passed and I still
>> have not managed to achieve this.
>
> You could have written the above after three minutes. To get the most out of
> this mailing list you should give some details of what you tried and how it
> failed. This gives us valuable information about your level of knowledge and
> confidence that you are trying to learn rather than get solutions on the
> cheap.
>
> However, I'm in the mood for some spoonfeeding:
LOL ... thanks.
I am at a very, very low level in Python; many times I just gave up out of
frustration with it and escaped back to bash and awk.
>
> indexfile = "tmp_index.txt"
> datafile = "tmp_data.txt"
> sorteddatafile = "tmp_output.txt"
>
> def make_lookup(lines):
>     r"""Build a dictionary that maps the second column to the line number.
>
>     >>> make_lookup(["aaa bbb\n", "ccc ddd\n"]) == {'bbb': 0, 'ddd': 1}
>     True
>     """
>     position_lookup = {}
>     for lineno, line in enumerate(lines):
>         second_field = line.split()[1]
>         position_lookup[second_field] = lineno
>     return position_lookup
>
> with open(indexfile) as f:
>     position_lookup = make_lookup(f)
>
> # With your sample data the global position_lookup dict looks like this now:
> # {'C1': 0, 'O1': 1, 'C2': 2,... , 'O5': 33, 'C21': 34}
>
> def get_position(line):
>     r"""Extract the second field from the line and look up the
>     associated line number in the global position_lookup dictionary.
>
>     Example:
>     get_position("15 C2\n")
>     The line is split into ["15", "C2"]
>     The second field is "C2"
>     Its associated line number in position_lookup: 2
>     --> the function returns 2
>     """
>     second_field = line.split()[1]
>     return position_lookup[second_field]
>
> with open(datafile) as f:
>     # sort the lines in the data file using the line number in the index
>     # file as the sort key
>     lines = sorted(f, key=get_position)
>
> with open(sorteddatafile, "w") as f:
>     f.writelines(lines)
>
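For anyone who wants to try the idea without creating files first, here is a minimal sketch of the same approach on in-memory lists. The sample lines and the name sort_by_index are made up for illustration (the real files are only partly shown above), and sending labels that are missing from the index to the end is an assumption of this sketch; the code above would raise KeyError for them instead.

```python
# Hypothetical sample data; the real files are only partly shown in the thread.
index_lines = ["1 C1\n", "2 O1\n", "3 H5\n", "4 H16\n"]
data_lines = ["3 H16\n", "4 H5\n", "5 O1\n", "6 C1\n"]


def make_lookup(lines):
    """Map the second field of each line to its 0-based line number."""
    return {line.split()[1]: lineno for lineno, line in enumerate(lines)}


def sort_by_index(data, index):
    """Return the data lines ordered by where their second field occurs in index."""
    lookup = make_lookup(index)
    # Assumption: labels missing from the index sort to the end
    # instead of raising KeyError.
    return sorted(data, key=lambda line: lookup.get(line.split()[1], len(lookup)))


sorted_lines = sort_by_index(data_lines, index_lines)
print("".join(sorted_lines), end="")
```

Because sorted() is stable, lines that share a label (or are all missing from the index) keep their original relative order.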
It's an amazing opportunity to learn. I will try it now.
--
Best Regards,
lina