how two join and arrange two files together
Chris Rebert
clp2 at rebertia.com
Thu Jul 23 04:16:04 EDT 2009
On Thu, Jul 23, 2009 at 12:22 AM, <amrita at iisermohali.ac.in> wrote:
>
> Hi,
>
> I have two large files:
>
> FileA
> 15 ALA H = 8.05 N = 119.31 CA = 52.18 HA = 4.52 C =
> 21 ALA H = 7.66 N = 123.58 CA = 54.33 HA = C = 179.35
> 23 ALA H = 8.78 N = CA = HA = C = 179.93.................
>
> and
>
> FileB
> 21 ALA helix (helix_alpha, helix2)
> 23 ALA helix (helix_alpha, helix3)
> 38 ALA helix (helix_alpha, helix3)...........
>
> now what i want that i will make another file in which i will join the two
> file in such a way that only matching entries will come like here 21 and
> 23 ALA is in both files, so the output will be something like:-
>
> 21 ALA H = 7.66 N = 123.58 CA = 54.33 HA = C = 179.35| 21 ALA helix
> (helix_alpha, helix2)
> 23 ALA H = 8.78 N = CA = HA = C = 179.93|23 ALA helix (helix_alpha,
> helix3)
>
> and further i will make another file in which i will be able to put those
> lines form this file based on the missing atom value, like for 21 ALA HA
> is not defined so i will put it another file based on its HA missing value
> similarly i will put 23 ALA on another file based on its missing N,CA and
> HA value.
>
> I tried to join the two file based on their matching entries by:---
> >>> from collections import defaultdict
> >>>
> >>> if __name__ == "__main__":
> ...     a = open("/home/amrita/alachems/chem100.txt")
> ...     c = open("/home/amrita/secstr/secstr100.txt")
> ...
> >>> def source(stream):
> ...     return (line.strip() for line in stream)
> ...
> ...
> >>> def merge(sources):
> ...     for m in merge([source(a),source(c)]):
> ...         print "|".join(c.ljust(10) for c in m)
> ...
>
> but it is not giving any value.
You never actually called any of your <expletive deleted> functions.
Slightly corrected version:

from collections import defaultdict

def source(stream):
    return (line.strip() for line in stream)

def merge(sources):
    for m in sources:
        print "|".join(c.ljust(10) for c in m)

if __name__ == "__main__":
    a = open("/home/amrita/alachems/chem100.txt")
    c = open("/home/amrita/secstr/secstr100.txt")
    merge([source(a), source(c)])

It's still not sophisticated enough to give the exact output you're
looking for, but it is a step in the right direction.
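
For the join itself, here is a rough sketch of one way to go about it. It is not tested against your real files. It assumes the first two whitespace-separated fields of every line (residue number and residue name) form the key, and that each key appears at most once in FileB. The names key_of and join_matching are made up for this example.

```python
def key_of(line):
    """Return (residue number, residue name), e.g. ('21', 'ALA')."""
    fields = line.split()
    return tuple(fields[:2])


def join_matching(lines_a, lines_b):
    """Pair each line of A with the line of B that has the same key.

    Lines of A with no partner in B are dropped. Assumes each key
    occurs at most once in B (a later duplicate would win).
    """
    by_key = {}
    for line in lines_b:
        line = line.strip()
        if line:
            by_key[key_of(line)] = line

    joined = []
    for line in lines_a:
        line = line.strip()
        if line and key_of(line) in by_key:
            joined.append(line + "|" + by_key[key_of(line)])
    return joined


if __name__ == "__main__":
    # Sample rows copied from the question; with real data, pass the
    # open file objects instead of these lists.
    file_a = [
        "15 ALA H = 8.05 N = 119.31 CA = 52.18 HA = 4.52 C =",
        "21 ALA H = 7.66 N = 123.58 CA = 54.33 HA = C = 179.35",
        "23 ALA H = 8.78 N = CA = HA = C = 179.93",
    ]
    file_b = [
        "21 ALA helix (helix_alpha, helix2)",
        "23 ALA helix (helix_alpha, helix3)",
        "38 ALA helix (helix_alpha, helix3)",
    ]
    for row in join_matching(file_a, file_b):
        print(row)
```

Only FileB is held in memory as a dict; FileA is read once, line by line, so this should cope with large inputs as long as FileB fits in RAM.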
You really should try asking someone from your CS Dept to help you. It
would seriously take a couple hours, at most.
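
As for the second half of your question (sorting lines into separate files according to which atom values are missing), something along these lines might get you started. It is only a sketch. It assumes every atom field looks like "NAME = number" and that a missing value is simply an empty right-hand side, as in your sample lines. The names ATOM_FIELD, missing_atoms and group_by_missing are invented for this example.

```python
import re

# An atom name, an equals sign, then an optional plain decimal number.
ATOM_FIELD = re.compile(r"(\w+)\s*=\s*([-+]?\d+(?:\.\d+)?)?")


def missing_atoms(line):
    """Return the names of the atoms that have no value on this line."""
    return [name for name, value in ATOM_FIELD.findall(line) if not value]


def group_by_missing(lines):
    """Map each atom name to the list of lines lacking a value for it."""
    groups = {}
    for line in lines:
        for atom in missing_atoms(line):
            groups.setdefault(atom, []).append(line)
    return groups


if __name__ == "__main__":
    # Sample rows copied from the question.
    rows = [
        "21 ALA H = 7.66 N = 123.58 CA = 54.33 HA = C = 179.35",
        "23 ALA H = 8.78 N = CA = HA = C = 179.93",
    ]
    groups = group_by_missing(rows)
    for atom in sorted(groups):
        print(atom + ": " + str(len(groups[atom])) + " line(s)")
```

A line missing several values (such as 23 ALA) lands in several groups; writing each group out to its own file is then one short loop over the dict.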
- Chris
--
Still brandishing a cluestick in vain...
http://blog.rebertia.com