[Tutor] arrangement of datafile
Peter Otten
__peter__ at web.de
Thu Jan 9 13:41:58 CET 2014
Amrita Kumari wrote:
> On 17th Dec. I posted one question, how to arrange datafile in a
> particular fashion so that I can have only residue no. and chemical
> shift value of the atom as:
> 1 H=nil
> 2 H=8.8500
> 3 H=8.7530
> 4 H=7.9100
> 5 H=7.4450
> ........
> Peter has replied to this mail, but since I hadn't subscribed to the
> tutor mailing list earlier I didn't receive the reply. I apologize
> for my mistake. Today I checked his reply and he asked me to do a
> few things:
I'm sorry, I'm currently lacking the patience to tune into your problem
again, but maybe the script that I wrote (but did not post) back then is of
help.
The data sample:
$ cat residues.txt
1 GLY HA2=3.7850 HA3=3.9130
2 SER H=8.8500 HA=4.3370 N=115.7570
3 LYS H=8.7530 HA=4.0340 HB2=1.8080 N=123.2380
4 LYS H=7.9100 HA=3.8620 HB2=1.7440 HG2=1.4410 N=117.9810
5 LYS H=7.4450 HA=4.0770 HB2=1.7650 HG2=1.4130 N=115.4790
6 LEU H=7.6870 HA=4.2100 HB2=1.3860 HB3=1.6050 HG=1.5130 HD11=0.7690 HD12=0.7690 HD13=0.7690 N=117.3260
7 PHE H=7.8190 HA=4.5540 HB2=3.1360 N=117.0800
8 PRO HD2=3.7450
9 GLN H=8.2350 HA=4.0120 HB2=2.1370 N=116.3660
10 ILE H=7.9790 HA=3.6970 HB=1.8800 HG21=0.8470 HG22=0.8470 HG23=0.8470 HG12=1.6010 HG13=2.1670 N=119.0300
11 ASN H=7.9470 HA=4.3690 HB3=2.5140 N=117.8620
12 PHE H=8.1910 HA=4.1920 HB2=3.1560 N=121.2640
13 LEU H=8.1330 HA=3.8170 HB3=1.7880 HG=1.5810 HD11=0.8620 HD12=0.8620 HD13=0.8620 N=119.1360
The script:
$ cat residues.py
def process(filename):
    residues = {}
    with open(filename) as infile:
        for line in infile:
            parts = line.split()  # split line at whitespace
            residue = int(parts.pop(0))  # convert first item to integer
            if residue in residues:
                raise ValueError("duplicate residue {}".format(residue))
            parts.pop(0)  # discard second item (the residue name)
            # split remaining items at "=" and put them in a dict,
            # e. g. {"HA2": 3.7, "HA3": 3.9}
            pairs = (pair.split("=") for pair in parts)
            lookup = {atom: float(value) for atom, value in pairs}
            # put previous lookup dict in residues dict
            # e. g. {1: {"HA2": 3.7, "HA3": 3.9}}
            residues[residue] = lookup
    return residues

def show(residues):
    # collect every atom name that occurs in any residue
    atoms = set().union(*(r.keys() for r in residues.values()))
    residues = sorted(residues.items())
    for atom in sorted(atoms):
        for residue, lookup in residues:
            # single-argument print(...) works in both Python 2 and 3
            print("{} {}={}".format(residue, atom, lookup.get(atom, "nil")))
        print("")
        print("-----------")
        print("")

if __name__ == "__main__":
    r = process("residues.txt")
    show(r)
Note that converting the values to float can be omitted if all you want to
do is print them; the conversion is also why trailing zeros are dropped
(8.8500 is printed as 8.85). Finally, the output of the script:
$ python residues.py
1 H=nil
2 H=8.85
3 H=8.753
4 H=7.91
5 H=7.445
6 H=7.687
7 H=7.819
8 H=nil
9 H=8.235
10 H=7.979
11 H=7.947
12 H=8.191
13 H=8.133
-----------
1 HA=nil
2 HA=4.337
3 HA=4.034
4 HA=3.862
5 HA=4.077
6 HA=4.21
7 HA=4.554
8 HA=nil
9 HA=4.012
10 HA=3.697
11 HA=4.369
12 HA=4.192
13 HA=3.817
-----------
[snip]
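The note above about omitting the float conversion can be sketched as follows. This is only a minimal illustration, not part of the original script: the helper names parse_lines and format_atom are made up for this example, the input is passed as a list of lines rather than read from residues.txt, and the duplicate-residue check is left out for brevity. Values stay as strings, so trailing zeros survive, and only one atom is printed, as in the format requested in the quoted question.

```python
# Hypothetical helpers for illustration; not from the original script.

def parse_lines(lines):
    """Map residue number -> {atom: value kept as a string}."""
    residues = {}
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        residue = int(parts[0])
        # parts[1] is the residue name (e.g. "GLY"); it is not needed here
        residues[residue] = dict(pair.split("=", 1) for pair in parts[2:])
    return residues


def format_atom(residues, atom, missing="nil"):
    """Return one 'residue atom=value' string per residue, in numeric order."""
    return [
        "{} {}={}".format(number, atom, lookup.get(atom, missing))
        for number, lookup in sorted(residues.items())
    ]


# First three lines of the data sample shown earlier
sample = [
    "1 GLY HA2=3.7850 HA3=3.9130",
    "2 SER H=8.8500 HA=4.3370 N=115.7570",
    "3 LYS H=8.7530 HA=4.0340 HB2=1.8080 N=123.2380",
]

for text in format_atom(parse_lines(sample), "H"):
    print(text)
```

With real data, passing an open file object instead of the list should work the same way, since both yield one line per iteration.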