[Tutor] Text Processing Query
Spyros Charonis
s.charonis at gmail.com
Thu Mar 14 11:56:28 CET 2013
Hello Pythoners,
I am trying to extract certain fields from a file whose text looks
like this:
COMPND 2 MOLECULE: POTASSIUM CHANNEL SUBFAMILY K MEMBER 4;
COMPND 3 CHAIN: A, B;
COMPND 10 MOL_ID: 2;
COMPND 11 MOLECULE: ANTIBODY FAB FRAGMENT LIGHT CHAIN;
COMPND 12 CHAIN: D, F;
COMPND 13 ENGINEERED: YES;
COMPND 14 MOL_ID: 3;
COMPND 15 MOLECULE: ANTIBODY FAB FRAGMENT HEAVY CHAIN;
COMPND 16 CHAIN: E, G;
I would like the chain IDs, but only those following the text heading
"ANTIBODY FAB FRAGMENT", i.e. I need to create a list with D, F, E, G that
excludes A, B, which have a non-antibody text heading. I am using the
following code:
with open(filename) as file:
    scanfile=file.readlines()
    for line in scanfile:
        if line[0:6]=='COMPND' and 'FAB FRAGMENT' in line: continue
        elif line[0:6]=='COMPND' and 'CHAIN' in line:
            print line
But this yields:
COMPND 3 CHAIN: A, B;
COMPND 12 CHAIN: D, F;
COMPND 16 CHAIN: E, G;
I would like to ignore the first line since A,B correspond to non-antibody
text headings, and instead want to extract only D,F & E,G whose text
headings are specified as antibody fragments.
Many thanks,
Spyros
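One possible way to get there, offered as a sketch rather than a definitive answer: the loop above skips the "FAB FRAGMENT" lines but never remembers that it saw one, so every later CHAIN line is treated the same. Keeping a flag that is updated on each MOLECULE line, and consulting it on each CHAIN line, gives the filtering described. The function name antibody_chains and its keyword parameter are made up for illustration, and the code assumes every record looks like the sample (COMPND, a number, then "KEY: value;" on a single line); real files with continuation lines or a different layout would need extra handling.

```python
def antibody_chains(lines, keyword="ANTIBODY FAB FRAGMENT"):
    """Collect chain IDs from CHAIN lines that follow a matching MOLECULE line.

    Hypothetical helper: assumes each record is 'COMPND <n> KEY: value;'.
    """
    chains = []
    wanted = False  # True while the most recent MOLECULE line contained keyword
    for line in lines:
        if not line.startswith("COMPND"):
            continue
        # Split off the record name and the number, keep the "KEY: value;" part.
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        body = parts[2].strip()
        if body.startswith("MOLECULE:"):
            # Remember whether the current molecule is an antibody fragment.
            wanted = keyword in body
        elif body.startswith("CHAIN:") and wanted:
            ids = body[len("CHAIN:"):].strip().rstrip(";")
            chains.extend(c.strip() for c in ids.split(",") if c.strip())
    return chains


# The sample lines from the message, used here in place of a file.
sample = [
    "COMPND 2 MOLECULE: POTASSIUM CHANNEL SUBFAMILY K MEMBER 4;",
    "COMPND 3 CHAIN: A, B;",
    "COMPND 10 MOL_ID: 2;",
    "COMPND 11 MOLECULE: ANTIBODY FAB FRAGMENT LIGHT CHAIN;",
    "COMPND 12 CHAIN: D, F;",
    "COMPND 13 ENGINEERED: YES;",
    "COMPND 14 MOL_ID: 3;",
    "COMPND 15 MOLECULE: ANTIBODY FAB FRAGMENT HEAVY CHAIN;",
    "COMPND 16 CHAIN: E, G;",
]

print(antibody_chains(sample))  # ['D', 'F', 'E', 'G']
```

To run it on a file, the list can be replaced by the open file object, e.g. antibody_chains(open(filename)). Testing for "MOLECULE:" at the start of the field, rather than 'CHAIN' anywhere in the line, also avoids matching headings such as "LIGHT CHAIN".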