[Tutor] Text Processing Query

Spyros Charonis s.charonis at gmail.com
Thu Mar 14 17:40:20 CET 2013


Yes, the elif line needs to have **flag_FAB == 1** as its condition instead
of **flag_FAB = 1**. So:


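flag_FAB = 0  # set to 1 once a FAB molecule line has been seen

for line in scanfile:
    # A COMPND line that mentions FAB arms the flag.
    if line[0:6] == 'COMPND' and 'FAB' in line:
        flag_FAB = 1
    # The next COMPND CHAIN line then belongs to that FAB molecule.
    elif line[0:6] == 'COMPND' and 'CHAIN' in line and flag_FAB == 1:
        print line
        flag_FAB = 0  # reset so that chains of non-FAB molecules are skipped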
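
To end up with the list D,F,E,G rather than printed lines, the same flag
logic could be wrapped in a function. This is only a sketch: the name
fab_chain_ids and the sample list are made up for illustration, it uses
Python 3 print(), and it assumes each CHAIN record sits on a single line in
the form "CHAIN: D, F;".

```python
def fab_chain_ids(lines):
    """Collect chain IDs from CHAIN records that follow a FAB molecule record."""
    chain_ids = []
    flag_fab = False  # True once a COMPND line mentioning FAB has been seen
    for line in lines:
        if not line.startswith('COMPND'):
            continue
        if 'FAB' in line:
            flag_fab = True
        elif 'CHAIN:' in line and flag_fab:
            # Text after "CHAIN:" is assumed to look like " D, F;"
            ids = line.split('CHAIN:', 1)[1].strip().rstrip(';')
            chain_ids.extend(c.strip() for c in ids.split(',') if c.strip())
            flag_fab = False  # reset until the next FAB molecule record
    return chain_ids


# Sample records copied from the original question.
sample = [
    "COMPND   2 MOLECULE: POTASSIUM CHANNEL SUBFAMILY K MEMBER 4;",
    "COMPND   3 CHAIN: A, B;",
    "COMPND  10 MOL_ID: 2;",
    "COMPND  11 MOLECULE: ANTIBODY FAB FRAGMENT LIGHT CHAIN;",
    "COMPND  12 CHAIN: D, F;",
    "COMPND  13 ENGINEERED: YES;",
    "COMPND  14 MOL_ID: 3;",
    "COMPND  15 MOLECULE: ANTIBODY FAB FRAGMENT HEAVY CHAIN;",
    "COMPND  16 CHAIN: E, G;",
]

print(fab_chain_ids(sample))
```

With a real file, the open file object can be passed in directly, since the
function only needs something that yields lines.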


On Thu, Mar 14, 2013 at 4:33 PM, Mark Lawrence <breamoreboy at yahoo.co.uk> wrote:

> On 14/03/2013 11:28, taserian wrote:
>
> Top posting fixed
>
>
>> On Thu, Mar 14, 2013 at 6:56 AM, Spyros Charonis <s.charonis at gmail.com> wrote:
>>
>>     Hello Pythoners,
>>
>>     I am trying to extract certain fields from a file whose text
>>     looks like this:
>>
>>     COMPND   2 MOLECULE: POTASSIUM CHANNEL SUBFAMILY K MEMBER 4;
>>     COMPND   3 CHAIN: A, B;
>>     COMPND  10 MOL_ID: 2;
>>     COMPND  11 MOLECULE: ANTIBODY FAB FRAGMENT LIGHT CHAIN;
>>     COMPND  12 CHAIN: D, F;
>>     COMPND  13 ENGINEERED: YES;
>>     COMPND  14 MOL_ID: 3;
>>     COMPND  15 MOLECULE: ANTIBODY FAB FRAGMENT HEAVY CHAIN;
>>     COMPND  16 CHAIN: E, G;
>>
>>     I would like the chain IDs, but only those following the text
>>     heading "ANTIBODY FAB FRAGMENT", i.e. I need to create a list with
>>     D,F,E,G  which excludes A,B which have a non-antibody text heading.
>>     I am using the following syntax:
>>
>>     with open(filename) as file:
>>          scanfile=file.readlines()
>>          for line in scanfile:
>>              if line[0:6]=='COMPND' and 'FAB FRAGMENT' in line: continue
>>              elif line[0:6]=='COMPND' and 'CHAIN' in line:
>>                  print line
>>
>>
>>     But this yields:
>>
>>     COMPND   3 CHAIN: A, B;
>>     COMPND  12 CHAIN: D, F;
>>     COMPND  16 CHAIN: E, G;
>>
>>     I would like to ignore the first line since A,B correspond to
>>     non-antibody text headings, and instead want to extract only D,F &
>>     E,G whose text headings are specified as antibody fragments.
>>
>>     Many thanks,
>>     Spyros
>>
>> Since the identifier and the item that you want to keep are on different
>> lines, you'll need to set a "flag".
>>
>> with open(filename) as file:
>>      scanfile=file.readlines()
>>      flag = 0
>>      for line in scanfile:
>>          if line[0:6]=='COMPND' and 'FAB FRAGMENT' in line: flag = 1
>>          elif line[0:6]=='COMPND' and 'CHAIN' in line and flag = 1:
>>              print line
>>              flag = 0
>>
>>
>> Notice that the flag is set to 1 only on "FAB FRAGMENT", and it's reset
>> to 0 after the next "CHAIN" line that follows the "FAB FRAGMENT" line.
>>
>>
>> AR
>>
>>
>>
> Notice that this code won't run due to a syntax error.
>
> --
> Cheers.
>
> Mark Lawrence