[Tutor] Logical Structure of Snippet

Spyros Charonis s.charonis at gmail.com
Mon May 23 22:53:30 CEST 2011


Hello List,

I'm trying to read some sequence files and modify them to a particular
format. These files are structured something like:

>P1; ICA1_HUMAN
AAEVDTG..... (A very long sequence of letters)
>P1;ICA1_BOVIN
TRETG....(A very long sequence of letters)
>P1;ICA2_HUMAN
WKH.....(another sequence)

I also read a database file containing the information I need to modify
my sequence files. I must extract one data field from the database (I
have done this) and place it in the sequence file (structure shown
above). The relevant database fields look like this:

tt; ICA1_HUMAN       Description
tt; ICA1_BOVIN         Description
tt; ICA2_HUMAN       Description
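
For illustration, here is a minimal sketch of how fields like these could be stored for later lookup. It assumes every entry has the form "tt; <identifier> <description>", with the identifier as the first whitespace-separated token after "tt;". The names parse_tt_line, tt_lines and tt_by_id are hypothetical and do not come from the existing extraction code:

```python
def parse_tt_line(line):
    """Split 'tt; ICA1_HUMAN   Description' into ('ICA1_HUMAN', 'Description')."""
    # Everything after the first ';' is the identifier plus its description.
    body = line.partition(";")[2].strip()
    parts = body.split(None, 1)  # split once on any run of whitespace
    identifier = parts[0]
    description = parts[1] if len(parts) > 1 else ""
    return identifier, description


# Sample entries copied from the listing above (assumed already extracted).
tt_lines = [
    "tt; ICA1_HUMAN       Description",
    "tt; ICA1_BOVIN         Description",
    "tt; ICA2_HUMAN       Description",
]

# A dictionary keyed by identifier lets a header be matched in one lookup.
tt_by_id = dict(parse_tt_line(line) for line in tt_lines)
```

With the entries keyed this way, tt_by_id["ICA1_BOVIN"] returns the description for that identifier without scanning the whole list.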

What I would like to do is extract the tt; fields (I already have code
for that), then read through the sequence file and insert the tt; field
corresponding to each >P1 header directly underneath that header.
Basically, I need a new line every time >P1 occurs in the sequence file,
and I need to paste the corresponding tt; field into that new line (for
the header >P1; ICA1_HUMAN, that would be ICA1_HUMAN   Description, etc.).

The pseudocode would go like this:

for line in sequence file:
   if line.startswith('>P1; ICA ....'):
       make a new line
       go to list with extracted tt; fields*
       find the one with the same query (tt; ICA1 ...)*
       insert this field in the new line

The steps marked * are the ones I am not sure how to implement. What
logical structure would I need so that Python matches a tt; field (I
already have the list of entries) whenever it finds a header with the
same content?
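
One possible structure, sketched under assumptions rather than offered as a definitive answer: index the tt; entries in a dictionary keyed by identifier, then copy the sequence file line by line and add the matching entry after each >P1 header. The sketch assumes both inputs are lists of lines without trailing newlines, that the identifier is the first token after "tt;" or ">P1;", and that the sequences shown are shortened placeholders. The function names header_id and insert_tt_fields are hypothetical:

```python
def header_id(line):
    """Return the identifier from a '>P1;ICA1_HUMAN' header, else None."""
    prefix = ">P1;"
    if not line.startswith(prefix):
        return None
    # strip() tolerates both '>P1; ICA1_HUMAN' and '>P1;ICA1_BOVIN'.
    return line[len(prefix):].strip()


def insert_tt_fields(seq_lines, tt_entries):
    """Return seq_lines with the matching tt; field added under each header."""
    # Index the tt; entries by identifier (first token after 'tt;').
    tt_by_id = {}
    for entry in tt_entries:
        body = entry.partition(";")[2].strip()
        if body:
            tt_by_id[body.split()[0]] = body

    out = []
    for line in seq_lines:
        out.append(line)
        ident = header_id(line)
        # Only headers with a known identifier get an extra line.
        if ident is not None and ident in tt_by_id:
            out.append(tt_by_id[ident])
    return out


# Placeholder data modelled on the examples above.
seq_lines = [">P1; ICA1_HUMAN", "AAEVDTG", ">P1;ICA1_BOVIN", "TRETG"]
tt_entries = [
    "tt; ICA1_HUMAN       Description",
    "tt; ICA1_BOVIN         Description",
]
result = insert_tt_fields(seq_lines, tt_entries)
```

Headers with no matching entry are passed through unchanged here. Whether that case should instead raise an error depends on how complete the database is.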

Apologies for the verbosity, but I did want to be clear as it is quite
specific.

S.