[Tutor] Logical Structure of Snippet
Spyros Charonis
s.charonis at gmail.com
Mon May 23 22:53:30 CEST 2011
Hello List,
I'm trying to read some sequence files and modify them to a particular
format. These files are structured something like:
>P1; ICA1_HUMAN
AAEVDTG..... (A very long sequence of letters)
>P1;ICA1_BOVIN
TRETG....(A very long sequence of letters)
>P1;ICA2_HUMAN
WKH.....(another sequence)
I also read a database file containing the information I need to modify my
sequence files.
I must extract one of the data fields from the database (this part is done)
and place it in the sequence file (structure shown above). The relevant
database fields look like:
tt; ICA1_HUMAN Description
tt; ICA1_BOVIN Description
tt; ICA2_HUMAN Description
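To make the lookup concrete, here is a minimal sketch of how such fields could be held in a dictionary keyed by identifier. It assumes each record is a single line of the form "tt; ID Description" and that the ID contains no spaces. The names parse_tt_fields, tt_lines and tt_by_id are hypothetical; this is not the extraction code mentioned above.

```python
def parse_tt_fields(lines):
    """Map each identifier to its full 'ID Description' text."""
    fields = {}
    for line in lines:
        # Skip anything that is not a tt; record
        if not line.startswith("tt;"):
            continue
        text = line[len("tt;"):].strip()
        if not text:
            continue
        # Assumption: the identifier is the first whitespace-separated token
        identifier = text.split(None, 1)[0]
        fields[identifier] = text
    return fields


# Sample records copied from the structure shown above
tt_lines = [
    "tt; ICA1_HUMAN Description",
    "tt; ICA1_BOVIN Description",
    "tt; ICA2_HUMAN Description",
]
tt_by_id = parse_tt_fields(tt_lines)
print(tt_by_id["ICA1_HUMAN"])
```

A dictionary is used here so that finding the field for a given header is a single key lookup instead of a scan through a list.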
What I would like is to extract the tt; fields (I already have code for
that) and then read through the sequence file and insert the tt; field
corresponding to each >P1 header right underneath that header. Basically,
I need a new line every time >P1 occurs in the sequence file, and I need
to paste its corresponding tt; field on that new line (for >P1; ICA1_HUMAN,
that would be ICA1_HUMAN Description, etc.).
The pseudocode would go like this:
for line in sequence file:
    if line.startswith('>P1; ICA ....'):
        make a new line
        go to list with extracted tt; fields*
        find the one with the same query (tt; ICA1 ...)*
        insert this field in the new line
The steps marked * are the ones I am not sure how to implement. What
logical structure would I need to make Python match a tt; field (I already
have the list of entries) whenever it finds a header with the same content?
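One possible structure for the two starred steps, as a sketch under stated assumptions: the tt; fields are already held in a dictionary keyed by identifier, and the identifier is whatever follows ">P1;" on the header line, with or without a space. The names insert_tt_fields, tt_by_id and seq_lines are made up for illustration, and emitting an empty line when no field matches is also an assumption.

```python
def insert_tt_fields(seq_lines, tt_by_id):
    """Yield the sequence lines, adding the matching tt; text under each >P1 header."""
    for line in seq_lines:
        line = line.rstrip("\n")
        yield line
        if line.startswith(">P1;"):
            # Everything after '>P1;' is taken as the identifier
            identifier = line[len(">P1;"):].strip()
            # Assumption: a header with no matching field gets an empty line
            yield tt_by_id.get(identifier, "")


# Hypothetical lookup table and a shortened sample of the sequence file
tt_by_id = {
    "ICA1_HUMAN": "ICA1_HUMAN Description",
    "ICA1_BOVIN": "ICA1_BOVIN Description",
}
seq_lines = [
    ">P1; ICA1_HUMAN\n",
    "AAEVDTG\n",
    ">P1;ICA1_BOVIN\n",
    "TRETG\n",
]
result = list(insert_tt_fields(seq_lines, tt_by_id))
for out_line in result:
    print(out_line)
```

Because the function only iterates over lines, an open file object could be passed in place of the list, and the output written to a new file line by line.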
Apologies for the verbosity, but I did want to be clear as it is quite
specific.
S.