[Fwd: Re: [Tutor] searching for data in one file from another]
melnyk at gmail.com
Mon Nov 8 14:58:11 CET 2004
Thanks for the guidance so far.
The fasta file is laid out so that one line begins with > and contains the
name of the exon, the transcript it comes from, the gene, etc. The next
line down contains the actual sequence data (ACCCAGCAAAATGG etc.).
I needed to match on the "title line" (the one beginning with >) and then
remove that line and the one following it. Or, to describe it more
correctly, not write that line and the following one into the new file.
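In outline, the filtering step looks something like the sketch below. It is written for current Python 3 rather than taken from the script itself, and the names in it (filter_fasta, the sample records) are made up for illustration. It also assumes the exon name is the first '|'-separated field of the title line, which may not match every fasta file.

```python
import io


def filter_fasta(lines, excise):
    """Yield FASTA lines, skipping records whose exon name is in `excise`.

    `lines` is any iterable of text lines; `excise` is a set of names.
    Assumption: the exon name is the first '|'-separated field after '>'.
    """
    skip = False
    for line in lines:
        if line.startswith('>'):
            # A title line starts a new record: decide whether to drop it.
            name = line[1:].split('|')[0].strip()
            skip = name in excise
        if not skip:
            yield line


# Made-up sample data, for illustration only.
sample = io.StringIO(
    ">exon1|transcriptA|geneX\n"
    "ACCCAGCAAAATGG\n"
    ">exon2|transcriptB|geneY\n"
    "TTGACCA\n"
)
kept = list(filter_fasta(sample, {"exon1"}))
print("".join(kept), end="")
```

Because the skip flag stays set until the next title line, this would also drop sequences that wrap over several lines, not only the single line after the title.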
I modified things to:
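from sets import Set

def deleteExons():
    # fname2, exons_to_delete and WFILE are defined elsewhere in the script
    f = open(fname2)
    sExcise = Set()
    for line in open(exons_to_delete):
        sExcise.add(line.strip())
    exon = None
    for line in f:
        if line.startswith('>'):
            # assumes the exon name is the first '|'-separated field
            exon = line[1:].split('|')[0]
        if exon in sExcise:
            continue  # skip the title line and its sequence line
        yield line

if __name__ == '__main__':
    for line in deleteExons():
        print >> WFILE, line,  # write new file minus the redundant exons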
Everything seems to be working now. The original fasta file was approx.
85 MB and is now down to 47 MB after the information matching the
excise file was removed.
I am moving on to my next steps now but am still interested in comments
on how this could be done more effectively.
Thanks again to all for their input.
On Sat, 06 Nov 2004 01:12:26 -0500, Kent Johnson
<kent_johnson at skillsoft.com> wrote:
> At 12:52 AM 11/6/2004 -0500, Rich Krauter wrote:
> >Thanks very much for the reply. Right on the money and within a few
> >minutes, as usual. I'm starting to think you're an automated help system.
> >The OP should find these suggestions helpful for cleaning up his code.
> Just call me the Kent-bot!