[Tutor] second if

Peter Otten __peter__ at web.de
Mon Feb 10 12:24:58 CET 2014


rahmad akbar wrote:

> hi guys, i am trying to understand this code: i understand the first if
> statement (if line.startswith..) in read_fasta function but couldn't
> understand the next one (if index >=...). thanks in advance!!

Every time a line starts with a ">" sign the current Fasta instance is
stored in the items list and a new Fasta instance is created.

But the first time a line starts with ">" there is not yet a "current" 
instance which can be appended to the list. The index variable keeps track 
of the number of Fasta instances created, so the first time around the index 
is 0 and the

if index >= 1:
   items.append(aninstance)

suite is not executed. 
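
To see the guard in action, here is a minimal sketch that only records, for
each header line, the value of index and whether the append would run. The
helper name trace_appends and the sample lines are made up for illustration;
they are not part of your script.

```python
def trace_appends(lines):
    """For each ">" line, record (index, would_append) as the loop sees it."""
    index = 0
    trace = []
    for line in lines:
        if line.startswith(">"):
            # Same test as in read_fasta: False only for the first header,
            # because no instance exists yet at that point.
            trace.append((index, index >= 1))
            index += 1
    return trace

# Hypothetical sample input: three records, one sequence line each.
lines = [">seq1\n", "ACGT\n", ">seq2\n", "GGCC\n", ">seq3\n", "TTAA\n"]
print(trace_appends(lines))  # [(0, False), (1, True), (2, True)]
```

The first header sees index 0, so nothing is appended; every later header
appends the instance built for the previous record.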

By the way, can you figure out what happens if the file contains no lines at 
all? How would you need to change the script to avoid raising an exception 
in this case?
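
If you want to compare notes afterwards: with no lines the loop body never
runs, so aninstance is never assigned and the final items.append(aninstance)
raises an UnboundLocalError (a kind of NameError). One possible fix, sketched
here under my own assumptions, is to start with "no current instance" and
only append when there is one. The name read_fasta_safe is hypothetical, and
silently skipping sequence lines that appear before the first header is my
choice, not something your script does.

```python
class Fasta:
    def __init__(self, name, sequence):
        self.name = name
        self.sequence = sequence

def read_fasta_safe(lines):
    """Like read_fasta, but returns [] instead of failing on empty input."""
    items = []
    current = None  # no Fasta instance exists until the first ">" line
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # Store the previous record, if there is one.
            if current is not None:
                items.append(current)
            current = Fasta(line, "")
        elif current is not None:
            # Assumption: sequence lines before any header are ignored.
            current.sequence += line
    # The last record is still pending when the loop ends.
    if current is not None:
        items.append(current)
    return items
```

Using None as the "nothing yet" marker also makes the index counter
unnecessary, since the same test covers both the first header and the empty
file.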
 
> import sys
> #class declaration with both attributes we need
> class Fasta:
>     def __init__(self, name, sequence):
>         #this will store the sequence name
>         self.name = name
>         #this  will store the sequence itself
>         self.sequence = sequence
> 
> #this function will receive the list with the file
> #contents, create instances of the Fasta class as
> #it scans the list, putting the sequence name on the
> #first attribute and the sequence itself on the second
> #attribute
> def read_fasta(file):
>     #we declare an empty list that will store all
>     #Fasta class instances generated
>     items = []
>     index = 0
>     for line in file:
>     #we check to see if the line starts with a > sign
>         if line.startswith(">"):
>            #if so and our counter is at least 1
>            #we add the created class instance to our list
>            #a counter of at least 1 means we are reading
>            #from sequences 2 and above
>            if index >= 1:
>                items.append(aninstance)
>            index+=1
>            #we add the line contents to a string
>            name = line[:-1]
>            #and initialize the string to store the sequence
>            seq = ''
>            #this creates a class instance and we add the attributes
>            #which are the strings name and seq
>            aninstance = Fasta(name, seq)
>         else:
>            #the line does not start with > so it has to be
>            #a sequence line, so we increment the string and
>            #add it to the created instance
>             seq += line[:-1]
>             aninstance = Fasta(name, seq)
> 
>     #the loop above appends every sequence except the
>     #last one, so we need to add it
>     #after the loop ends
>     items.append(aninstance)
>     #a list with all read sequences is returned
>     return items
> 
> fastafile = open(sys.argv[1], 'r').readlines()
> mysequences = read_fasta(fastafile)
> 
> print mysequences
> 
> for i in mysequences:
>     print i.name
> 



