concatenate fasta file

Jean-Michel Pichavant jeanmichel at sequans.com
Fri Feb 12 17:49:15 CET 2010


PeroMHC wrote:
> Hi All, I have a simple problem that I hope somebody can help with. I
> have an input file (a FASTA file) that I need to edit.
>
> Input file format
>
> >name 1
> tactcatacatac
> >name 2
> acggtggcat
> >name 3
> gggtaccacgtt
>
> I need to concatenate the sequences so that they look like this:
>
> >concatenated
> tactcatacatacacggtggcatgggtaccacgtt
>
> thanks. Matt
>   
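A solution using a regular expression:

import re

found = []
with open('seqfile.txt') as seqfile:
    for line in seqfile:
        # Keep only lines made up entirely of a, c, g, t (either case);
        # the '>name' header lines do not match and are skipped.
        found += re.findall('^[acgtACGT]+$', line)

print(found)
 > ['tactcatacatac', 'acggtggcat', 'gggtaccacgtt']

print(''.join(found))
 > tactcatacatacacggtggcatgggtaccacgtt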
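
The regular expression above silently drops any sequence line containing a letter other than a, c, g or t (an ambiguity code such as N, for instance). A possible regex-free variant, as a rough sketch only: it assumes every header line starts with '>' and that every other line is sequence data. The function name concatenate_fasta and the default record name are made up for illustration; they are not from the original post.

```python
import sys


def concatenate_fasta(lines, name="concatenated"):
    """Join all sequence lines into one FASTA record (illustrative helper)."""
    # Assumption: header lines start with '>'; anything else is sequence data.
    # strip() removes newlines, so blank lines contribute nothing.
    parts = [line.strip() for line in lines if not line.startswith(">")]
    return ">{0}\n{1}\n".format(name, "".join(parts))


# Sample data taken from the question.
sample = [
    ">name 1\n", "tactcatacatac\n",
    ">name 2\n", "acggtggcat\n",
    ">name 3\n", "gggtaccacgtt\n",
]
sys.stdout.write(concatenate_fasta(sample))
```

Since an open file iterates over its lines, the same function should also accept the file object for seqfile.txt directly in place of the sample list.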


JM


