[Tutor] Taking FASTA file as an input in Python 3
Mihir Kharate
kharatemihir at gmail.com
Sun Oct 20 13:00:53 EDT 2019
Hello,
I want my Python program to ask for an input that accepts FASTA files.
FASTA files are a type of text file that we use in bioinformatics. The
first line in a FASTA file is a description of the gene it encodes.
The data starts on the second line. An example of the FASTA format would
be:
>NC_003423.3:c429013-426160 Schizosaccharomyces pombe chromosome II, complete sequence
ATGGAAAAAATAAAACTTTTAAATGTAAAAACTCCCAATCATTATACTATTATTTTCAAGGTGGTGGCAT
ACTACAGCGCACTTCAACCTAACCAAAACGAACTACGAAAAGTACGAATGCTTGCTGCTGAAAGTTCTAA
TGTTAATGGATTATTTAAATCAGTAGTTGCTGTTTTAGATTGTGATGATGAAACGGTACTATTTTGAATT
ATCAATTGGGTTTGCTGACTTTGTTTACCTAGAAAGAATTGTTCATTAAAAATGACGGGAAAGCTTTGAG
TTTTCCGTATGACTGGAAGCTGGCAACTCATGTTATATGCGATGACTTTTCCTCTCCTAATGTACAAGAA
I found the following code online and tried printing the result to see
whether the first line is skipped:
> DNA_sequence = open ("sequence.fasta" , "r")
> DNA_sequence.readline()
> print (DNA_sequence)
However, this prints the following:
> <_io.TextIOWrapper name='sequence.fasta' mode='r' encoding='cp1252'>
What I am interested in from the FASTA file is the DNA sequence, which
starts on the second line. I want to be able to use this sequence as if it
were a string (so that it can be used with string methods like maketrans,
etc., which I have in my code).
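To make the goal concrete, here is a minimal sketch of the behaviour I am after, assuming the file holds a single record (one description line followed by sequence lines). The names read_fasta_sequence and complement are hypothetical, and the tiny temporary file stands in for a real sequence.fasta:

```python
import os
import tempfile


def read_fasta_sequence(path):
    """Return the sequence of a single-record FASTA file as one string."""
    with open(path, "r") as handle:
        handle.readline()  # discard the '>' description line
        # Strip the newline from each remaining line and join them.
        return "".join(line.strip() for line in handle)


def complement(sequence):
    """Complement a DNA string using a str.maketrans table."""
    table = str.maketrans("ACGT", "TGCA")
    return sequence.translate(table)


if __name__ == "__main__":
    # Write a tiny example file so the sketch runs on its own (assumption:
    # a real run would pass the path to an actual FASTA file instead).
    with tempfile.NamedTemporaryFile("w", suffix=".fasta", delete=False) as tmp:
        tmp.write(">example description line\nATGGAA\nAAAATA\n")
    try:
        dna = read_fasta_sequence(tmp.name)
        print(dna)
        print(complement(dna))
    finally:
        os.remove(tmp.name)
```

The point is that the value returned is an ordinary str built from the lines after the header, rather than the file object itself, so methods such as translate work on it directly.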
Also, it would be more convenient to supply a FASTA file just by dragging
and dropping it into the shell. Any suggestions?
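As a rough sketch of what I mean: in many terminals, dropping a file onto the window pastes its path as text, sometimes wrapped in quotes, so the program could read that text with input() and tidy it up. This assumes the shell in use behaves that way; clean_dropped_path and ask_for_fasta_path are hypothetical names, and the example path is made up:

```python
def clean_dropped_path(raw):
    """Strip whitespace and one pair of surrounding quotes from a pasted path."""
    path = raw.strip()
    # Remove matching single or double quotes that some terminals add.
    if len(path) >= 2 and path[0] == path[-1] and path[0] in "\"'":
        path = path[1:-1]
    return path


def ask_for_fasta_path():
    """Prompt for a path; the user may type it or drop a file onto the window."""
    raw = input("Drag a FASTA file here and press Enter: ")
    return clean_dropped_path(raw)


# What the cleanup does to a quoted path with a trailing space:
print(clean_dropped_path('"C:\\Users\\me\\my data\\sequence.fasta" '))
```

The cleaned path could then be passed to open() in the usual way.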
Thanks,
~Mihir