remove last 76 letters from string

PeroMHC macmanes at gmail.com
Thu Aug 6 01:54:46 CEST 2009


Hi all, here is the problem: I have a FASTA file (a format used for DNA
sequence data) that looks like this:

...
>gnl|SRA|SRR019045.10.1 SL-XAY_956090708:2:1:0:1028.1 length=152
NCTTTTTTTATTTTTTGTATAAATGAAGTTTCACTATATCGGACGAGCGGTTCAGCAGTCATTCCGAGAC
CGATATAGTGAAACTTCATTTCTACAAAAANTACCAAACGTCGCTCGGCAGAGCGTCGTGTTGGGCAAGA
GAGTAGCACTCG
>gnl|SRA|SRR019045.11.1 SL-XAY_956090708:2:1:0:1151.1 length=152
NGGTNTGGNNNNCNCCNTNCTNCNNCNTCANCCTCCNGTCNCANNCCNCNTNNNNNCNNNNNCNNTNCTT
CTNCNNTCTCCATTCCTTCTTNATAGCCTGCTCCANCGCACGTTGAACCTTCTGCACCACGAACGCACTC
ACACCACTCATC
>gnl|SRA|SRR019045.12.1 SL-XAY_956090708:2:1:0:1197.1 length=152
NGTCGGGTCTTCGCTATCACTGGACTGCTCCCATCAGCTATAGGTCCTCCCCGCCACACCCCATGCCCAC
CGCCTATCCACGTCTGTCACAACCTCATACATCAGACAGTCACACTTACCAACATATCCAAGCACCTCAA
GCAACACATCAT
...

This snippet represents 3 individual DNA sequences. Each sequence is
identified by a header line starting with >, and the sequence itself is
wrapped across the lines that follow.
The complete file has about 10 million individual sequences.

The task is simple enough: I want to read in this data, cut out the
last 76 letters (nucleotides) of each individual sequence, and write
them to a new text file in a similar format.
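
A minimal sketch of one way this could be done, assuming "cut out" means
keeping the last 76 letters of each record (to drop them instead, the
slice would be seq[:-n]). The function names read_fasta and write_tails
and the 70-column wrap width are illustrative choices, not part of any
library; the width simply matches the sample above. Records are streamed
one at a time, so the 10 million sequences never sit in memory together.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines.

    Sequence lines belonging to one record are joined into a single string.
    Lines that appear before the first '>' header are ignored.
    """
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line
            chunks = []
        elif header is not None and line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def write_tails(in_lines, out, n=76, width=70):
    """Write the last n letters of every record to out, wrapped at width.

    Assumptions: the header is copied unchanged (so any 'length=...' field
    in it still describes the original sequence), and a sequence shorter
    than n letters is written out whole.
    """
    for header, seq in read_fasta(in_lines):
        tail = seq[-n:]
        out.write(header + "\n")
        for i in range(0, len(tail), width):
            out.write(tail[i:i + width] + "\n")


# Hypothetical usage; the file names are placeholders:
# with open("reads.fasta") as src, open("tails.fasta", "w") as dst:
#     write_tails(src, dst)
```

Because open file objects iterate line by line, passing one directly as
in_lines keeps memory use roughly constant regardless of file size.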

Any help on how to do this would be appreciated.
Thanks!


