[Tutor] Analysing genetic code (DNA) using python

David Heiser David.Heiser at intelliden.com
Mon Mar 6 22:15:30 CET 2006


Here's one approach to the problem (using bogus codon values).



-----Original Message-----
From: tutor-bounces at python.org [mailto:tutor-bounces at python.org] On
Behalf Of sjw28
Sent: Monday, March 06, 2006 8:37 AM
To: tutor at python.org
Subject: [Tutor] Analysing genetic code (DNA) using python



I have many notepad documents that all contain long chunks of genetic 
code. They look something like this: 

atggctaaactgaccaagcgcatgcgtgttatccgcgagaaagttgatgcaaccaaacag 
tacgacatcaacgaagctatcgcactgctgaaagagctggcgactgctaaattcgtagaa 
agcgtggacgtagctgttaacctcggcatcgacgctcgtaaatctgaccagaacgtacgt 
ggtgcaactgtactgccgcacggtactggccgttccgttcgcgtagccgtatttacccaa 


Basically, I want to design a program using Python that can open and 
read these documents. However, I want them to be read 3 base pairs at a 
time (to analyse them codon by codon) and to find the value assigned to 
each codon. An example of this is below: 


** If the three base pairs were UUU the value assigned to it (from the 
codon value table) would be 0.296 
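
The codon-by-codon lookup described above could be sketched roughly as
follows. This is only an illustration, not the contents of the attached
GenCode.py: the names CODON_VALUES, split_codons and codon_values are
hypothetical, and every table entry except UUU = 0.296 is a made-up
placeholder.

```python
# Hypothetical codon value table. Only UUU = 0.296 comes from the message;
# the other entries are placeholders for illustration.
CODON_VALUES = {
    "UUU": 0.296,
    "AUG": 1.0,   # placeholder value
    "GCU": 0.5,   # placeholder value
}


def split_codons(seq):
    """Return seq as a list of 3-letter codons.

    Any incomplete fragment at the end (1 or 2 letters) is dropped.
    """
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def codon_values(seq, table=CODON_VALUES):
    """Look up the value of each codon; raises KeyError for an unknown codon."""
    return [table[codon] for codon in split_codons(seq)]
```

Stepping through the string with range(0, usable, 3) is what guarantees
that the whole sequence is consumed three letters at a time.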


The program has to read the whole sequence three base pairs at a time. I 
then want to get the value for each codon, multiply the values together, 
and raise the product to the power of 1 / the length of the sequence in 
codons (which is the length of the whole sequence divided by three). 
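
The calculation described here is the geometric mean of the codon values.
A minimal sketch, assuming every value is strictly positive: the function
name geometric_mean is hypothetical, and summing logarithms is a
substitute for multiplying directly, chosen because the product of
hundreds of small values can underflow to 0.0 in floating point.

```python
import math


def geometric_mean(values):
    """Return (v1 * v2 * ... * vn) ** (1 / n), computed via logarithms.

    Assumes all values are > 0; math.log raises ValueError otherwise.
    """
    if not values:
        raise ValueError("need at least one codon value")
    log_sum = sum(math.log(v) for v in values)
    return math.exp(log_sum / len(values))
```

For short lists this gives the same result, up to rounding, as
multiplying the values and raising the product to the power 1 / n.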


However, to make things even more complicated, the notepad sequences 
are in lowercase and the codon value table is in uppercase, so the 
sequences need to be converted into uppercase. Also, the Ts in the DNA 
sequences need to be changed to Us (again to match the codon value 
table). And finally, before the DNA sequences are read and analysed I 
need to remove the first 50 codons (i.e. the first 150 letters) and the 
last 20 codons (the last 60 letters) from the DNA sequence. I've also 
been having problems ensuring the program reads ALL the sequence 3 
letters at a time. 
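
The clean-up steps listed above (join the lines, convert to uppercase,
change T to U, drop the first 50 and last 20 codons) might look something
like this. The function name prepare_sequence and its parameter names are
hypothetical, and the sketch assumes the file contains nothing but
sequence letters and whitespace.

```python
def prepare_sequence(text, skip_start=50, skip_end=20):
    """Normalise raw file text and trim codons from both ends.

    text       -- the raw contents of one sequence file
    skip_start -- number of codons to drop from the start
    skip_end   -- number of codons to drop from the end
    """
    # Remove newlines and spaces so the codons run on across line breaks.
    seq = "".join(text.split())
    # Match the codon value table: uppercase, with U in place of T.
    seq = seq.upper().replace("T", "U")
    start = skip_start * 3
    # Never let the end fall before the start on very short sequences.
    end = max(start, len(seq) - skip_end * 3)
    return seq[start:end]
```

The text itself could come from something like open(filename).read(),
where filename is whatever the notepad document is called. Joining the
lines before splitting into codons matters whenever a line's length is
not a multiple of three, since otherwise a codon could be cut in two at
the line break.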


I've tried various ways of doing this but keep coming unstuck along the 
way. Has anyone got any suggestions for how they would tackle this 
problem? 
Thanks for any help received! 



-------------- next part --------------
A non-text attachment was scrubbed...
Name: GenCode.py
Type: application/octet-stream
Size: 1243 bytes
Desc: GenCode.py
Url : http://mail.python.org/pipermail/tutor/attachments/20060306/5536fbf5/attachment.obj 
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: GenCode.txt
Url: http://mail.python.org/pipermail/tutor/attachments/20060306/5536fbf5/attachment.txt 

