[Tutor] Analysing genetic code (DNA) using python
David Heiser
David.Heiser at intelliden.com
Mon Mar 6 22:15:30 CET 2006
Here's one approach to the problem (using bogus codon values).
-----Original Message-----
From: tutor-bounces at python.org [mailto:tutor-bounces at python.org] On
Behalf Of sjw28
Sent: Monday, March 06, 2006 8:37 AM
To: tutor at python.org
Subject: [Tutor] Analysing genetic code (DNA) using python
I have many notepad documents that all contain long chunks of genetic
code. They look something like this:
atggctaaactgaccaagcgcatgcgtgttatccgcgagaaagttgatgcaaccaaacag
tacgacatcaacgaagctatcgcactgctgaaagagctggcgactgctaaattcgtagaa
agcgtggacgtagctgttaacctcggcatcgacgctcgtaaatctgaccagaacgtacgt
ggtgcaactgtactgccgcacggtactggccgttccgttcgcgtagccgtatttacccaa
Basically, I want to design a program using Python that can open and
read these documents. However, I want them to be read 3 base pairs at a
time (to analyse them codon by codon) and to look up the value assigned
to each codon. An example of this is below:
** If the three base pairs were UUU the value assigned to it (from the
codon value table) would be 0.296
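
A minimal sketch of the codon-by-codon lookup described above, assuming the sequence is already in uppercase RNA letters. The names CODON_VALUES, split_codons and codon_values are hypothetical, and only the UUU value (0.296) comes from the question; the other table entries are placeholders, not real data.

```python
# Placeholder codon value table. Only UUU = 0.296 is taken from the
# question; the other entries are made-up values for illustration.
CODON_VALUES = {"UUU": 0.296, "AUG": 1.0, "GCU": 0.5}


def split_codons(seq):
    """Return consecutive 3-letter codons, dropping any trailing 1-2 letters."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def codon_values(seq, table=CODON_VALUES):
    """Look up the value of every codon in seq.

    Raises KeyError if a codon is missing from the table.
    """
    return [table[codon] for codon in split_codons(seq)]
```

Stepping through the string with range(0, usable, 3) is what guarantees that every letter is consumed exactly once, three at a time.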
The program has to read the whole sequence three letters at a time. I
then want to get the value for each codon, multiply them all together
and raise the product to the power of 1 / the length of the sequence in
codons (which is the length of the whole sequence divided by three).
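
The calculation described here is the geometric mean of the codon values. A hedged sketch follows; the function name geometric_mean is hypothetical. Instead of multiplying the values directly, it sums their logarithms. This is mathematically equivalent, and it avoids the product underflowing to 0.0 on long sequences. It assumes every value is greater than zero.

```python
import math


def geometric_mean(values):
    """Return (v1 * v2 * ... * vn) ** (1 / n), computed via logarithms.

    Assumes every value is > 0; math.log raises ValueError otherwise.
    """
    values = list(values)
    if not values:
        raise ValueError("need at least one codon value")
    log_sum = sum(math.log(v) for v in values)
    return math.exp(log_sum / len(values))
```

For example, geometric_mean([0.25, 1.0]) is the square root of 0.25, that is 0.5 (up to floating-point rounding).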
However, to make things even more complicated, the notepad sequences
are in lowercase and the codon value table is in uppercase, so the
sequences need to be converted into uppercase. Also, the Ts in the DNA
sequences need to be changed to Us (again to match the codon value
table). And finally, before the DNA sequences are read and analysed I
need to remove the first 50 codons (i.e. the first 150 letters) and the
last 20 codons (the last 60 letters) from the DNA sequence. I've also
been having problems ensuring the program reads ALL the sequence 3
letters at a time.
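
The clean-up steps above could be sketched as follows. The function name prepare_sequence and its parameters are hypothetical. It assumes the file contains nothing but sequence letters and whitespace, and that the trimming is done by letter count (150 from the start, 60 from the end), as the question words it.

```python
def prepare_sequence(raw, skip_start=50, skip_end=20):
    """Normalise a raw DNA string and return the codons left after trimming.

    raw        -- file contents, possibly spread over several lines
    skip_start -- number of codons to drop from the start
    skip_end   -- number of codons to drop from the end
    """
    # Join the lines, then convert to the uppercase RNA alphabet.
    seq = "".join(raw.split()).upper().replace("T", "U")
    # Trim by letters: 3 letters per codon at each end.
    seq = seq[3 * skip_start:len(seq) - 3 * skip_end]
    # Walk the remainder three letters at a time; an incomplete
    # trailing codon (1 or 2 letters) is ignored.
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]
```

A file could be fed in with something like prepare_sequence(open(path).read()), where path is whichever document is being analysed. A sequence shorter than 70 codons simply yields an empty list.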
I've tried various ways of doing this but keep coming unstuck along the
way. Has anyone got any suggestions for how they would tackle this
problem?
Thanks for any help received!
-------------- next part --------------
A non-text attachment was scrubbed...
Name: GenCode.py
Type: application/octet-stream
Size: 1243 bytes
Desc: GenCode.py
Url : http://mail.python.org/pipermail/tutor/attachments/20060306/5536fbf5/attachment.obj
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: GenCode.txt
Url: http://mail.python.org/pipermail/tutor/attachments/20060306/5536fbf5/attachment.txt