[Tutor] number of mismatches in a string
Jerry Hill
malaclypse2 at gmail.com
Fri Mar 2 23:00:50 CET 2012
On Fri, Mar 2, 2012 at 2:11 PM, Hs Hs <ilhs_hs at yahoo.com> wrote:
> Hi:
> I have the following table and I am interested in calculating mismatch
> ratio. I am not completely clear how to do this and any help is deeply
> appreciated.
>
> Length Matches
> 77 24A0T9T36
> 71 25^T9^T37
> 60 25^T9^T26
> 62 42A19
>
>
> In length column I have length of the character string.
> In the second column I have the matches against my reference string.
>
>
> In the first case, where 77 is the length, reading the matches from left to
> right: the first 24 matched my reference string, followed by an extra
> character A, a null (does not count as a problem), an extra T, 9 matches,
> an extra T and 36 matches. In total there are 3 mismatches.
>
> In case 2, I lost 2 characters (^ = loss of character compared to reference
> sentence) -
>
> TOMISAGOODBOY
> T^MISAGOOD^OY (here I lost 2 characters) = I have 2 mismatches
> TOMISAGOOODBOOY (here I have 2 extra characters O and O) = I have two
> mismatches
>
>
> In case 4: I have 42 matches, extra A and 19 matches = so I have 1 mismatch
>
>
> How can I get that mismatch number from the matches string?
> 1. I have to count how many A or T or G or C (believe me, only these 4
> letters will appear in this; I will not see Z or B or K etc.)
> 2. ^T or ^A or ^G or ^C will also be a mismatch
>
>
> desired output:
>
> Length Matches mismatches
> 77 24A0T9T36 3
> 71 25^T9^T37 2
> 60 25^T9^T26 2
> 62 42A19 1
> 10 6^TTT1 3
>
It looks like all you need to do is count the number of A, T, C, and G
characters in your Matches column. Maybe something like this:
differences = [
    [77, '24A0T9T36'],
    [71, '25^T9^T37'],
    [60, '25^T9^T26'],
    [62, '42A19']
]

for length, matches in differences:
    mismatches = 0
    for char in matches:
        if char in ('A', 'T', 'G', 'C'):
            mismatches += 1
    print length, matches, mismatches
which produces the following output:
77 24A0T9T36 3
71 25^T9^T37 2
60 25^T9^T26 2
62 42A19 1
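
The same counting idea can also be wrapped in a small function. What follows is only a sketch, written for Python 3 (the print statement in the loop above is Python 2 syntax). The names count_mismatches and mismatch_ratio are made up for illustration, and the ratio itself (mismatches divided by length) is an assumption, since the original question asks for a "mismatch ratio" without defining it. The fifth row, 6^TTT1, comes from the desired output in the question.

```python
def count_mismatches(matches):
    """Count every A, C, G or T in a match string such as '24A0T9T36'.

    Digits and the '^' marker are ignored, so '^T' still counts once
    (for the T), which is what the desired output asks for.
    """
    return sum(1 for char in matches if char in "ACGT")


def mismatch_ratio(length, matches):
    """Assumed definition: number of mismatches divided by the length."""
    return count_mismatches(matches) / length


rows = [
    (77, "24A0T9T36"),
    (71, "25^T9^T37"),
    (60, "25^T9^T26"),
    (62, "42A19"),
    (10, "6^TTT1"),
]

for length, matches in rows:
    ratio = mismatch_ratio(length, matches)
    print(length, matches, count_mismatches(matches), round(ratio, 3))
```

If the ratio should be taken against something other than the Length column, only mismatch_ratio needs to change.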
--
Jerry