[Tutor] number of mismatches in a string

Albert-Jan Roskam fomcl at yahoo.com
Fri Mar 2 22:14:26 CET 2012

```Hi,

Another source of inspiration could be the Levenshtein distance.

Regards,
Albert-Jan

> From: Hs Hs <ilhs_hs at yahoo.com>
>To: "tutor at python.org" <tutor at python.org>
>Sent: Friday, March 2, 2012 8:11 PM
>Subject: [Tutor] number of mismatches in a string
>Hi:
>I have the following table and I am interested in calculating the mismatch ratio. I am not completely clear on how to do this, and any help is deeply appreciated.
>Length  Matches
>77      24A0T9T36
>71      25^T9^T37
>60      25^T9^T26
>62      42A19
>In the length column I have the length of the character string.
>In the second column I have the matches against my reference string.
>In the first case, where 77 is the length, reading the matches from left to right: the first 24 matched my reference string, followed by an extra character A, a null (which does not count toward the problem), an extra T, 9 matches, an extra T and 36 matches. In total there are 3 mismatches.
>In case 2, I lost 2 characters (^ = loss of a character compared to the reference sentence):
>TOMISAGOODBOY
>T^MISAGOOD^OY   (here I lost 2 characters)  = I have 2 mismatches
>TOMISAGOOODBOOY (here I have 2 extra characters O and O) = I have two mismatches
>In case 4: I have 42 matches, extra A and 19 matches = so I have 1 mismatch
>How can I get that mismatch number from the matches string?
>1. I have to count how many A or T or G or C there are (believe me, only these 4 letters will appear in this; I will not see Z or B or K etc.)
>2. ^T or ^A or ^G or ^C will also be a mismatch
>desired output:
>Length  Matches    Mismatches
>77      24A0T9T36  3
>71      25^T9^T37  2
>60      25^T9^T26  2
>62      42A19      1
>10      6^TTT1     3
>thanks
>Hs.
```
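
A rough sketch of both ideas in this thread, in case it helps. It assumes that every A, C, G or T in the matches string counts as one mismatch, whether it is an extra character or a lost one after `^`; that reading reproduces all five rows of the desired output, including `6^TTT1` giving 3. The function names `count_mismatches` and `levenshtein` are made up for this example, and the Levenshtein function is the textbook dynamic-programming version, only useful if you have the two full strings rather than the matches column.

```python
import re


def count_mismatches(matches: str) -> int:
    """Count mismatches in a string such as '24A0T9T36' or '6^TTT1'.

    Assumption: each A, C, G or T is one mismatch, with or without a
    preceding '^'. Digits and '^' themselves are ignored.
    """
    return len(re.findall(r"[ACGT]", matches))


def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                      # drop char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + (char_a != char_b), # substitute or keep
            ))
        previous = current
    return previous[-1]


# The table from the original message.
rows = [
    (77, "24A0T9T36"),
    (71, "25^T9^T37"),
    (60, "25^T9^T26"),
    (62, "42A19"),
    (10, "6^TTT1"),
]

for length, matches in rows:
    print(length, matches, count_mismatches(matches))
```

If the mismatch ratio is meant to be mismatches divided by the length column (the original message does not spell this out), it would be `count_mismatches(matches) / length` for each row. For the TOMISAGOODBOY example, `levenshtein` compares the reference with each variant directly.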