[Tutor] Need advise on protein sequence alignment code in python 2.7 giving wrong output

Laura Creighton lac at openend.se
Tue Aug 11 20:54:05 CEST 2015


You posted incomplete code -- at any rate I cannot get it to work.

However, I think the problem is here:

  for i in xrange(len(v)):
       for j in xrange(len(w)):
               if max(S[i][j-1],S[i-1][j],S[i-1][j-1]) == S[i-1][j]:

When j is 0, j-1 refers to the end of the list, which makes no sense.

I think you want:

for i in xrange(1, len(v)+1):
     for j in xrange(1, len(w)+1):
             if max(S[i][j-1],S[i-1][j],S[i-1][j-1]) == S[i-1][j]:

But I could be wrong about this, since I could never get the code to work.

The other question is, should you be writing your own code to do this
at all?

In trying to figure out what it was you want to do, I downloaded
https://pypi.python.org/pypi/nwalign

and I got:

>>> nwalign.global_align("PLEASANTLY", "MEANLY", gap_open=-5, gap_extend=-5, ma
trix='./BLOSUM62.txt')
('PLEASANTLY', '-MEA--N-LY')

Which seems to be the answer you are looking for.

The original code for this is at
https://bitbucket.org/brentp/biostuff/src/282b504ac9020fe1449e23f800b20b5bd7d12061/nwalign/pairwise.py?at=default .

though it writes to Cython for reasons of speed.

The bitbucket code says:
    for i in range(1, max_i + 1):
            ci = seqi[i - 1]
	    for j in range(1, max_j + 1):
	    	cj = seqj[j - 1]

Which is why I think that what I wrote will fix your code, but again, I
never got it to work so all of this is untested.

Laura




More information about the Tutor mailing list