how to speedup this code?

Alexander Schmolck a.schmolck at gmx.net
Fri Jan 9 12:41:29 EST 2004


Ognen Duzlevski <maketo at gronland.freeshell.org> writes:

> Hi all,
> 
> I have rewritten a C program to solve a bioinformatics problem. Portion where most of the time is spent is:
> 
> def DynAlign(scoremat,insertmat,delmat,tseq,qseq,tlen,qlen):
>     global ONEINDELPENALTY,OPENGAPPENALTY
> 
>     for ndx1 in range(1,tlen+1):
>         for ndx2 in range(1,qlen+1):
>             delmat[ndx1][ndx2] = Min(delmat  [ndx1-1][ndx2]+ONEINDELPENALTY, \
>                                  Min(scoremat[ndx1-1][ndx2]+OPENGAPPENALTY+ONEINDELPENALTY, \
>                                     insertmat[ndx1-1][ndx2]+OPENGAPPENALTY+ONEINDELPENALTY))
>             insertmat[ndx1][ndx2] = Min(insertmat[ndx1][ndx2-1]+ONEINDELPENALTY, \
>                             Min(scoremat[ndx1][ndx2-1]+OPENGAPPENALTY+ONEINDELPENALTY, \
>                             delmat[ndx1][ndx2-1]+OPENGAPPENALTY+ONEINDELPENALTY))
>             scoremat[ndx1][ndx2] = Min(scoremat[ndx1-1][ndx2-1], \
>                             Min(delmat[ndx1-1][ndx2-1], insertmat[ndx1-1][ndx2-1])) + \
>                             GetFitnessScore(tseq,ndx1-1,qseq,ndx2-1)

You're obviously quite new to python and Numeric (also have a look at
numarray, BTW, which is meant to replace Numeric in the not so far future), so
you're still thinking in terms of C.

In python+Numeric you can often replace C-style for loops with a much more
concise and elegant array manipulation (which will be very efficient, unlike
for loops in python -- unless you are using psyco, that is).

I think you're particular example will fit into this pattern (warning: the
code below is just toget you started -- it might be quite wrong, I haven't
looked too carefullly at your code and just jotted this down quickly):

    from Numeric import minimum, array, ...
    delmat = array(
    [...]
    def ...

        # formatted a bit funny for better visual clarity
        delmat[1:tlen+1,1:qlen+1] = \
            minimum(   delmat[1:tlen,1:qlen+1]+ ONEINDELPENALTY,
            minimum( scoremat[1:tlen,1:qlen+1]+ OPENGAPPENALTY, 
                    insertmat[1:tlen,1:qlen+1]+ OPENGAPPENALTY + ONEINDELPENALTY))
        [etc.]

Be sure you understand the indexing (NB. array[a][b] vs. array[a,b]), the
Numeric manual is quite good, so have a look at it (reading some code in scipy
or some other project that uses Numeric might also be a good way to speed up
the learning process).


HTH

'as

p.s. Let me also recommend that you don't emulate the C style
compile-run-debug-edit approach in python -- I think you'll find that a
test-driven development + an interactive shell to try things out (have a look
at ipython, BTW) make for much more pleasant scientific computing.



More information about the Python-list mailing list