Slow Python - what can be done?

Paul McGuire bogus at
Thu Mar 18 20:47:55 CET 2004

"Jason" <sewall93 at> wrote in message
news:480e9240.0403180943.64bee13d at
> Hey,
> I'm an experienced programmer but new to Python. I'm doing a simple
> implementation of a field morphing technique due to Beier and Neely
> (1992) and I have the simple case working in Python 2.3 - but it's
> REALLY slow.
>  <code snipped>
Here's a couple of thoughts:

- Replace the implementation of WarpLine.length() from:
    return sqrt(pow(self.point2.x-self.point1.x, 2) +
                pow(self.point2.y-self.point1.y, 2))
  to:
    return math.hypot( *(self.point2-self.point1).to_tuple() )
  (math.hypot takes two separate arguments, so the tuple is unpacked with *.)

  If length() is called repeatedly for a given WarpLine, then cache this
value in an instance variable, and reset it if transformPoint is ever called.
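A minimal sketch of the hypot-plus-cache idea, assuming stripped-down Point and WarpLine classes (the originals were snipped, so the constructor signatures and the _length attribute are hypothetical). It passes the float differences straight to math.hypot instead of going through to_tuple(), which is noted further down to return ints:

```python
import math

class Point(object):
    def __init__(self, x, y):
        self.x = float(x)
        self.y = float(y)

class WarpLine(object):
    def __init__(self, point1, point2):
        self.point1 = point1
        self.point2 = point2
        self._length = None   # hypothetical cache slot, filled on first use

    def length(self):
        # Compute once, then reuse the stored value on later calls.
        if self._length is None:
            self._length = math.hypot(self.point2.x - self.point1.x,
                                      self.point2.y - self.point1.y)
        return self._length

line = WarpLine(Point(0, 0), Point(3, 4))
first = line.length()
second = line.length()   # served from the cache
```

If the end points could ever change, the cache would have to be reset to None at that point.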

- Fire up the hotshot profiler.  It will tell you which functions really
need tuning.  Even though you've identified the "slow part", there are
several function calls made from this segment, any of which could be the
real hot spot.
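hotshot shipped with Python 2 and was removed in Python 3. On a current interpreter the standard-library cProfile and pstats modules give the same kind of per-function report. A minimal sketch, where slow_part and helper are hypothetical stand-ins for the real warp loop and the functions it calls:

```python
import cProfile
import io
import pstats

def helper(i):
    return i * i

def slow_part(n):
    total = 0
    for i in range(n):
        total += helper(i)
    return total

# Profile only the region of interest.
profiler = cProfile.Profile()
profiler.enable()
result = slow_part(10000)
profiler.disable()

# Collect the report in a string, sorted by cumulative time per function.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

The call counts in the report show at a glance which helper is invoked most often from the slow segment.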

- Look into using psyco.  It is *very* non-intrusive, and you may find you
can get some major speed-up.  The simplest way to get started, after
installing psyco is, at the top of your module, add:
    import psyco
    psyco.full()

  psyco can get in the way of hotshot line profiling, but you will still get
accurate counts and timing at the function-call level.

- Your Point class does a lot of type checking when multiplying or dividing
by a constant.  Is this worth the penalty you are paying every time these
functions get called?  Oh, wait, I see that __mul__ may be multiplying by a
constant, or doing a dot product.  If hotshot tells you this is a choke
point, you might rethink using this one method for both functions.  Or at
least, reduce your type() calls by changing from:
        if type(other) == float or type(other) == int:
            return Point(self.x*other, self.y*other)
        else:
            return self.x*other.x + self.y*other.y
  to:
        if isinstance(other, Point):
            return self.x*other.x + self.y*other.y
        else:
            return Point(self.x*other, self.y*other)
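One way to sketch the "rethink using this one method for both functions" option, assuming a minimal Point (the original class was snipped): __mul__ handles only scaling, and a separately named dot method, hypothetical here, handles the dot product, so neither path inspects types at all:

```python
class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __mul__(self, scalar):
        # Scaling only: no type inspection needed.
        return Point(self.x * scalar, self.y * scalar)

    def dot(self, other):
        # Dot product lives under its own (hypothetical) name.
        return self.x * other.x + self.y * other.y

p = Point(1.0, 2.0)
scaled = p * 3
product = p.dot(Point(4.0, 5.0))
```

The cost is that every call site that used * for a dot product has to change, so this is only worth it if the profiler points at __mul__.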

- If you are forced to compare types instead of using isinstance, I think
'is' would be a bit better than '=='.

- Why does Point.to_tuple() return ints, although you use floats
throughout?  Since it looks like you have set up Point to be immutable, can
you keep both a float and an int member value for x and y, so you don't have
to do extra conversions?  Also, since WarpLine also appears to be immutable,
you should be able to cache diff and diff*diff (maybe call it diff2 or
diffSquared) - either initialize in the __init__ method, or compute lazily
when needed.
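A sketch of the eager variant, computing the values in __init__. It assumes that diff means point2 - point1 and that diff*diff is the dot product of diff with itself; both are guesses about the snipped code, and the classes here are cut down to the minimum:

```python
class Point(object):
    def __init__(self, x, y):
        self.x = float(x)
        self.y = float(y)

    def __sub__(self, other):
        return Point(self.x - other.x, self.y - other.y)

class WarpLine(object):
    def __init__(self, point1, point2):
        self.point1 = point1
        self.point2 = point2
        # Computed once here instead of on every transformPoint call.
        self.diff = point2 - point1
        # diff dotted with itself, i.e. the squared length of the line.
        self.diff2 = self.diff.x ** 2 + self.diff.y ** 2

line = WarpLine(Point(1, 1), Point(4, 5))
```

This only stays correct as long as point1 and point2 are never reassigned after construction.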

- Experience in other languages sometimes gets in your way when using
Python.  Not sure if this has any performance impact, but
Picture.in_bounds() can be made more Pythonic, as (also, double check use of
'<' vs '<='):

    def in_bounds(self,pt):
         return (0 <= pt.x <=[0] and 0 <= pt.y <=[1])
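A self-contained sketch of the chained comparison, assuming the picture keeps its dimensions as a (width, height) tuple under a hypothetical attribute named size and that pixels are indexed from 0. Under that assumption the upper comparisons want '<' rather than '<=':

```python
class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

class Picture(object):
    def __init__(self, width, height):
        self.size = (width, height)  # hypothetical attribute name

    def in_bounds(self, pt):
        width, height = self.size
        # Upper bound is exclusive because valid indices run 0 .. size-1.
        return 0 <= pt.x < width and 0 <= pt.y < height

pic = Picture(4, 3)
inside = pic.in_bounds(Point(3, 2))
outside = pic.in_bounds(Point(4, 0))
```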

- Another thing to get used to in Python that is unlike other languages is
the return of ordered pairs of values.  Look at this code:
                xy = line1.transformPoint(line2, Point(x,y)).to_tuple()

                if self.in_bounds(Point(xy[0], xy[1])):
                    dest[x + y*[0]] = src[xy[0] +
                else:
                    dest[x + y*[0]] = 0

You return a tuple, let's call it (transx, transy), representing the
transformed point, then create a new Point using the separate elements to
test for in_bounds-ness, and then access the individual pieces using [0] and
[1] accesses.  Function calls are performance killers in Python, so avoid
them where possible.  Try this instead:
                transPt = line1.transformPoint(line2, Point(x,y))
                transx, transy = transPt.to_tuple()
                # or even: transx, transy = transPt.x, transPt.y

                if self.in_bounds(transPt):
                    dest[x + y*[0]] = src[transx +
                else:
                    dest[x + y*[0]] = 0
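Tuple unpacking in isolation, using a hypothetical transform_point that returns a plain pair rather than a Point:

```python
def transform_point(x, y):
    # Hypothetical stand-in for transformPoint: shift by (2, 1).
    return x + 2, y + 1

# Unpack both values in one step; no [0]/[1] indexing afterwards.
transx, transy = transform_point(5, 7)
```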

- You also reference[0] *many* times, even though this is
invariant across all your nested loops.  Store this in a local variable
before you start your outer for loop, call it something like "xsize".  Then
replace every use of "[0]" with "xsize".  As an added bonus,
your code will start getting more readable.
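The hoisting idea can be sketched on its own, assuming the image is a flat row-major list and the dimensions arrive as a (width, height) tuple. The function name copy_pixels and its arguments are hypothetical:

```python
def copy_pixels(src, size):
    xsize, ysize = size          # looked up once, outside the loops
    dest = [0] * (xsize * ysize)
    for y in range(ysize):
        row = y * xsize          # also invariant across the inner loop
        for x in range(xsize):
            dest[x + row] = src[x + row]
    return dest

pixels = list(range(6))
copied = copy_pixels(pixels, (3, 2))
```

Local variable lookups are the cheapest kind in CPython, so moving an attribute-plus-index expression out of a doubly nested loop usually pays off.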

- Your arithmetic operators can generate *lots* of temporary Point objects.
You can streamline some of this process by defining a __slots__ attribute at
the class level, listing the field names "x" and "y".  This will keep Python
from allocating a dictionary for all possible attribute names and values for
every Point object created.  Likewise for WarpLine.
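A minimal sketch of a slotted Point, cut down to the two fields. The try block only demonstrates that attributes outside __slots__ are refused:

```python
class Point(object):
    __slots__ = ("x", "y")   # no per-instance __dict__ is allocated

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1.0, 2.0)
has_dict = hasattr(p, "__dict__")

try:
    p.z = 3.0                # not listed in __slots__
    rejected = False
except AttributeError:
    rejected = True
```

Any cached values, such as a stored length on WarpLine, need their own entries in __slots__ too.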

-- Paul
