Real-world Python code 700 times slower than C

Kragen Sitaker kragen at pobox.com
Wed Jan 23 18:45:27 EST 2002


In an article no longer on my news server, Fernando Perez wrote:
    A quick rewrite in Numeric gives me about a 5x speedup, but
    there's still a nasty bottleneck: the malloc() call implicit in
    every call to RampNum:

    def RampNum(result, size, start, end):
        step = (end-start)/(size-1)
        result[:] = arange(size)*step + start

    There's no easy way to do (that I know of) the in-place operation
    in Numeric, a very annoying limitation. Numeric will always
    compute a new array on the right hand side, unfortunately (with
    the associated allocation).

Well, there are actually three allocations going on there:
- one for arange(size)
- one for the multiplication
- one for the addition

I think you can reduce this to one (the temporary from arange) with
the following untested code, which relies on the in-place operators
added in Python 2.0:
    result[:] = arange(size)
    result *= step
    result += start

You can also say:
    result[:] = arange(size)
    Numeric.multiply(result, step, result)
    Numeric.add(result, start, result)

All the binary ufuncs defined in Numeric have a three-argument form in
which the third argument specifies where to put the result.  This is
very helpful when you're trying to speed up the inner loops of Numeric
programs.
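
Here is a minimal sketch of the three-argument form, put together as a
whole function.  I'm assuming modern NumPy as a stand-in for Numeric
(its ufuncs accept an output array as the third argument in the same
way); the name ramp_num and the float() call are my own additions, not
from the original RampNum, and the sketch assumes a float array with at
least two elements.

```python
import numpy as np  # assumption: modern NumPy standing in for Numeric


def ramp_num(result, start, end):
    """Fill the float array `result` in place with a ramp from start to end."""
    size = len(result)
    # float() keeps the division from truncating when start and end are ints
    step = (end - start) / float(size - 1)
    result[:] = np.arange(size)        # the one remaining temporary array
    np.multiply(result, step, result)  # third argument: write the product here
    np.add(result, start, result)      # third argument: write the sum here
    return result


buf = np.empty(5, dtype=float)
ramp_num(buf, 0.0, 1.0)

# A ufunc given an output array returns that same array, not a copy.
same = np.multiply(buf, 1.0, buf)
```

The buffer can be allocated once outside the inner loop and handed to
ramp_num on every iteration, so only the arange temporary is created
per call.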



