[Numpy-discussion] Question about broadcasting vs for loop performance

Sun Sep 14 22:22:41 EDT 2014

Hello all,

I have a question about the performance of broadcasting versus Python for
loops. I have the following sample code that approximates some simulation
I'd like to do:

## Test Code ##

import numpy as np

def lorentz(x, pos, inten, hwhm):
    return inten * (hwhm**2 / ((x - pos)**2 + hwhm**2))

poss = np.random.rand(100)
intens = np.random.rand(100)
xs = np.linspace(0, 10, 10000)

def first_try():
    sim_inten = np.zeros(xs.shape)
    for freq, inten in zip(poss, intens):
        sim_inten += lorentz(xs, freq, inten, 5.0)
    return sim_inten

def second_try():
    sim_inten2 = lorentz(xs.reshape((-1, 1)), poss, intens, 5.0)
    sim_inten2 = sim_inten2.sum(axis=1)
    return sim_inten2

print(np.array_equal(first_try(), second_try()))

## End Test ##

Running this script prints "True" for the final equality test. However,
IPython's %timeit magic gives ~10 ms for first_try and ~30 ms for
second_try. I tried this on Windows 7 (Anaconda Python) and on a Linux
machine, both with Python 2.7 and NumPy 1.8.2.
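
For anyone who wants to reproduce the comparison outside IPython, here is a
rough sketch using the standard-library timeit module. The helper name
best_ms, the fixed random seed, and the repeat counts are my own choices for
illustration and are not part of the original script; the numbers will vary
by machine.

```python
import timeit
import numpy as np

def lorentz(x, pos, inten, hwhm):
    return inten * (hwhm**2 / ((x - pos)**2 + hwhm**2))

# Fixed seed is an assumption, used only to make runs repeatable.
rng = np.random.RandomState(0)
poss = rng.rand(100)
intens = rng.rand(100)
xs = np.linspace(0, 10, 10000)

def first_try():
    # Python loop: one length-10000 array per peak, accumulated in place.
    sim_inten = np.zeros(xs.shape)
    for freq, inten in zip(poss, intens):
        sim_inten += lorentz(xs, freq, inten, 5.0)
    return sim_inten

def second_try():
    # Broadcasting: a (10000, 100) array, then a sum over the peak axis.
    return lorentz(xs.reshape((-1, 1)), poss, intens, 5.0).sum(axis=1)

def best_ms(func, number=10, repeat=3):
    # Hypothetical helper: best-of-`repeat` time per call, in milliseconds.
    times = timeit.repeat(func, number=number, repeat=repeat)
    return 1000.0 * min(times) / number

loop_ms = best_ms(first_try)
bcast_ms = best_ms(second_try)
print("first_try:  %.2f ms per call" % loop_ms)
print("second_try: %.2f ms per call" % bcast_ms)
```

Taking the minimum over several repeats follows the usual timeit advice,
since the slower runs mostly reflect other activity on the machine.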

I understand in principle why broadcasting should be faster than Python
loops, but I'm wondering why I'm getting worse results with the pure NumPy
function. Are there any general rules for when broadcasting might give
worse performance than a Python loop?
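
One difference between the two versions that may be relevant, though I have
not confirmed that it explains the timings, is the size of the temporary
arrays each one creates. The sketch below only inspects shapes and byte
counts for the (x - pos)**2 step; the variable names loop_temp and
bcast_temp are made up for illustration.

```python
import numpy as np

xs = np.linspace(0, 10, 10000)
poss = np.random.rand(100)

# Loop version: each iteration works on one peak position at a time,
# so every temporary has the same shape as xs.
loop_temp = (xs - poss[0])**2

# Broadcast version: a (10000, 1) column against a (100,) row gives a
# full (10000, 100) temporary for every intermediate step.
bcast_temp = (xs.reshape((-1, 1)) - poss)**2

print("loop temporary:      shape %s, %d bytes" % (loop_temp.shape, loop_temp.nbytes))
print("broadcast temporary: shape %s, %d bytes" % (bcast_temp.shape, bcast_temp.nbytes))
```

With float64 data the per-iteration temporary is 80,000 bytes, while the
broadcast temporary is 8,000,000 bytes.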

Thanks

Ryan