[Numpy-discussion] Comparing NumPy/IDL Performance

Zachary Pincus zachary.pincus at yale.edu
Mon Sep 26 11:24:01 EDT 2011

Hello Keith,

While I also echo Johann's points about the arbitrariness and non-utility of benchmarking I'll briefly comment just on just a few tests to help out with getting things into idiomatic python/numpy:

Tests 1 and 2 are fairly pointless (empty for loop and empty procedure) that won't actually influence the running time of well-written non-pathological code.

Test 3: 
    #Test 3 - Add 200000 scalar ints
    nrep = 2000000 * scale_factor
    for i in range(nrep):
        a = i + 1

well, python looping is slow... one doesn't do such loops in idiomatic code if the underlying intent can be re-cast into array operations in numpy. But here the test is on such a simple operation that it's not clear how to recast in a way that would remain reasonable. Ideally you'd test something like:
i = numpy.arange(200000)
for j in range(scale_factor):
  a = i + 1

but that sort of changes what the test is testing.

Finally, test 21:
    #Test 21 - Smooth 512 by 512 byte array, 5x5 boxcar
    for i in range(nrep):
        b = scipy.ndimage.filters.median_filter(a, size=(5, 5))
    timer.log('Smooth 512 by 512 byte array, 5x5 boxcar, %d times' % nrep)

A median filter is definitely NOT a boxcar filter! You want "uniform_filter":

In [4]: a = numpy.empty((1000,1000))

In [5]: timeit scipy.ndimage.filters.median_filter(a, size=(5, 5))
10 loops, best of 3: 93.2 ms per loop

In [6]: timeit scipy.ndimage.filters.uniform_filter(a, size=(5, 5))
10 loops, best of 3: 27.7 ms per loop


On Sep 26, 2011, at 10:19 AM, Keith Hughitt wrote:

> Hi all,
> Myself and several colleagues have recently started work on a Python library for solar physics, in order to provide an alternative to the current mainstay for solar physics, which is written in IDL.
> One of the first steps we have taken is to create a Python port of a popular benchmark for IDL (time_test3) which measures performance for a variety of (primarily matrix) operations. In our initial attempt, however, Python performs significantly poorer than IDL for several of the tests. I have attached a graph which shows the results for one machine: the x-axis is the test # being compared, and the y-axis is the time it took to complete the test, in milliseconds. While it is possible that this is simply due to limitations in Python/Numpy, I suspect that this is due at least in part to our lack in familiarity with NumPy and SciPy.
> So my question is, does anyone see any places where we are doing things very inefficiently in Python?
> In order to try and ensure a fair comparison between IDL and Python there are some things (e.g. the style of timing and output) which we have deliberately chosen to do a certain way. In other cases, however, it is likely that we just didn't know a better method.
> Any feedback or suggestions people have would be greatly appreciated. Unfortunately, due to the proprietary nature of IDL, we cannot share the original version of time_test3, but hopefully the comments in time_test3.py will be clear enough.
> Thanks!
> Keith
> <sunpy_time_test3_idl_python_2011-09-26.png>_______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

More information about the NumPy-Discussion mailing list