[Numpy-discussion] Sort performance with structured array

Tom Aldcroft aldcroft at head.cfa.harvard.edu
Sun Apr 7 19:23:55 EDT 2013


I'm seeing about a factor of 50 difference in performance between
sorting a random integer array versus sorting that same array viewed
as a structured array.  Am I doing anything wrong here?

In [2]: x = np.random.randint(10000, size=10000)

In [3]: xarr = x.view(dtype=[('a', x.dtype)])

In [4]: timeit np.sort(x)
1000 loops, best of 3: 588 us per loop

In [5]: timeit np.sort(xarr)
10 loops, best of 3: 29 ms per loop

In [6]: timeit np.sort(xarr, order=('a',))
10 loops, best of 3: 28.9 ms per loop
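
For anyone who wants to reproduce the comparison outside IPython, the session above can be sketched as a standalone script. The helper name best_time and the repeat counts are illustrative choices, not part of the original session, and the field dtype is taken from x.dtype so that the view keeps the same item size. The timings will depend on the machine and the NumPy version.

```python
import timeit

import numpy as np

x = np.random.randint(10000, size=10000)
# View the same buffer as a one-field structured array.
# Using x.dtype for the field keeps the item size identical to x.
xarr = x.view(dtype=[('a', x.dtype)])


def best_time(func, repeat=3, number=5):
    """Return the best per-call time in seconds (hypothetical helper)."""
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number


t_plain = best_time(lambda: np.sort(x))
t_struct = best_time(lambda: np.sort(xarr))
t_order = best_time(lambda: np.sort(xarr, order=('a',)))

print("plain int array:        %.1f us" % (t_plain * 1e6))
print("structured array:       %.1f us" % (t_struct * 1e6))
print("structured, order='a':  %.1f us" % (t_order * 1e6))
```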

I was wondering if this slowdown is expected (perhaps the comparison is
falling back to pure Python?).  I'm showing a simple example
here, but in reality I'm working with non-trivial structured arrays
where I might want to sort on multiple columns.

Does anyone have suggestions for speeding things up, or have a sort
implementation (perhaps Cython) that has better performance for
structured arrays?
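
One possible workaround for the multi-column case, offered only as a sketch: build an index with np.lexsort over the individual fields and use it to reorder the structured array, so that each key is sorted as a plain array. The function name sort_by_fields and the sample fields 'a' and 'b' are hypothetical, and no timings have been measured here, so whether this is faster would need to be checked on the real data.

```python
import numpy as np


def sort_by_fields(arr, keys):
    """Return a copy of structured array `arr` sorted by the fields in `keys`.

    `keys` is listed from most to least significant, like the `order`
    argument of np.sort. (Hypothetical helper, for illustration only.)
    """
    # np.lexsort uses the LAST key as the primary key, so reverse the list.
    index = np.lexsort([arr[k] for k in reversed(keys)])
    return arr[index]


# Made-up sample data with two columns.
rng = np.random.RandomState(0)
data = np.zeros(1000, dtype=[('a', np.int64), ('b', np.float64)])
data['a'] = rng.randint(10, size=1000)
data['b'] = rng.rand(1000)

result = sort_by_fields(data, ['a', 'b'])
```

The result should match np.sort(data, order=['a', 'b']) whenever the key columns contain no NaN values; the order of rows with fully identical keys is not something this sketch tries to control.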

Thanks,
Tom


