[Numpy-discussion] odd performance of sum?

Charles R Harris charlesr.harris at gmail.com
Thu Feb 10 17:10:41 EST 2011


On Thu, Feb 10, 2011 at 2:26 PM, Mark Wiebe <mwwiebe at gmail.com> wrote:

> On Thu, Feb 10, 2011 at 10:31 AM, Pauli Virtanen <pav at iki.fi> wrote:
>
>> Thu, 10 Feb 2011 12:16:12 -0600, Robert Kern wrote:
>> [clip]
>> > One thing that might be worthwhile is to make
>> > implementations of sum() and cumsum() that avoid the ufunc machinery and
>> > do their iterations more quickly, at least for some common combinations
>> > of dtype and contiguity.
>>
>> I wonder what the balance is between the iterator overhead and the time
>> taken in the reduction inner loop. This should be straightforward to
>> benchmark.
>>
>> Apparently, some overhead decreased with the new iterators, since current
>> Numpy master outperforms 1.5.1 by a factor of about 1.7 for this benchmark:
>>
>> In [8]: %timeit M.sum(1)     # Numpy 1.5.1
>> 10 loops, best of 3: 85 ms per loop
>>
>> In [8]: %timeit M.sum(1)     # Numpy master
>> 10 loops, best of 3: 49.5 ms per loop
>>
>> I don't think this is explainable by the new memory layout optimizations,
>> since M is C-contiguous.
>>
>> Perhaps there would be room for more optimization, even within the ufunc
>> framework?
>>
>
> I played around with this in einsum, where it's a bit easier to specialize
> this case than in the ufunc machinery. What I found made the biggest
> difference is to use SSE prefetching instructions to prepare the cache in
> advance. Here are the kinds of numbers I get, all from the current Numpy
> master:
>
> In [7]: timeit M.sum(1)
> 10 loops, best of 3: 44.6 ms per loop
>
> In [8]: timeit dot(M, o)
> 10 loops, best of 3: 36.8 ms per loop
>
> In [9]: timeit einsum('ij->i', M)
> 10 loops, best of 3: 32.1 ms per loop
> ...
>

I get an even bigger speedup:

In [5]: timeit M.sum(1)
10 loops, best of 3: 19.2 ms per loop

In [6]: timeit dot(M, o)
100 loops, best of 3: 15.2 ms per loop

In [7]: timeit einsum('ij->i', M)
100 loops, best of 3: 11.4 ms per loop
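
For anyone who wants to repeat the comparison outside IPython, here is a minimal sketch of the three row-sum variants timed above. The shape and dtype of M are not stated in this message, so the 1000x1000 float64 array is an assumption, as is taking o to be a vector of ones; absolute timings will differ by machine and NumPy version.

```python
import timeit

import numpy as np

# Assumed setup: the real shape and dtype of M are not given in the thread.
rng = np.random.default_rng(0)
M = rng.random((1000, 1000))   # C-contiguous float64 array
o = np.ones(M.shape[1])        # assumed: ones vector, so dot(M, o) sums each row

candidates = {
    "M.sum(1)": lambda: M.sum(1),
    "dot(M, o)": lambda: np.dot(M, o),
    "einsum('ij->i', M)": lambda: np.einsum("ij->i", M),
}

# All three should agree up to floating-point rounding.
reference = M.sum(1)
for name, func in candidates.items():
    assert np.allclose(func(), reference), name

# Best of 3 repeats of 10 calls each, reported per call.
timings = {}
for name, func in candidates.items():
    best = min(timeit.repeat(func, repeat=3, number=10)) / 10
    timings[name] = best
    print(f"{name:20s} {best * 1e3:8.3f} ms per loop")
```

The correctness check runs before the timing loop so that a variant computing something different is caught first.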

<snip>

Chuck