
On Sun, May 16, 2010 at 1:18 PM, Eric Firing <efiring@hawaii.edu> wrote:
On 05/16/2010 09:24 AM, Keith Goodman wrote:
On Sun, May 16, 2010 at 12:14 PM, Davide Lasagna <lasagnadavide@gmail.com> wrote:
Hi all,

What is the fastest and lowest-memory way to compute this?

y = np.arange(2**24)
bases = y[1:] + y[:-1]

Actually it is already quite fast, but I'm not sure whether the summation allocates temporary memory. Any help is appreciated.
Is it OK to modify y? If so:
y = np.arange(2**24)
z = y[1:] + y[:-1]   # <-- Slow way
y[:-1] += y[1:]      # <-- Fast way
(y[:-1] == z).all()  # True
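For what it's worth, here is a minimal sketch of how one could check the memory question directly with the stdlib tracemalloc module, assuming a NumPy recent enough to report its array allocations to tracemalloc (1.13+); the slow way should peak at roughly two full-size arrays, the in-place way at roughly one:

```python
import tracemalloc
import numpy as np

def peak_mib(fn):
    """Run fn() and return the peak traced allocation in MiB."""
    tracemalloc.start()
    fn()
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak / 2**20

n = 2**24

def out_of_place():
    y = np.arange(n)
    z = y[1:] + y[:-1]   # allocates a second full-size result array

def in_place():
    y = np.arange(n)
    y[:-1] += y[1:]      # writes into y's own buffer; no second large array

print(peak_mib(out_of_place))  # roughly 2x the array size
print(peak_mib(in_place))      # roughly 1x the array size
```

Note the in-place version overwrites y, so it is only usable when the original array is no longer needed.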
It's not faster on my machine, as timed with ipython:
In [8]: y = np.arange(2**24)

In [9]: b = np.array([1,1], dtype=int)

In [10]: timeit np.convolve(y, b, 'valid')
1 loops, best of 3: 484 ms per loop

In [11]: timeit y[1:] + y[:-1]
10 loops, best of 3: 181 ms per loop

In [12]: timeit y[:-1] += y[1:]
10 loops, best of 3: 183 ms per loop
If we include the fake data generation in the timing, to reduce cache bias in the repeated runs, the += method is noticeably slower.
In [13]: timeit y = np.arange(2**24); z = y[1:] + y[:-1]
1 loops, best of 3: 297 ms per loop

In [14]: timeit y = np.arange(2**24); y[:-1] += y[1:]; z = y[:-1]
1 loops, best of 3: 322 ms per loop
That's interesting. On my computer it is faster:
timeit y = np.arange(2**24); z = y[1:] + y[:-1]
10 loops, best of 3: 144 ms per loop

timeit y = np.arange(2**24); y[:-1] += y[1:]; z = y[:-1]
10 loops, best of 3: 114 ms per loop
What accounts for the performance difference? Cache size? I assume the in-place version uses less memory. It would be neat if timeit reported memory usage. I haven't tried numexpr; that might be worth trying too.
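On the numexpr suggestion, a quick sketch of what trying it might look like; this assumes the numexpr package is installed and falls back to plain NumPy if it isn't:

```python
import numpy as np

y = np.arange(2**24)
a, b = y[1:], y[:-1]        # the two overlapping views to add

try:
    import numexpr as ne
    # numexpr compiles the expression string and evaluates it in
    # cache-sized blocks, which can cut down on temporary-array traffic
    result = ne.evaluate("a + b")
except ImportError:
    # numexpr not installed; plain NumPy fallback
    result = a + b

# same answer either way
assert (result == y[1:] + y[:-1]).all()
```

Whether it wins here would have to be measured; a single elementwise add is memory-bound, so the benefit (if any) comes from blocking and threading rather than reduced arithmetic.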