[Numpy-discussion] Rolling window (moving average, moving std, and more)

Keith Goodman kwgoodman at gmail.com
Tue Jan 4 11:14:58 EST 2011


On Tue, Jan 4, 2011 at 8:06 AM, Sebastian Haase <seb.haase at gmail.com> wrote:
> On Mon, Jan 3, 2011 at 5:32 PM, Erik Rigtorp <erik at rigtorp.com> wrote:
>> On Mon, Jan 3, 2011 at 11:26, Eric Firing <efiring at hawaii.edu> wrote:
>>> Instead of calculating statistics independently each time the window is
>>> advanced one data point, the statistics are updated.  I have not done
>>> any benchmarking, but I expect this approach to be quick.
>>
>> This might accumulate numerical errors, but it could be fine for many applications.
>>
>>> The code is old; I have not tried to update it to take advantage of
>>> cython's advances over pyrex.  If I were writing it now, I might not
>>> bother with the C level at all; it could all be done in cython, probably
>>> with no speed penalty, and maybe even with reduced overhead.
>>>
>>
>> No doubt this would be faster; I just wanted to offer a general way to
>> do this in NumPy.
>
> BTW, some of these operations can be done using scipy's ndimage, right?
> Any comments? How does the performance compare?
> ndimage might have more options regarding edge handling, or not?

Take a look at the moving window functions in the development version
of the la package:

https://github.com/kwgoodman/la/blob/master/la/farray/mov.py

Many of the moving window functions offer three calculation methods:
filter (ndimage), strides (the strides trick discussed in this
thread), and loop (a simple python loop).

For example:

>> a = np.random.rand(500,2000)
>> timeit la.farray.mov_max(a, window=252, axis=-1, method='filter')
1 loops, best of 3: 336 ms per loop
>> timeit la.farray.mov_max(a, window=252, axis=-1, method='strides')
1 loops, best of 3: 609 ms per loop
>> timeit la.farray.mov_max(a, window=252, axis=-1, method='loop')
1 loops, best of 3: 638 ms per loop
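
To make the difference between the 'strides' and 'loop' methods concrete,
here is a minimal NumPy-only sketch of a moving max along the last axis.
It is not the code in la: the function names mov_max_loop and
mov_max_strides are made up for illustration, and filling the first
window-1 outputs with NaN is an assumption of this sketch.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided


def mov_max_loop(a, window):
    """Moving max along the last axis using a plain Python loop."""
    a = np.asarray(a, dtype=float)
    out = np.full(a.shape, np.nan)  # assumption: incomplete windows give NaN
    for i in range(window - 1, a.shape[-1]):
        # Trailing window covering elements i-window+1 .. i
        out[..., i] = a[..., i - window + 1:i + 1].max(axis=-1)
    return out


def mov_max_strides(a, window):
    """Moving max along the last axis using the strides trick."""
    a = np.ascontiguousarray(a, dtype=float)
    n = a.shape[-1]
    # Add a trailing axis of length `window` that steps through the data
    # with the same stride as the last axis; no data is copied.
    shape = a.shape[:-1] + (n - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    windows = as_strided(a, shape=shape, strides=strides)
    out = np.full(a.shape, np.nan)
    out[..., window - 1:] = windows.max(axis=-1)
    return out
```

Both functions should give the same answer on any input with at least
`window` elements along the last axis; only the speed and memory access
patterns differ.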

No one method is best for all situations. That is one of the reasons I
started the Bottleneck package. I figured Cython could beat them all.
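
The update approach Eric describes in the quoted text above, and Erik's
concern about accumulated rounding error, can be sketched for a 1d moving
mean as follows. This is a pure Python illustration under my own
assumptions, not Eric's code and not what Bottleneck does: the names
mov_mean_update and mov_mean_direct are hypothetical, and incomplete
windows are filled with NaN.

```python
import numpy as np


def mov_mean_update(a, window):
    """1d moving mean that updates a running sum as the window advances."""
    a = np.asarray(a, dtype=float)
    if a.ndim != 1 or not 1 <= window <= a.shape[0]:
        raise ValueError("need 1d input and 1 <= window <= len(a)")
    out = np.full(a.shape, np.nan)  # assumption: incomplete windows give NaN
    running = a[:window].sum()
    out[window - 1] = running / window
    for i in range(window, a.shape[0]):
        # Add the element entering the window, drop the one leaving it.
        # Rounding errors in `running` can build up over a long series.
        running += a[i] - a[i - window]
        out[i] = running / window
    return out


def mov_mean_direct(a, window):
    """1d moving mean recomputed from scratch for every window."""
    a = np.asarray(a, dtype=float)
    out = np.full(a.shape, np.nan)
    for i in range(window - 1, a.shape[0]):
        out[i] = a[i - window + 1:i + 1].mean()
    return out
```

The update version does a constant amount of work per element regardless
of window size, while the direct version does work proportional to the
window. Comparing the two outputs on your own data is one way to see
whether the accumulated error matters for your application.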
