[Numpy-discussion] moving window product

Mon Mar 21 14:27:41 EDT 2011

On Mon, Mar 21, 2011 at 11:57 AM, Keith Goodman <kwgoodman at gmail.com> wrote:
> On Mon, Mar 21, 2011 at 10:34 AM, Brent Pedersen <bpederse at gmail.com> wrote:
>> On Mon, Mar 21, 2011 at 11:19 AM, Keith Goodman <kwgoodman at gmail.com> wrote:
>>> On Mon, Mar 21, 2011 at 10:10 AM, Brent Pedersen <bpederse at gmail.com> wrote:
>>>> hi, is there a way to take the product along a 1-d array in a moving
>>>> window? -- similar to convolve, with product in place of sum?
>>>> currently, i'm column_stacking the array with offsets of itself into
>>>> window_size columns and then taking the product at axis 1.
>>>> like::
>>>>
>>>>  w = np.column_stack(a[i:-window_size+i] for i in range(0, window_size))
>>>>  window_product = np.product(w, axis=1)
>>>>
>>>> but then there are the edge effects/array size issues--like those
>>>> handled in np.convolve.
>>>> it there something in numpy/scipy that addresses this. or that does
>>>> the column_stacking with an offset?
>>>
>>> The Bottleneck package has a fast moving window sum (bn.move_sum and
>>> bn.move_nansum). You could use that along with
>>>
>>>>> a = np.random.rand(5)
>>>>> a.prod()
>>>   0.015877866878931741
>>>>> np.exp(np.log(a).sum())
>>>   0.015877866878931751
>>>
>>> Or you could use strides or scipy.ndimage as in
>>> https://github.com/kwgoodman/bottleneck/blob/master/bottleneck/slow/move.py
>>>
>>
>> ah yes, of course. thank you.
>>
>> def moving_product(a, window_size, mode="same"):
>>    return np.exp(np.convolve(np.log(a), np.ones(window_size), mode))
>>
>> i'll have a closer look at your strided version in bottleneck as well.
>
> I don't know what size problem you are working on or if speed is an
> issue, but here are some timings:
>
>>> a = np.random.rand(1000000)
>>> window_size = 1000
>>> timeit np.exp(np.convolve(np.log(a), np.ones(window_size), 'same'))
> 1 loops, best of 3: 889 ms per loop
>>> timeit np.exp(bn.move_sum(np.log(a), window_size))
> 10 loops, best of 3: 82.5 ms per loop
>
> Most all that time is spent in np.exp(np.log(a)):
>
>>> timeit bn.move_sum(a, window_size)
> 100 loops, best of 3: 3.72 ms per loop
>
> So I assume if I made a bn.move_prod the time would be around 200x
> compared to convolve.
>
> BTW, you could do the exp inplace:
>
>>> timeit b = bn.move_sum(np.log(a), window_size); np.exp(b, b)
> 10 loops, best of 3: 76.3 ms per loop
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

my current use-case is to do this 24 times on arrays of about 200K elements.
file IO is the major bottleneck.