[Numpy-discussion] custom accumulators

Fri Jan 5 22:23:57 EST 2007

Matt Knox wrote:
>>> You might want to look at frompyfunc:
>>>
>>>         def expmave2(x, k):
>>>             def expmave_sub(a, b):
>>>                 return a + k * (b - a)
>>>             return np.frompyfunc(expmave_sub, 2, 1).accumulate(x)
>>>
>>>
>>> It's amazing what you find when you dig around.
>>>       
>
> Thanks a lot everyone. This has been informative. For what it's worth, I did 
> some performance comparisons...
>
> import numpy as np
> import profile
>
> def expmave1(x, k):
>     def expmave_sub(a, b):
>         return a + k * (b - a)  # a: running average, b: next sample (matches expmave2)
>     return np.frompyfunc(expmave_sub, 2, 1).accumulate(x).astype(x.dtype)
>
>
> def expmave2(x, k):
>     result = np.array(x, copy=True)
>     for i in range(1, result.size):
>         result[i] = result[i-1] + k * (result[i] - result[i-1])
>     return result
>
> testArray = np.cumprod(1 + np.random.normal(size=10000)/100)
>
>
> profile.run('expmave1(testArray, 0.2)')
> profile.run('expmave2(testArray, 0.2)')
>
> and the second function is faster, which I guess makes sense if frompyfunc is 
> pure python, although the first one does have a nice elegance to it I think.
>   
Are you sure about this? I ran this case using timeit, and the first one 
was about 5 times *faster* than the second. I just dug around, and 
frompyfunc is actually implemented in C, although it has to call back 
into Python to execute the function being vectorized. Can you try using 
timeit instead of profile and see what you get? For example:

    a = np.cumprod(1 + np.random.normal(size=10000)/10)

    if __name__ == '__main__':
        from timeit import Timer
        # 'scratch' is the module that holds expmave1, expmave2 and a
        setup = 'from scratch import np, expmave1, expmave2, a'
        print(Timer('expmave1(a, .5)', setup).timeit(10))
        print(Timer('expmave2(a, .5)', setup).timeit(10))
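
For anyone who wants to repeat the comparison without a separate scratch module, here is a self-contained sketch along the same lines. The names expmave_ufunc and expmave_loop are made up for this example, and both use the a + k * (b - a) form from earlier in the thread. Passing dtype=object to accumulate is an assumption on my part: the ufunc returned by frompyfunc only has object loops, and some NumPy releases may refuse to accumulate a float array without it. Timings will vary by machine and version, so none are quoted here.

```python
import timeit

import numpy as np


def expmave_ufunc(x, k):
    """Exponential moving average via frompyfunc(...).accumulate."""
    def step(a, b):
        # a is the running average so far, b is the next sample
        return a + k * (b - a)

    acc = np.frompyfunc(step, 2, 1)
    # Accumulate as Python objects, then cast back to the input dtype.
    return acc.accumulate(x, dtype=object).astype(x.dtype)


def expmave_loop(x, k):
    """Same recurrence written as an explicit Python loop."""
    result = np.array(x, copy=True)
    for i in range(1, result.size):
        result[i] = result[i - 1] + k * (result[i] - result[i - 1])
    return result


rng = np.random.default_rng(0)
a = np.cumprod(1 + rng.normal(size=10000) / 100)

# Check that both versions agree before comparing their speed.
same = np.allclose(expmave_ufunc(a, 0.2), expmave_loop(a, 0.2))

# timeit accepts a callable, so no setup import string is needed.
t_ufunc = timeit.timeit(lambda: expmave_ufunc(a, 0.2), number=10)
t_loop = timeit.timeit(lambda: expmave_loop(a, 0.2), number=10)

print("results agree:", same)
print("frompyfunc version: %.4f s" % t_ufunc)
print("loop version:       %.4f s" % t_loop)
```

A likely reason profile and timeit disagree is that profile hooks every Python-level function call. The frompyfunc version makes one callback per element, so it is charged that instrumentation overhead about 10000 times per run, while the loop version makes almost no function calls for the profiler to intercept.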

Anyway, I'm glad that all was helpful.

-tim
