time series calculation in list comprehension?
Peter Otten
__peter__ at web.de
Sat Mar 11 11:21:03 CET 2006
Lonnie Princehouse wrote:
> You really want to use the value calculated for the i_th term in the
> (i+1)th term's evaluation.
It may sometimes be necessary to recalculate the average for every iteration
to avoid error accumulation. Another tradeoff with your optimization is
that it becomes harder to switch the accumulation function from average to
max, say.
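To make the error-accumulation point concrete, here is a small demonstration I constructed (the values are mine, chosen to force the effect; they are not from the thread). With floating-point data, the incremental update can drift arbitrarily far from a freshly computed average:

```python
# Illustration of error accumulation in the incremental moving average.
# 1.0 added to 1e16 is absorbed (lost to rounding), so the running sum
# never recovers, while direct recomputation per window is unaffected.
n = 2
data = [1e16, 1.0, 1.0, 1.0]

# incremental update, as in the optimized version
average = sum(data[:n]) / n
incremental = [average]
for i in range(1, len(data) - n + 1):
    average += (data[i + n - 1] - data[i - 1]) / n
    incremental.append(average)

# direct recomputation for every window
direct = [sum(data[i:i + n]) / n for i in range(len(data) - n + 1)]

print(incremental[-1])  # 0.0 -- the 1.0 contributions were absorbed by 1e16
print(direct[-1])       # 1.0 -- recomputing each window gets it right
```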
> While it's not easy (or pretty) to store state between iterations in a
> list comprehension, this is the perfect use for a generator:
>
> def generator_to_list(f):
>     return lambda *args, **keywords: list(f(*args, **keywords))
>
> @generator_to_list
> def moving_average(sequence, n):
>     assert len(sequence) >= n and n > 0
>     average = sum(sequence[:n]) / n
>     yield average
>     for i in xrange(1, len(sequence)-n+1):
>         average += (sequence[i+n-1] - sequence[i-1]) / n
>         yield average
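For anyone trying the quoted version today, here is my transliteration to Python 3 spelling (range for xrange; true division is the default, so the __future__ import is unnecessary) with a tiny check — a sketch, not part of the original exchange:

```python
def generator_to_list(f):
    # wrap a generator function so calling it returns a list
    return lambda *args, **keywords: list(f(*args, **keywords))

@generator_to_list
def moving_average(sequence, n):
    assert len(sequence) >= n and n > 0
    average = sum(sequence[:n]) / n
    yield average
    for i in range(1, len(sequence) - n + 1):
        # add the element entering the window, drop the one leaving it
        average += (sequence[i + n - 1] - sequence[i - 1]) / n
        yield average

print(moving_average([4, 8, 15, 16, 23, 42], 3))  # [9.0, 13.0, 18.0, 27.0]
```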
Here are two more that work with arbitrary iterables:
from __future__ import division
from itertools import islice, tee, izip
from collections import deque
def window(items, n):
    it = iter(items)
    w = deque(islice(it, n-1))
    for item in it:
        w.append(item)
        yield w  # for a robust implementation:
                 # yield tuple(w)
        w.popleft()

def moving_average1(items, n):
    return (sum(w)/n for w in window(items, n))

def moving_average2(items, n):
    first_items, last_items = tee(items)
    accu = sum(islice(last_items, n-1))
    for first, last in izip(first_items, last_items):
        accu += last
        yield accu/n
        accu -= first
While moving_average1() is even slower than your inefficient variant,
moving_average2() seems to be a tad faster than the efficient one.
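In case anyone wants to run these, here are the two generators in Python 3 spelling (zip instead of izip, true division built in, and the robust tuple snapshot in window()), plus a quick sanity check; the sample data is mine:

```python
from collections import deque
from itertools import islice, tee

def window(items, n):
    it = iter(items)
    w = deque(islice(it, n - 1))  # prime the window with the first n-1 items
    for item in it:
        w.append(item)
        yield tuple(w)  # immutable snapshot of the current window
        w.popleft()

def moving_average1(items, n):
    # recompute the sum for every window
    return (sum(w) / n for w in window(items, n))

def moving_average2(items, n):
    # two views of the same iterable: one lags n-1 items behind the other
    first_items, last_items = tee(items)
    accu = sum(islice(last_items, n - 1))
    for first, last in zip(first_items, last_items):
        accu += last        # item entering the window
        yield accu / n
        accu -= first       # item leaving the window

data = [4, 8, 15, 16, 23, 42]
print(list(moving_average1(data, 3)))  # [9.0, 13.0, 18.0, 27.0]
print(list(moving_average2(data, 3)))  # [9.0, 13.0, 18.0, 27.0]
```

Both accept arbitrary iterables, not just sequences, since they never index or take len() of their input.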
Peter