Mailman 3 moving average, moving standard deviation, etc... - SciPy-User

newer
probably a stupid question: MatLab...

moving average, moving standard deviation, etc...

Matt Knox

Dec. 23, 2006

6:42 p.m.

Does anyone know of a slick way to calculate a simple moving average and/or moving standard deviation? And when I say slick, I mean loop-free way, because it would be fairly easy to code a looping way of doing it, it would just be really slow. To be more specific, when I say moving standard deviation, I mean for example... if A is a 1-dimensional array with 100 elements, and I'm using a window size of 10, then the result at index 10 would be A[0:10].std(), the result at index 11 would be A[1:11].std() , etc... And similarly for moving average, and so forth. Is their a general way to do these kind of things? Or would it have to be a special case for each type of calculation to avoid looping? Any help is greatly appreciated. Thanks in advance, - Matt

Show replies by date

Gael Varoquaux

December 2006

6:51 p.m.

New subject: [SciPy-user] moving average, moving standard deviation, etc...

Hi, For all linear operations a convolution will do the trick. For non-linear operations, such as the standard deviation an ugly but efficient way of doing it would be: sqrt(A[0:-10]**2 + A[1;-9]**2 + ... ) Maybe there is a more beautiful way of writing this. Gaël On Sat, Dec 23, 2006 at 01:42:39PM -0500, Matt Knox wrote:

...

Does anyone know of a slick way to calculate a simple moving average and/or moving standard deviation? And when I say slick, I mean loop-free way, because it would be fairly easy to code a looping way of doing it, it would just be really slow.

...

To be more specific, when I say moving standard deviation, I mean for example...

...

if A is a 1-dimensional array with 100 elements, and I'm using a window size of 10, then the result at index 10 would be A[0:10].std(), the result at index 11 would be A[1:11].std() , etc...

...

And similarly for moving average, and so forth. Is their a general way to do these kind of things? Or would it have to be a special case for each type of calculation to avoid looping?

...

Any help is greatly appreciated. Thanks in advance,

...

- Matt

...

_______________________________________________ SciPy-user mailing list SciPy-user@scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user

A. M. Archibald

7:07 p.m.

New subject: [SciPy-user] moving average, moving standard deviation, etc...

On 23/12/06, Matt Knox <mattknox_ca@hotmail.com> wrote:

...

And similarly for moving average, and so forth. Is their a general way to do these kind of things? Or would it have to be a special case for each type of calculation to avoid looping?

I don't imagine it would be particularly efficient, but you could use fancy indexing (or some weird code I posted a few months ago to avoid copying) to construct a new array such that slices along one axis are the (overlapping) chunks of the original array. Then any operation you want to perform on all the chunks can just be performed along this axis of the new array. A. M. Archibald

Frank Palazzolo

12:46 p.m.

New subject: [SciPy-user] moving average, moving standard deviation, etc...

Hi, The technique I've seen involves keeping some intermediate variables, and subtracting off the oldest value while adding on the newest. At each timestep then, you just update the variables and recalculate the result. xsum += (x[i] - x[i-window_size]) x2sum += (x[i]*x[i] - x[i-window_size]*x[i-window_size]) average = xsum/window_size stdev_squared = (x2sum - xsum*xsum)/window_size stdev = sqrt(stdev_squared) If you want the "sample" stdev, you can compute it directly, or use: stdev_sample_squared = window_size*stdev_squared/(window_size-1) stdev_sample = sqrt(stdev_squared) Don't get me started on this though... IMHO it's usually a waste of time. -Frank Matt Knox wrote:

...

Does anyone know of a slick way to calculate a simple moving average and/or moving standard deviation? And when I say slick, I mean loop-free way, because it would be fairly easy to code a looping way of doing it, it would just be really slow.

To be more specific, when I say moving standard deviation, I mean for example...

if A is a 1-dimensional array with 100 elements, and I'm using a window size of 10, then the result at index 10 would be A[0:10].std(), the result at index 11 would be A[1:11].std() , etc...

And similarly for moving average, and so forth. Is their a general way to do these kind of things? Or would it have to be a special case for each type of calculation to avoid looping?

Any help is greatly appreciated. Thanks in advance,

- Matt

_______________________________________________ SciPy-user mailing list SciPy-user@scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user

Matt Knox

4:23 p.m.

New subject: [SciPy-user] moving average, moving standard deviation, etc...

...

Hi,

The technique I've seen involves keeping some intermediate variables, and subtracting off the oldest value while adding on the newest. At each timestep then, you just update the variables and recalculate the result.

xsum += (x[i] - x[i-window_size]) x2sum += (x[i]*x[i] - x[i-window_size]*x[i-window_size]) average = xsum/window_size stdev_squared = (x2sum - xsum*xsum)/window_size stdev = sqrt(stdev_squared)

If you want the "sample" stdev, you can compute it directly, or use:

stdev_sample_squared = window_size*stdev_squared/(window_size-1) stdev_sample = sqrt(stdev_squared)

Don't get me started on this though... IMHO it's usually a waste of time.

-Frank

Ok, thanks. So it sounds like there is no easy way to do this generally without doing some looping in python (for the standard deviation anyway), or writing some C code. I'm more interested in the general strategy for doing these kinds of calculations than these specific examples themselves necessarily. Although they are useful for making pretty charts sometimes. - Matt

Stefan van der Walt

7:01 p.m.

New subject: [SciPy-user] moving average, moving standard deviation, etc...

Hi Matt On Sun, Dec 24, 2006 at 04:23:08PM +0000, Matt Knox wrote:

...

Ok, thanks. So it sounds like there is no easy way to do this generally without doing some looping in python (for the standard deviation anyway), or writing some C code. I'm more interested in the general strategy for doing these kinds of calculations than these specific examples themselves necessarily. Although they are useful for making pretty charts sometimes.

Like someone mentioned earlier, you can do averaging using numpy.convolve. For the rest, C code may be the easiest option -- and with ctypes it's a breeze. Take a look at Albert's page, http://www.scipy.org/Cookbook/Ctypes2 and let me know if you need further examples. Cheers Stéfan

Frank Palazzolo

2:18 a.m.

New subject: [SciPy-user] moving average, moving standard deviation, etc...

Oops - I just realized that you said "no looping", as in "any looping is done inside the API function." Sorry about that - I don't know of any way to do it that way. -Frank

A. M. Archibald

2:53 a.m.

New subject: [SciPy-user] moving average, moving standard deviation, etc...

On 24/12/06, Matt Knox <mattknox_ca@hotmail.com> wrote:

...

Ok, thanks. So it sounds like there is no easy way to do this generally without doing some looping in python (for the standard deviation anyway), or writing some C code. I'm more interested in the general strategy for doing these kinds of calculations than these specific examples themselves necessarily. Although they are useful for making pretty charts sometimes.

Fancy indexing will do the job, more or less. This is the general idea I was trying to get at: chunk = 10 data = arange(10000)/10000. indices = arange(len(data)-chunk+1)[:,newaxis]+arange(chunk)[newaxis,:] data_chunks = data[indices] average(data_chunks,axis=0) var(data_chunks,axis=0) and so on... The only problem is that it expands your data rather a lot. Personally I wouldn't worry about that until my application was written. I posted, some time ago, a function that would allow the easy construction of data_chunks as a view of the original array, that is, with no data copying. (Sorry, I don't remember how to find the thread.) A. M. Archibald

Stefan van der Walt

8:33 a.m.

New subject: [SciPy-user] moving average, moving standard deviation, etc...

On Sun, Dec 24, 2006 at 10:53:04PM -0400, A. M. Archibald wrote:

...

On 24/12/06, Matt Knox <mattknox_ca@hotmail.com> wrote:

...
Ok, thanks. So it sounds like there is no easy way to do this generally without doing some looping in python (for the standard deviation anyway), or writing some C code. I'm more interested in the general strategy for doing these kinds of calculations than these specific examples themselves necessarily. Although they are useful for making pretty charts sometimes.

Fancy indexing will do the job, more or less.

This is the general idea I was trying to get at:

chunk = 10 data = arange(10000)/10000.

indices = arange(len(data)-chunk+1)[:,newaxis]+arange(chunk)[newaxis,:]

data_chunks = data[indices]

average(data_chunks,axis=0) var(data_chunks,axis=0) and so on...

The only problem is that it expands your data rather a lot. Personally I wouldn't worry about that until my application was written. I posted, some time ago, a function that would allow the easy construction of data_chunks as a view of the original array, that is, with no data copying. (Sorry, I don't remember how to find the thread.)

http://thread.gmane.org/gmane.comp.python.scientific.user/9570/focus=9579 Cheers Stéfan

Stefan van der Walt

11:20 a.m.

New subject: [SciPy-user] moving average, moving standard deviation, etc...

On Wed, Dec 27, 2006 at 10:33:28AM +0200, Stefan van der Walt wrote:

...

On Sun, Dec 24, 2006 at 10:53:04PM -0400, A. M. Archibald wrote:

...
On 24/12/06, Matt Knox <mattknox_ca@hotmail.com> wrote:

...
Ok, thanks. So it sounds like there is no easy way to do this generally without doing some looping in python (for the standard deviation anyway), or writing some C code. I'm more interested in the general strategy for doing these kinds of calculations than these specific examples themselves necessarily. Although they are useful for making pretty charts sometimes.

Fancy indexing will do the job, more or less.

This is the general idea I was trying to get at:

chunk = 10 data = arange(10000)/10000.

indices = arange(len(data)-chunk+1)[:,newaxis]+arange(chunk)[newaxis,:]

data_chunks = data[indices]

average(data_chunks,axis=0) var(data_chunks,axis=0) and so on...

The only problem is that it expands your data rather a lot. Personally I wouldn't worry about that until my application was written. I posted, some time ago, a function that would allow the easy construction of data_chunks as a view of the original array, that is, with no data copying. (Sorry, I don't remember how to find the thread.)

http://thread.gmane.org/gmane.comp.python.scientific.user/9570/focus=9579

Or even better: http://thread.gmane.org/gmane.comp.python.numeric.general/12365/focus=12367 Cheers Stéfan

Denis Simakov

12:32 p.m.

New subject: [SciPy-user] moving average, moving standard deviation, etc...

For operations which may be reduced to sums (this includes average and std), you can use cumulative sum:

...

...
...
A = arange(10.0) w = 2 # window length C = zeros_like(A)

...

...
...
B = A.cumsum() C[w:] = (B[w:] - B[:-w]) / w # running average C[:w] = B[:w] / arange(1,w+1) # just average for first elements

...

...
...
print vstack((A, C)) [[ 0. 1. 2. 3. 4. 5. 6. 7. 8. 9. ] [ 0. 0.5 1.5 2.5 3.5 4.5 5.5 6.5 7.5 8.5]]

Denis On 12/23/06, Matt Knox <mattknox_ca@hotmail.com> wrote:

...

Does anyone know of a slick way to calculate a simple moving average and/or moving standard deviation? And when I say slick, I mean loop-free way, because it would be fairly easy to code a looping way of doing it, it would just be really slow.

To be more specific, when I say moving standard deviation, I mean for example...

if A is a 1-dimensional array with 100 elements, and I'm using a window size of 10, then the result at index 10 would be A[0:10].std(), the result at index 11 would be A[1:11].std() , etc...

And similarly for moving average, and so forth. Is their a general way to do these kind of things? Or would it have to be a special case for each type of calculation to avoid looping?

Any help is greatly appreciated. Thanks in advance,

- Matt

6612

Age (days ago)

6616

Last active (days ago)

List overview

Download

10 comments

6 participants

participants (6)

A. M. Archibald
Denis Simakov
Frank Palazzolo
Gael Varoquaux
Matt Knox
Stefan van der Walt

moving average, moving standard deviation, etc...

Frank Palazzolo

Frank Palazzolo

tags

participants (6)