help using np.correlate to produce correlograms.
Dear list,

I'm trying to compute the cross-correlation and cross-correlograms from some signals. To that end, I'm first testing np.correlate with some idealized traces (sine waves) that are exactly 1 ms apart from each other. You can have a look here: http://nbviewer.ipython.org/github/JoseGuzman/myIPythonNotebooks/blob/master...

Unfortunately I am not able to retrieve the correct lag of 1 ms with the option 'full'. Strangely enough, if I perform an autocorrelation of either of the signals, I obtain the correct value for a lag of 0 ms. I think I'm doing something wrong when obtaining the lags. I would appreciate it if somebody could help me here.

Thanks in advance,
Jose

Jose Guzman
http://www.ist.ac.at/~jguzman/
Hi,

On 08/12/2014 22:02, Jose Guzman wrote:
I'm trying to compute the cross-correlation and cross-correlograms from some signals. To that end, I'm first testing np.correlate with some idealized traces (sine waves) that are exactly 1 ms apart from each other. You can have a look here:
http://nbviewer.ipython.org/github/JoseGuzman/myIPythonNotebooks/blob/master...
Unfortunately I am not able to retrieve the correct lag of 1 ms with the option 'full'. Strangely enough, if I perform an autocorrelation of either of the signals, I obtain the correct value for a lag of 0 ms. I think I'm doing something wrong when obtaining the lags.

I looked at your notebook and I believe you had an error in the definition of the delay. In your first cell, you were creating a delay of 20 ms instead of 1 ms (and because the sine is periodic, this was not obvious).
In addition, to get a good estimate of the delay with cross-correlation, you need many periods. Here is a modification of your notebook: http://nbviewer.ipython.org/gist/pierrehaessig/e2dda384ae0e08943f9a I've updated the delay definition and the number of periods.

Finally, you may be able to automate your plot a bit by using matplotlib's xcorr (which uses np.correlate): http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.xcorr

Best,
Pierre
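As a rough sketch of the lag recovery discussed above (the sampling rate, sine frequency, and signal duration here are assumptions for illustration, not the notebook's actual values):

```python
import numpy as np

fs = 1000.0                        # 1 kHz sampling -> one sample per ms (assumed)
t = np.arange(0.0, 1.0, 1.0 / fs)  # 1 s of signal: many periods of the sine
f = 10.0                           # 10 Hz sine (assumed)
delay = 0.001                      # 1 ms delay

a = np.sin(2 * np.pi * f * t)
b = np.sin(2 * np.pi * f * (t - delay))   # b is a delayed by 1 ms

# With mode='full', np.correlate(b, a) covers lags from -(len(a)-1)
# to +(len(b)-1); the peak lag is the delay of b relative to a.
xc = np.correlate(b, a, mode='full')
lags = np.arange(-(len(a) - 1), len(b))
lag_ms = lags[np.argmax(xc)] / fs * 1000.0   # should come out as 1 ms
```

The key step is building the `lags` axis to match the 'full' output; reading the peak position off the raw array index is what gives a wrong delay.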
Dear Pierre,

thank you very much for taking the time to correct my notebook and for pointing me to my incorrect lag estimation. It has been very useful!

Best,
Jose
As a side note, I still have in mind the proposal I made back in 2013 to make np.correlate faster: http://numpydiscussion.10968.n7.nabble.com/anotherdiscussiononnumpycorr...

The basic idea is to let the user select the exact range of lags they want. Unfortunately I didn't take the time to go further than that specification.

Best,
Pierre
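The proposed lag-range mode is not implemented, but the behaviour it targets can be sketched today by slicing the 'full' output (the helper name `correlate_lags` is hypothetical; this sketch gives the restricted result without the intended speed benefit, since it still computes every lag):

```python
import numpy as np

def correlate_lags(a, v, lag_min, lag_max):
    """Cross-correlation of a and v restricted to lags in [lag_min, lag_max].

    Sketch only: computes the full correlation and slices it, so it
    returns only the requested lags but does no less work.
    """
    full = np.correlate(a, v, mode='full')
    lags = np.arange(-(len(v) - 1), len(a))
    keep = (lags >= lag_min) & (lags <= lag_max)
    return lags[keep], full[keep]
```

Under the proposal, something like `np.correlate(a, v, mode=(lag_min, lag_max))` would skip computing the unwanted lags entirely, which is where the speedup would come from.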
I think it is a good time to discuss/implement further correlate improvements. I kind of favor the mode=(tuple of integers) API for your proposed change. Concerning the C API, we probably need to add a new wrapper function, but that's OK; the C API does not need to be as nice as the Python API, as it has far fewer and typically more experienced users.

I also think it's time we removed the old_behavior flag, which has been there since 1.4. Are there objections to that?

Also, on a side note, in 1.10 np.convolve/correlate has been significantly sped up if one of the sequences is less than 12 elements long.
On 11/12/2014 11:19, Julian Taylor wrote:
Also on a side note, in 1.10 np.convolve/correlate has been significantly sped up if one of the sequences is less than 12 elements long.

Interesting! What is the origin of this speedup, and why the magic number 12?

Pierre
Previously numpy called dot for the convolution part. This is fine for large convolutions, as dot goes out to BLAS, which is super fast. For small convolutions it is unfortunately terrible, as the generic dot in BLAS libraries has enormous overheads that are only amortized on large data. So one part of the change was computing the dot in a simple numpy-internal loop when the data is small.

The second part is the number of registers typical machines have; e.g. amd64 has 16 floating-point registers. If you can put all elements of a convolution kernel into these registers, you save reloading them from the stack on each iteration. 11 is the largest number I could reliably use without the compiler spilling them to the stack.
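In numpy terms, the dot-based path described here amounts to one dot product per output sample. A plain-Python sketch (not the actual C implementation):

```python
import numpy as np

def conv_valid_via_dot(a, kernel):
    # Each 'valid' convolution output is the dot product of the reversed
    # kernel with a sliding window of the input.  numpy used to hand this
    # dot to BLAS, which only pays off when the kernel is large.
    kr = kernel[::-1]
    n = len(a) - len(kernel) + 1
    return np.array([np.dot(a[i:i + len(kernel)], kr) for i in range(n)])
```

For kernels shorter than 12 elements, 1.10 instead uses a simple internal loop whose kernel values fit in registers, avoiding the BLAS call overhead.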
On 11/12/2014 15:39, Julian Taylor wrote:
11 is the largest number I could reliably use without the compiler spilling them to the stack.

Thanks Julian!
On 11/12/14 09:54, Pierre Haessig wrote:
The basic idea is to let the user select the exact range of lags they want.

I would be particularly interested in computing cross-correlations over a range of +4000 sampling points of lag. Unfortunately, my cross-correlations require vectors of ~8e6 points, and np.correlate performs very slowly if I compute the whole range.

I also heard that a faster alternative for computing the cross-correlation is to take the product of the Fourier transforms of the 2 vectors and then take the inverse Fourier transform of the result.

Best,
Jose
On 11.12.2014 19:01, Jose Guzman wrote:
I also heard that a faster alternative for computing the cross-correlation is to take the product of the Fourier transforms of the 2 vectors and then take the inverse Fourier transform of the result.
Large convolutions/correlations are generally faster in Fourier space, as they have O(N log N) instead of O(N^2) complexity; for 1e6 points this should be very significant. You can use scipy.signal.fftconvolve to do that conveniently (with performance-optimal zero padding). Convolution of a flipped (and conjugated?) input is the same as a correlation.
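A sketch of that FFT route via fftconvolve (the 50-sample delay and signal length are made up for illustration; the inputs here are real, so the conjugation is a no-op and omitted):

```python
import numpy as np
from scipy.signal import fftconvolve

rng = np.random.default_rng(0)
a = rng.standard_normal(10_000)
b = np.roll(a, 50)              # b is (circularly) a delayed by 50 samples

# Correlation = convolution with one input time-reversed (conjugate it
# too for complex data).  fftconvolve does the FFT / multiply /
# inverse-FFT dance internally, in O(N log N).
xc = fftconvolve(b, a[::-1], mode='full')
lags = np.arange(-(len(a) - 1), len(b))
lag = lags[np.argmax(xc)]       # should recover the 50-sample delay
```

The result matches np.correlate(b, a, mode='full') up to floating-point error, but at a fraction of the cost for long vectors like the ~8e6-point ones above.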
participants (3)
- Jose Guzman
- Julian Taylor
- Pierre Haessig