incompatible sizes when correlating two timeseries
Hello,

I am trying to correlate two time series. I do not understand why I get an error about incompatible sizes. Both have the same frequency.

    In [54]: x = series_01
    In [55]: y = series_02
    In [56]: diff = y.size-x.size
    In [57]: diff
    Out[57]: 0
    In [58]: x.shape
    Out[58]: (15360,)
    In [59]: y.shape
    Out[59]: (15360,)
    In [60]: np.correlate(x, y)
    ---------------------------------------------------------------------------
    TimeSeriesCompatibilityError              Traceback (most recent call last)

    D:\python\test.py in <module>()
    ----> 1
          2
          3
          4
          5

    C:\Programme\pythonxy\python\lib\site-packages\numpy\core\numeric.pyc in correlate(a, v, mode)
        513     """
        514     mode = _mode_from_name(mode)
    --> 515     return multiarray.correlate(a,v,mode)
        516
        517

    C:\Programme\pythonxy\python\lib\site-packages\scikits\timeseries\tseries.pyc in __array_finalize__(self, obj)
        459     def __array_finalize__(self,obj):
        460         self._varshape = getattr(obj, '_varshape', ())
    --> 461         MaskedArray.__array_finalize__(self, obj)
        462
        463     def _update_from(self, obj):

    C:\Programme\pythonxy\python\lib\site-packages\numpy\ma\core.pyc in __array_finalize__(self, obj)
       1383         """
       1384         # Get main attributes .........
    -> 1385         self._update_from(obj)
       1386         if isinstance(obj, ndarray):
       1387             odtype = obj.dtype

    C:\Programme\pythonxy\python\lib\site-packages\scikits\timeseries\tseries.pyc in _update_from(self, obj)
        466             # Only update the dates if we don't have any
        467             if not getattr(_dates, 'size', 0):
    --> 468                 self.__setdates__(newdates)
        469         MaskedArray._update_from(self, obj)
        470

    C:\Programme\pythonxy\python\lib\site-packages\scikits\timeseries\tseries.pyc in __setdates__(self, value)
        662         if not varshape:
        663             # We may be using the default: retry
    --> 664             varshape = self._varshape = get_varshape(self, value)
        665         # Get the data length (independently of the nb of variables)
        666         dsize = self.size // int(np.prod(varshape))

    C:\Programme\pythonxy\python\lib\site-packages\scikits\timeseries\tseries.pyc in get_varshape(data, dates)
        260         # More dates than data: not good
        261         if (dates.size > data.size) or (data.ndim == 1):
    --> 262             raise TimeSeriesCompatibilityError(*err_args)
        263     #....................
        264     dcumulshape = np.cumprod(dshape).tolist()

    TimeSeriesCompatibilityError: Incompatible sizes! (data: (1,) <> dates: (15360,))

Please give me a hint here.

Timmie
Hi Timmie,

I think the error is hidden somewhere in your data. Try to reproduce it with much smaller time series, or with a slice of yours, say size=(10,) or even less, and then print them as strings to examine the problem.

At least np.correlate(z, z) should work with z=x or z=y, shouldn't it?

Bastian.

Timmie wrote:
> Hello, I try to correlate two timeseries.
> I do not understand why I get an error for incompatible size.
> both have the same frequencies.
> In [54]: x = series_01
> In [55]: y = series_02
> In [56]: diff = y.size-x.size
> In [57]: diff
> Out[57]: 0
> In [58]: x.shape
> Out[58]: (15360,)
> In [59]: y.shape
> Out[59]: (15360,)
> In [60]: np.correlate(x, y)
> ---------------------------------------------------------------------------
> TimeSeriesCompatibilityError Traceback (most recent call last)
> D:\python\test.py in <module>() ----> 1 2 3 4 5
> ...
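
Bastian's suggestion could be tried roughly as follows. This is only a sketch: the real series_01 and series_02 are not shown in the thread, so random plain NumPy arrays (`x_full`, `y_full`, both hypothetical names) stand in for them, and the check says nothing about TimeSeries-specific behaviour.

```python
import numpy as np

# Hypothetical stand-ins for series_01 / series_02 (the real data is not
# shown in the thread); same length as reported there.
rng = np.random.default_rng(0)
x_full = rng.normal(size=15360)
y_full = rng.normal(size=15360)

# Work on a short slice so every value can be inspected by eye.
x = x_full[:10]
y = y_full[:10]
print(repr(x))
print(repr(y))

# Self-correlation is the simplest sanity check: for equal-length inputs in
# the default mode it reduces to the sum of squares of the input.
for name, z in (("x", x), ("y", y)):
    result = np.correlate(z, z)
    print(name, result, np.allclose(result, np.sum(z * z)))
```

If the small slices behave as expected with plain arrays but fail with the original objects, the problem is more likely in the container type than in the values.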
Timmie <timmichelsen <at> gmx-topmail.de> writes:
> Hello, I try to correlate two timeseries.
> I do not understand why I get an error for incompatible size.

I would say this is a bug, although I am not 100% certain of the cause at the moment. I think it happens when the correlate function tries to create a new TimeSeries to store the result in, and somehow the dates of the input TimeSeries get passed along to create the resulting TimeSeries (which will be of size 1).

A simple workaround for now is to call np.correlate on the underlying raw array (using the .data attribute of the TimeSeries). Note that np.correlate will NOT work properly with MaskedArrays that contain masked values. In general, you should assume that functions from the top-level numpy namespace will not work properly with masked values.

Pierre, I think we should probably up-cast the TimeSeries to a plain MaskedArray when _update_from is called with dates of a different size than the data. I'm sure other functions in numpy crash on TimeSeries objects for the same reason. What do you think?

- Matt
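
A minimal sketch of this workaround and its caveat follows. It assumes a plain numpy.ma MaskedArray as a stand-in for the TimeSeries (scikits.timeseries is not used here), and the values are made up for illustration. Zero-filling the masked entry is just one possible treatment, not something recommended in the thread.

```python
import numpy as np
import numpy.ma as ma

# Hypothetical stand-ins: a MaskedArray plays the role of the TimeSeries.
# Its .data attribute exposes the raw ndarray, with the mask dropped.
x = ma.masked_array([1.0, 2.0, 99.0, 4.0], mask=[False, False, True, False])
y = ma.masked_array([1.0, 1.0, 1.0, 1.0])

# Workaround: correlate the raw arrays. The masked 99.0 is silently included.
raw = np.correlate(x.data, y.data)

# One possible (assumed) treatment: replace masked entries before correlating.
filled = np.correlate(x.filled(0.0), y.filled(0.0))

# Checking for masked values first shows whether .data can be trusted.
n_masked = ma.count_masked(x)

print(raw, filled, n_masked)
```

The two results differ only because of the masked entry, which is why it is worth counting masked values before dropping down to the raw array.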
On Thu, Dec 18, 2008 at 1:33 PM, Matt Knox <mattknox.ca@gmail.com> wrote:
> Timmie <timmichelsen <at> gmx-topmail.de> writes:
>> Hello, I try to correlate two timeseries.
>> I do not understand why I get an error for incompatible size.
>
> I would say this is a bug, although I am not 100% certain of the cause at the moment. I think it happens when the correlate function tries to create a new TimeSeries to store the result in, and somehow the dates of the input TimeSeries get passed along to create the resulting TimeSeries (which will be of size 1).
>
> A simple workaround for now is to call np.correlate on the underlying raw array (using the .data attribute of the TimeSeries). Note that np.correlate will NOT work properly with MaskedArrays that contain masked values. In general, you should assume that functions from the top-level numpy namespace will not work properly with masked values.

Tim, if you need them, there are some statistical functions that work for masked arrays in scipy.stats.mstats. They are not yet included in the new docs, but you can see what is available with

    import scipy.stats
    dir(scipy.stats.mstats)

I don't know how well they work with TimeSeries.

Josef
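
As a rough illustration of what Josef describes, the sketch below lists the public names in scipy.stats.mstats and calls one of them, mstats.pearsonr, on masked input. The data is made up, and pearsonr computes a correlation coefficient rather than the sliding cross-correlation that np.correlate returns, so it is shown only as an example of a mask-aware function, not as a drop-in replacement.

```python
import numpy.ma as ma
from scipy.stats import mstats

# List the public names available in the masked-statistics module.
names = [name for name in dir(mstats) if not name.startswith("_")]
print(len(names), names[:10])

# Made-up data: y = 2*x + 1 except at the masked position, where y holds junk.
x = ma.masked_array([1.0, 2.0, 3.0, 4.0, 5.0], mask=[0, 0, 1, 0, 0])
y = ma.masked_array([3.0, 5.0, 100.0, 9.0, 11.0])

# mstats.pearsonr drops pairs in which either value is masked.
r, p = mstats.pearsonr(x, y)
print(r)
```

Because the junk value sits at a masked position, it does not affect the coefficient, which is the behaviour the top-level numpy functions do not give.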
> import scipy.stats
> dir(scipy.stats.mstats)
>
> I don't know how well they work with TimeSeries.

If you don't care about the dates, just use the functions on the .series attribute of your time series. I need to check how well mstats supports TimeSeries, but I probably won't have time before next year...
On Dec 18, 2008, at 1:33 PM, Matt Knox wrote:
> Timmie <timmichelsen <at> gmx-topmail.de> writes:
>> Hello, I try to correlate two timeseries.
>> I do not understand why I get an error for incompatible size.
>
> I would say this is a bug, although I am not 100% certain of the cause at the moment. I think it happens when the correlate function tries to create a new TimeSeries to store the result in, and somehow the dates of the input TimeSeries get passed along to create the resulting TimeSeries (which will be of size 1).

np.correlate(x, y) returns a 1D array of size 1. Because x and y are TimeSeries, it tries to create a new series but doesn't know what to do with the dates, so it chokes.
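
The size-1 result can be checked with plain NumPy arrays. The sketch below uses a small made-up length `n` in place of the 15360 samples from the thread and compares the three modes that np.correlate accepts.

```python
import numpy as np

n = 8  # small stand-in for the 15360 samples in the thread
a = np.arange(n, dtype=float)
b = np.ones(n)

# Output shapes for two length-n inputs in each mode.
valid = np.correlate(a, b)               # default mode is 'valid'
same = np.correlate(a, b, mode="same")
full = np.correlate(a, b, mode="full")

print(valid.shape, same.shape, full.shape)  # (1,) (8,) (15,)
```

With equal-length inputs, the default mode yields a single value, so there is no one-to-one match between the result and the 15360 input dates.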
> A simple workaround for now is to call np.correlate on the underlying raw array (using the .data attribute of the TimeSeries). Note that np.correlate will NOT work properly with MaskedArrays that contain masked values. In general, you should assume that functions from the top-level numpy namespace will not work properly with masked values.
Indeed: when manipulating time series, if you don't need to keep track of the dates, just drop them by using .series (better than .data, as 'masked values shouldn't be trusted anyway'(TM)...)
> Pierre, I think we should probably up-cast the TimeSeries to a plain MaskedArray when _update_from is called with dates of a different size than the data. I'm sure other functions in numpy crash on TimeSeries objects for the same reason. What do you think?
Right now, nothing. I need to see a better example. Here, yes, we could drop the dates and return a MaskedArray. In other cases, there's a legitimate reason for trying to output a TimeSeries.
participants (5)
- Bastian Weber
- josef.pktd@gmail.com
- Matt Knox
- Pierre GM
- Timmie