On Fri, Mar 12, 2010 at 10:46 AM, Robin <robince@gmail.com> wrote:
Hello,
While not directly Python related I am always impressed with the quality of scientific advice available on this list, and was hoping I could receive some...
I have a limited amount of an experimentally obtained time series from a biological system. I would like to come up with a generative model which would allow production of large quantities of data with statistics as similar as possible to the experimental data. The time series represents a position, and I am particularly interested in transient high velocity/acceleration events (which are often not very visible by eye in the position trace), so ideally any model should reproduce those with particular care.
An example plot of a small section of the data (pos, vel and acc; 1 s) is available here: http://i41.tinypic.com/ou42de.jpg
If it makes any difference, it is sampled at 4 kHz. I tried fitting a basic autoregressive model. An order 38 model reproduced the position signal visually quite well, but velocity and acceleration were far too regular. I tried fitting one to the velocity, but I think the events of interest are too far apart in bins, so the order required is too large.
So, could anyone point me to anything that would be helpful in Python (so far I did the AR with a MATLAB package I found)? Also, any suggestions for how to proceed would be great. Other than reading the Wikipedia article, I am completely new to this type of AR modelling. So far the only ideas I have involve either downsampling the signal (to try to reduce the order of AR model needed), or splitting it in frequency into low-f/high-f components and attempting to model them separately, then recombining. Does either of these seem sensible?
Is it likely some non-linear model would be required (pos, vel and acc all have high kurtosis), or are normal AR models capable of recreating this kind of fine structure if tweaked sufficiently?
Thanks in advance for any pointers,
In statsmodels we are working on some time series analysis, but it is still a bit too early for real use. We have AR, but for this kind of data I would recommend scikits.talkbox, which has a Levinson-Durbin recursion implemented that gives a more robust estimate of longer AR polynomials (maybe nitime also has it now).
I don't know of any implementation of non-linear models for time series analysis in Python, e.g. a Markov switching or threshold model, or of any models that would allow for fat tails or asymmetric shock distributions. If you just want to generate sample data with similar features, then this will be much easier than estimation. (I have some tentative simulation code for continuous-time diffusion processes, but it is not cleaned up.)
Your acceleration data looks like a GARCH process, that is, the variance is autocorrelated but the mean is not (much). I also have an initial version of that, but it is not yet good enough to be reliable.
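To illustrate what I mean by GARCH-like behaviour, here is a minimal simulation sketch in plain NumPy. It is not the statsmodels code mentioned above: the function name simulate_garch11 and the parameter values (omega, alpha, beta) are made up for illustration and are not fitted to your data.

```python
import numpy as np


def simulate_garch11(n, omega=0.05, alpha=0.1, beta=0.85, burn=500, seed=0):
    """Simulate a GARCH(1,1) series (illustrative parameters, not fitted).

    e[t]      = sqrt(sigma2[t]) * z[t],  z[t] ~ N(0, 1)
    sigma2[t] = omega + alpha * e[t-1]**2 + beta * sigma2[t-1]
    """
    if alpha + beta >= 1:
        raise ValueError("alpha + beta must be < 1 for a finite variance")
    rng = np.random.default_rng(seed)
    total = n + burn
    z = rng.standard_normal(total)
    e = np.empty(total)
    sigma2 = np.empty(total)
    # Start at the unconditional variance, then discard a burn-in period.
    sigma2[0] = omega / (1.0 - alpha - beta)
    e[0] = np.sqrt(sigma2[0]) * z[0]
    for t in range(1, total):
        sigma2[t] = omega + alpha * e[t - 1] ** 2 + beta * sigma2[t - 1]
        e[t] = np.sqrt(sigma2[t]) * z[t]
    return e[burn:], sigma2[burn:]


# One second of synthetic "acceleration" at the 4 kHz rate from the question.
acc, sigma2 = simulate_garch11(4000)
```

The series itself is serially uncorrelated, but its squares are not, so quiet stretches alternate with bursts of large values. Whether that matches your transient events closely enough is something you would have to check against the real data.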
From the graph, it also looks like the three observations are strongly related, so separate (univariate) modeling doesn't look like the most appropriate choice.
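If you still want a univariate AR fit in Python as a baseline, the Levinson-Durbin recursion is short enough to write directly. The following is a rough NumPy sketch of the idea, not the scikits.talkbox API; the helper names autocov and levinson_durbin are my own, and the AR(2) test signal is synthetic.

```python
import numpy as np


def autocov(x, maxlag):
    """Biased sample autocovariance r[0..maxlag] of a 1-D series."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    return np.array([np.dot(x[: n - k], x[k:]) / n for k in range(maxlag + 1)])


def levinson_durbin(r, order):
    """Solve the Yule-Walker equations for x[t] = sum_k a[k] * x[t-k-1] + e[t].

    Returns the AR coefficients a (length `order`) and the innovation variance.
    """
    a = np.zeros(order)
    sigma2 = r[0]
    for m in range(order):
        # Reflection coefficient for stepping from order m to order m + 1.
        k = (r[m + 1] - np.dot(a[:m], r[m:0:-1])) / sigma2
        prev = a[:m].copy()
        a[:m] = prev - k * prev[::-1]
        a[m] = k
        sigma2 *= 1.0 - k * k
    return a, sigma2


# Synthetic check: simulate a stationary AR(2) process and recover it.
rng = np.random.default_rng(0)
true_a = np.array([0.6, -0.3])
e = rng.standard_normal(20000)
x = np.zeros(e.size)
for t in range(2, x.size):
    x[t] = true_a[0] * x[t - 1] + true_a[1] * x[t - 2] + e[t]

a_hat, s2_hat = levinson_durbin(autocov(x, 2), 2)
```

For your data you would pass the position (or velocity) trace and a larger order, e.g. levinson_durbin(autocov(pos, 38), 38), where pos is a placeholder for your own array. The recursion costs O(order^2) rather than O(order^3), and with the biased autocovariance it yields a stable AR polynomial.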
Josef
Cheers
Robin
_______________________________________________
SciPy-User mailing list
SciPy-User@scipy.org
http://mail.scipy.org/mailman/listinfo/scipy-user