Re: off-line change detection & clustering in a time series of multidimensional data
Posted: May 15, 2012 3:32 PM
I hope that you might get help from where Art suggests.
Exploring some data with subjects who were trying to hold still, standing on one foot, for a fixed amount of time (data collected as 2-D, shifts of balance point), I discovered that the reciprocal (seconds per mm) had more "normal" properties than the direct measure (mm per 30 sec).
- see below for additional comments - [Please provide short lines if you post again...]
On Fri, 11 May 2012 23:50:54 -0700 (PDT), moo.marc@gmail.com wrote:
>Hi, I've been researching this for a week (my background is theoretical physics so I have very limited knowledge in stats) and I haven't really found what I want so I would appreciate tips/references.
>
>I want to detect changes in a mostly stationary process (3-d motion of someone that is trying to be still), and get a "good" partition of the time series into intervals based on when changes occurred, i.e. stationary intervals. I know there are some algorithms for change detection, but most of what I found was for on-line detection and didn't seem optimal for partitioning, but maybe I haven't found the right one. Also if slow movement occurs, it needs to be detected and split into chunks that "don't move much". Then there are numerous clustering algorithms but I haven't found any that would respect the time ordering of the data points: meaning they should enforce clusters to be time intervals, even though they would be created based on the 3-d distances.
>
>Also, I've thought of using the distribution of d(i, i+1) (distance between consecutive data points in time) as the "H_0" distribution (if there is no change): since I expect few changes, either fast, in which case only a few distances would be larger than the typical "errors" when stationary, or perhaps sometimes slow, in which case the consecutive points would still be close to each other. Then I can compare this distribution with a cluster distribution of distances between its elements, say with a Kolmogorov-Smirnov test to decide if it's a "good" cluster. I'm thinking this could be used in a hierarchical clustering procedure to know when to merge/split clusters (for an agglomerative/divisive method respectively), and when to stop. But I'm still not sure how to look for clusters. I'm not at all convinced a greedy hierarchical method would perform well here.
>
Looking at differences (slope, first derivative) is common. Sometimes people also look at the change in differences (acceleration, 2nd derivative).
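For anyone who wants to try this, here is a minimal sketch of first and second differences on a 3-d track, using only the Python standard library. The data and the names (`track`, `diff`, `norms`) are made up for illustration, and it assumes equally spaced samples, so it is a starting point and not a recommended method.

```python
import math

def diff(points):
    """Component-wise difference between consecutive samples."""
    return [tuple(b - a for a, b in zip(p, q))
            for p, q in zip(points, points[1:])]

def norms(vectors):
    """Euclidean length of each vector."""
    return [math.hypot(*v) for v in vectors]

# Hypothetical 3-d track: nearly still, one abrupt shift, nearly still again.
track = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.1, 0.1, 0.0),
         (3.1, 4.1, 0.0), (3.1, 4.1, 0.1), (3.2, 4.1, 0.1)]

velocity = diff(track)        # first differences (slope)
accel = diff(velocity)        # second differences (change in slope)

speed = norms(velocity)       # the d(i, i+1) distances from the question
accel_mag = norms(accel)      # size of the change between consecutive steps

# Index of the largest consecutive step, i.e. the candidate change point.
biggest_step = max(range(len(speed)), key=speed.__getitem__)
```

With this toy data the large step sits between the third and fourth samples. Real data would need a rule for what counts as "large", which is exactly the open question in the original post.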
>I guess I'll stop here and see what people think. Summarizing:
>
>1a. Is there a good off-line change detection algorithm that does "good" partitioning of multidimensional time series?
What is your purpose? A smoothed average should yield simple ranges that would serve for a lot of ends.
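As one possible reading of "smoothed average", here is a sketch of a centered moving average applied to each coordinate. The window length, the shortened windows at the two ends, and the names (`moving_average`, `positions`) are assumptions made for this example, not anything specified in the thread.

```python
def moving_average(points, window):
    """Centered moving average of each coordinate.

    `window` must be a positive odd integer. Near the two ends the
    window is truncated, so those values average fewer samples.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(points)):
        lo = max(0, i - half)
        hi = min(len(points), i + half + 1)
        chunk = points[lo:hi]
        # zip(*chunk) regroups the samples by coordinate (x, y, z).
        smoothed.append(tuple(sum(c) / len(chunk) for c in zip(*chunk)))
    return smoothed

# Hypothetical data: four samples at one position, then four at another.
positions = [(0.0, 0.0, 0.0)] * 4 + [(1.0, 1.0, 1.0)] * 4
smoothed = moving_average(positions, 3)
```

A wider window gives flatter ranges but blurs the moment of a shift, so the choice depends on the purpose.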
>1b. Or a clustering algorithm that can enforce clusters to be consecutive time intervals while using distances from another space for the similarity measure? I'd also prefer a non-parametric method, and one that "justifies" cluster choices (e.g. Kolmogorov-Smirnov or permutation tests).
>2. Does it make sense (has it been done before, is there a name for this, etc.) to use the distribution of distances between consecutive points to estimate the "no-change" variability?
>
>Thanks!
-- Rich Ulrich