Topic: off-line change detection & clustering in a time series of multidimensional
data

Replies: 18   Last Post: May 21, 2012 1:23 PM

 Richard Ulrich Posts: 2,800 Registered: 12/13/04
Re: off-line change detection & clustering in a time series of multidimensional data
Posted: May 15, 2012 3:32 PM

I hope that you might get help from where Art suggests.

Exploring some data with subjects who were trying to hold
still, standing on one foot, for a fixed amount of time (data
collected as 2-D, shifts of balance point), I discovered that
the reciprocal (seconds per mm) had more "normal" properties
than the direct measure (mm per 30 sec).

[Please provide short lines if you post again...]

On Fri, 11 May 2012 23:50:54 -0700 (PDT), moo.marc@gmail.com wrote:

>Hi, I've been researching this for a week (my background is
>theoretical physics so I have very limited knowledge in stats) and I
>haven't really found what I want so I would appreciate
>tips/references.
>
>I want to detect changes in a mostly stationary process (3-d motion
>of someone that is trying to be still), and get a "good" partition of
>the time series into intervals based on when changes occurred, i.e.
>stationary intervals. I know there are some algorithms for change
>detection, but most of what I found was for on-line detection and
>didn't seem optimal for partitioning, but maybe I haven't found the
>right one. Also if slow movement occurs, it needs to be detected and
>split into chunks that "don't move much". Then there are numerous
>clustering algorithms but I haven't found any that would respect the
>time ordering of the data points: meaning they should enforce clusters
>to be time intervals, even though they would be created based on the
>3-d distances.
>
>Also, I've thought of using the distribution of d(i, i+1) (distance
>between consecutive data points in time) as the "H_0" distribution (if
>there is no change): since I expect few changes, either fast, in which
>case only a few distances would be larger than the typical "errors"
>when stationary, or perhaps sometimes slow, in which case the
>consecutive points would still be close to each other. Then I can
>compare this distribution with a cluster distribution of distances
>between its elements, say with a Kolmogorov-Smirnov test to decide if
>it's a "good" cluster. I'm thinking this could be used in a
>hierarchical clustering procedure to know when to merge/split clusters
>(for an agglomerative/divisive method respectively), and when to stop.
>But I'm still not sure how to look for clusters. I'm not at all
>convinced a greedy hierarchical method would perform well here.
>

Looking at differences (slope, first derivative) is common.
Sometimes, change-in-differences (acceleration, 2nd derivative).
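
As a rough illustration of the two quantities just mentioned, here is
a minimal sketch in Python (standard library only). The function
names and the tiny 3-d track are made up for the example; nothing
here comes from the original poster's data. It computes d(i, i+1),
the distance between consecutive points, and the component-wise
second difference x[i+2] - 2*x[i+1] + x[i].

```python
import math

def consecutive_distances(points):
    """Euclidean distance between each point and the next one in time."""
    return [math.dist(p, q) for p, q in zip(points, points[1:])]

def second_differences(points):
    """Component-wise change-in-differences: x[i+2] - 2*x[i+1] + x[i]."""
    return [
        tuple(c - 2 * b + a for a, b, c in zip(p, q, r))
        for p, q, r in zip(points, points[1:], points[2:])
    ]

# Hypothetical 3-d track: still, then one abrupt jump, then still again.
track = [(0, 0, 0), (0, 0, 0), (3, 4, 0), (3, 4, 0)]
steps = consecutive_distances(track)   # one large step at the jump
accel = second_differences(track)      # sign flips across the jump
```

With real data the steps would be dominated by measurement noise, so
a single large value is only a hint of a change, not proof of one.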

>I guess I'll stop here and see what people think. Summarizing:
>
>1a. Is there a good off-line change detection algorithm that does
>"good" partitioning of multidimensional time series?

What is your purpose? A smoothed average should yield
simple ranges that would serve for a lot of ends.
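
One possible reading of "smoothed average", sketched in Python under
assumptions: a plain sliding-window mean over each coordinate,
followed by the min/max range of the smoothed track. The window
length and both function names are choices made for this example,
not anything specified in the thread.

```python
def moving_average(points, window):
    """Mean of every run of `window` consecutive points, per coordinate."""
    if window < 1 or window > len(points):
        raise ValueError("window must be between 1 and len(points)")
    dims = len(points[0])
    smoothed = []
    for start in range(len(points) - window + 1):
        chunk = points[start:start + window]
        smoothed.append(
            tuple(sum(p[d] for p in chunk) / window for d in range(dims))
        )
    return smoothed

def coordinate_ranges(points):
    """(min, max) of each coordinate over the whole series."""
    dims = len(points[0])
    return [
        (min(p[d] for p in points), max(p[d] for p in points))
        for d in range(dims)
    ]

# Hypothetical track drifting steadily along the first axis.
track = [(0, 0, 0), (2, 0, 0), (4, 0, 0), (6, 0, 0)]
smooth = moving_average(track, window=2)
ranges = coordinate_ranges(smooth)
```

A longer window suppresses more noise but also blurs the moment at
which a change happens, so the window has to be chosen with the
purpose in mind.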

>1b. Or a clustering algorithm that can enforce clusters to be
>consecutive time intervals while using distances from another space
>for the similarity measure? I'd also prefer a non-parametric method,
>and one that "justifies" cluster choices (e.g. Kolmogorov-Smirnov or
>permutation tests).
>2. Does it make sense (has it been done before, is there a name for
>this, etc.) to use the distribution of distances between consecutive
>points to estimate the "no-change" variability?
>
>Thanks!
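
For questions 1a/1b, one standard off-line approach that keeps every
cluster a consecutive time interval is penalized least-squares
segmentation solved by dynamic programming. The sketch below is an
illustration only, not something proposed in this thread: it is not
the non-parametric, test-based method asked for, the `penalty` per
segment is a tuning value that has to be chosen by the user, and all
names are hypothetical. Each segment is scored by the sum of squared
3-d distances to its own mean.

```python
def make_segment_cost(points):
    """Return cost(i, j): sum of squared distances of points[i:j] to their mean."""
    dims = len(points[0])
    prefix_sum = [[0.0] * dims]   # running coordinate sums
    prefix_sq = [0.0]             # running sum of squared norms
    for p in points:
        prefix_sum.append([a + x for a, x in zip(prefix_sum[-1], p)])
        prefix_sq.append(prefix_sq[-1] + sum(x * x for x in p))

    def cost(i, j):
        n = j - i
        total_sq = prefix_sq[j] - prefix_sq[i]
        mean_term = sum(
            (prefix_sum[j][d] - prefix_sum[i][d]) ** 2 for d in range(dims)
        ) / n
        return total_sq - mean_term

    return cost

def optimal_partition(points, penalty):
    """Split the series into consecutive intervals [start, end) minimizing
    total within-segment cost plus `penalty` per segment (O(n^2) time)."""
    n = len(points)
    cost = make_segment_cost(points)
    best = [0.0] + [float("inf")] * n   # best[j]: optimal score for points[:j]
    prev = [0] * (n + 1)                # prev[j]: start of the last segment
    for j in range(1, n + 1):
        for i in range(j):
            candidate = best[i] + cost(i, j) + penalty
            if candidate < best[j]:
                best[j], prev[j] = candidate, i
    segments = []
    j = n
    while j > 0:                        # walk back through the chosen splits
        segments.append((prev[j], j))
        j = prev[j]
    return segments[::-1]

# Hypothetical noiseless track: five samples at rest, then five after a shift.
track = [(0, 0, 0)] * 5 + [(10, 0, 0)] * 5
segments = optimal_partition(track, penalty=1.0)   # [(0, 5), (5, 10)]
```

Because the search is over all partitions into intervals, it avoids
the greedy merge/split decisions of hierarchical clustering; the cost
is the quadratic running time and the need to pick the penalty. A
slow drift would be cut into several short segments once the
accumulated squared deviation exceeds the penalty.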

--
Rich Ulrich