Search All of the Math Forum:

Views expressed in these public forums are not endorsed by NCTM or The Math Forum.

Notice: We are no longer accepting new posts, but the forums will continue to be readable.

Topic: Mann-Kendall test statistic
Replies: 1   Last Post: Aug 29, 2017 5:04 PM

 dchristensen1965@gmail.com Posts: 1 Registered: 8/29/17
Re: Mann-Kendall test statistic
Posted: Aug 29, 2017 5:04 PM

On Thursday, September 30, 2004 at 2:21:45 PM UTC+1, Dr. Knud Werner wrote:
> "A.G.McDowell" wrote:
> >
> > In article <gvrml0hs7vf2k9cpsg0gfsd3borvf7t6qn@4ax.com>, Richard Ulrich
> > <Rich.Ulrich@comcast.net> writes

> > >
> > >[snip, Glen's request for clarification]

> > >>
> > >> AFAIK the ordinary Mann-Kendall test statistics for a real sequence
> > >> y = (y_1,\ldots,y_n) is calculated as
> > >>
> > >> S = \Sum_{i=1}^{n-1} \Sum_{j=i+1}^n sgn(y_j - y_i)
> > >>
> > >> ie. it is the sum of the signs of all pairwise differences. If eg.
> > >> you have the sequence (1,2,3,4,5,6,7,8,9,10) there are 45 pairs,
> > >> each contributing a +1, giving the test statistics of +45.
> > >>

> > >
> > >That reminds me of the Kendall tau, and the bubble-
> > >sort algorithm for counting interchanges.
> > >
> > >You can sort in O(n log(n)), right? - and carry along the
> > >original ranks.
> > >
> > >That leaves the interesting problem of computing the
> > >needed sum by comparing the sorted Ranks with the
> > >original Ranks in less than O(n^2). If someone has
> > >solved it for tau, then the same answer should apply
> > >for this one (it seems to me, after a few minutes of
> > >failing to find an algorithm).
> > >
> > >

> > Suppose you pre-sort the input, keeping track of the original positions,
> > as suggested. I think you have reduced the problem to the following:
> > starting with an empty array of size N, fill in the empty slots in some
> > order determined by the data. Every time you fill in a slot, you want to
> > know how many slots to its left and right have been filled. The slot
> > each element fills in depends on its rank in sorted order. The order of
> > filling in the slots is the reverse order of the data. So the number of
> > filled slots to an item's left and right is the number of +ve and -ve
> > signs it contributes.
> >
> > To keep track of the number of filled slots to the left and right of
> > arbitrary positions, maintain a series of arrays of length 1/2, 1/4,
> > 1/8.. the size of the original array. Each cell in these arrays contains
> > the number of filled slots in a corresponding pair, four-some, 8-some...
> > of slots in the original array. Filling in a slot in the main array
> > requires you to modify one slot in each of the smaller arrays. To work
> > out the number of filled slots to the left/right of any given location
> > you need to sum up a collection of cells in the summary arrays that
> > covers the required region in the original array, and again you can do
> > this by reading just one cell from each summary array (work from the top
> > down, using the cell that covers only - but probably not all - of the
> > required region, leaving a gap to be filled in by the next array down).
> >
> > There are log n summary arrays so the cost of this stage is n log n -
> > asymptotically the same as sorting
> > --

>
> First of all, thanks for the kind replies. Using a binary tree indeed
> does
> the job. The following pseudocode might be of general interest, for it
> calculates
> the statistics in O(n log(n)). I've implemented it in C using some
> balanced
> binary tree structure described in the paper "Balanced Search Trees Made
> Simple"
> by Arne Andersson which was augmented in the spirit of section 14.1 from
> "Introduction to Algorithms" by Cormen, Leiserson, Rivest and Stein.
> Additional details might be provided on request.
>
> 1: int calcMannKenndallStatistic
> 2:
> 3: initBTree(tree)
> 4: s = 0
> 5: x = (*getVal)(0)
> 6: if (insBTree(&tree,x) == 0) goto error
> 7: for (i = 1; i < n; i++)
> 8: x = (*getVal)(i)
> 9: if (insBTree(&tree,x) == 0) goto error
> 10: s = s + (getMinRank(tree,x) - 1) - (i + 1 - getMaxRank(tree,x))
> 11: return s
>
> Best regards

For a description of O(n log(n)) calculation of a traditional Kendall's tau for different characteristics of data (duplicates, etc), you might want to consider this: https://www.researchgate.net/publication/226498574_Fast_algorithms_for_the_calculation_of_Kendall%27s_t which is an online full-text source my 2005 paper on the subject (although after publication I discovered it had remarkable overlap with William R Knight's 1966 algorithm (abstract http://www.jstor.org/stable/2282833?seq=1#page_scan_tab_contents).

David Christensen