Search All of the Math Forum:
Views expressed in these public forums are not endorsed by
NCTM or The Math Forum.



Re: MannKendall test statistic
Posted:
Aug 29, 2017 5:04 PM


On Thursday, September 30, 2004 at 2:21:45 PM UTC+1, Dr. Knud Werner wrote: > "A.G.McDowell" wrote: > > > > In article <gvrml0hs7vf2k9cpsg0gfsd3borvf7t6qn@4ax.com>, Richard Ulrich > > <Rich.Ulrich@comcast.net> writes > > > > > >[snip, Glen's request for clarification] > > >> > > >> AFAIK the ordinary MannKendall test statistics for a real sequence > > >> y = (y_1,\ldots,y_n) is calculated as > > >> > > >> S = \Sum_{i=1}^{n1} \Sum_{j=i+1}^n sgn(y_j  y_i) > > >> > > >> ie. it is the sum of the signs of all pairwise differences. If eg. > > >> you have the sequence (1,2,3,4,5,6,7,8,9,10) there are 45 pairs, > > >> each contributing a +1, giving the test statistics of +45. > > >> > > > > > >That reminds me of the Kendall tau, and the bubble > > >sort algorithm for counting interchanges. > > > > > >You can sort in O(n log(n)), right?  and carry along the > > >original ranks. > > > > > >That leaves the interesting problem of computing the > > >needed sum by comparing the sorted Ranks with the > > >original Ranks in less than O(n^2). If someone has > > >solved it for tau, then the same answer should apply > > >for this one (it seems to me, after a few minutes of > > >failing to find an algorithm). > > > > > > > > Suppose you presort the input, keeping track of the original positions, > > as suggested. I think you have reduced the problem to the following: > > starting with an empty array of size N, fill in the empty slots in some > > order determined by the data. Every time you fill in a slot, you want to > > know how many slots to its left and right have been filled. The slot > > each element fills in depends on its rank in sorted order. The order of > > filling in the slots is the reverse order of the data. So the number of > > filled slots to an item's left and right is the number of +ve and ve > > signs it contributes. > > > > To keep track of the number of filled slots to the left and right of > > arbitrary positions, maintain a series of arrays of length 1/2, 1/4, > > 1/8.. the size of the original array. Each cell in these arrays contains > > the number of filled slots in a corresponding pair, foursome, 8some... > > of slots in the original array. Filling in a slot in the main array > > requires you to modify one slot in each of the smaller arrays. To work > > out the number of filled slots to the left/right of any given location > > you need to sum up a collection of cells in the summary arrays that > > covers the required region in the original array, and again you can do > > this by reading just one cell from each summary array (work from the top > > down, using the cell that covers only  but probably not all  of the > > required region, leaving a gap to be filled in by the next array down). > > > > There are log n summary arrays so the cost of this stage is n log n  > > asymptotically the same as sorting > >  > > First of all, thanks for the kind replies. Using a binary tree indeed > does > the job. The following pseudocode might be of general interest, for it > calculates > the statistics in O(n log(n)). I've implemented it in C using some > balanced > binary tree structure described in the paper "Balanced Search Trees Made > Simple" > by Arne Andersson which was augmented in the spirit of section 14.1 from > "Introduction to Algorithms" by Cormen, Leiserson, Rivest and Stein. > Additional details might be provided on request. > > 1: int calcMannKenndallStatistic > 2: > 3: initBTree(tree) > 4: s = 0 > 5: x = (*getVal)(0) > 6: if (insBTree(&tree,x) == 0) goto error > 7: for (i = 1; i < n; i++) > 8: x = (*getVal)(i) > 9: if (insBTree(&tree,x) == 0) goto error > 10: s = s + (getMinRank(tree,x)  1)  (i + 1  getMaxRank(tree,x)) > 11: return s > > Best regards
For a description of O(n log(n)) calculation of a traditional Kendall's tau for different characteristics of data (duplicates, etc), you might want to consider this: https://www.researchgate.net/publication/226498574_Fast_algorithms_for_the_calculation_of_Kendall%27s_t which is an online fulltext source my 2005 paper on the subject (although after publication I discovered it had remarkable overlap with William R Knight's 1966 algorithm (abstract http://www.jstor.org/stable/2282833?seq=1#page_scan_tab_contents).
David Christensen



