Generalizing Fisher's Exact Permutation Method (FEPM): Application to the Behrens-Fisher Problem
We intend to devise a paradigm to solve the Behrens-Fisher problem (BFP), which consists in finding a confidence interval for the difference of the means of two independent normal samples, X ~ N(mu1, sigma1) of size m and Y ~ N(mu2, sigma2) of size n, by freely permuting the items of the samples. This feature is akin to Fisher's permutation method, which, however, is restricted to distributions with equal dispersions. The present permutation method only allows exchanges among items within the sample to which they belong. An important feature is that, for each pseudo-sample, the individual dispersion is kept unchanged.
The Intra-Permutation Method. The arrival coefficients W( )
Let X = X1, ..., Xm and draw all m items at random, exhaustively and without replacement, assigning to each one the weight Wx(j) = j/(m*(m+1)/2), where j is the order in which the item is drawn. Do the same for Y = Y1, ..., Yn, with Wy(j) = j/(n*(n+1)/2). The weights of each sample sum to 1 and every item is equally likely to be drawn at any position j, so that
    mmX = E(Sum(Wx(j)*X(j))) = E(Xhat)
    mmY = E(Sum(Wy(j)*Y(j))) = E(Yhat)
where
    Xhat = (X1 + ... + Xm)/m
    Yhat = (Y1 + ... + Yn)/n
We therefore view mmX - mmY as a quantity that can sample the r.v. D = E(X) - E(Y), obtaining D* and the respective 95% CI (5% significance level), as shown below by 40000 repetitions.
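The claim that the arrival-weighted sum has the plain sample mean as its expectation over drawing orders can be checked numerically. The following is a minimal Python sketch, not the author's program: the function names `arrival_weights` and `weighted_mean` and the four-item sample are assumptions made only for illustration.

```python
import itertools
import random


def arrival_weights(size):
    """Weights W(j) = j / (size*(size+1)/2) for j = 1..size; they sum to 1."""
    total = size * (size + 1) / 2
    return [j / total for j in range(1, size + 1)]


def weighted_mean(ordered_items):
    """Arrival-weighted sum of the items, taken in the order they were drawn."""
    weights = arrival_weights(len(ordered_items))
    return sum(w * x for w, x in zip(weights, ordered_items))


# Hypothetical sample, chosen only for illustration.
sample = [2.0, 5.0, 11.0, 14.0]

# Averaging the weighted sum over every possible drawing order (4! = 24)
# gives the exact expectation over intra-permutations.
all_orders = list(itertools.permutations(sample))
exact_expectation = sum(weighted_mean(p) for p in all_orders) / len(all_orders)
sample_mean = sum(sample) / len(sample)

# One random drawing order, i.e. a single intra-permutation replicate.
replicate = weighted_mean(random.sample(sample, len(sample)))

print(exact_expectation, sample_mean, replicate)
```

Enumerating all orders is feasible only for very small samples; for realistic sizes the expectation would be approximated by random drawing orders, as the repetitions in the text do.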
The pair of source samples from which the permutations are drawn has a difference of means denoted by D. The procedure allows us to obtain confidence intervals at the 5% significance level for the difference of means, whose centres agree well with D, this being of course irrelevant given the evaluation procedure. One (out of 12) does not follow this regularity, when a standard deviation of 1/100 is present. The CI bounds are those that the empirical distribution provides. Summing up, we feel that intra-permutations could be a far-reaching method if properly scrutinized.
Luis A. Afonso
REM "BF-intra"
CLS
REM Parameters of the two normal populations and the sample sizes
INPUT " m1 , stdev1 , n1 "; m1, s1, n1
INPUT " m2 , stdev2 , n2 "; m2, s2, n2
DIM X(n1), Y(n2), XX(n1), YY(n2)
REM W() is the histogram of the differences; W1(), W2() are the arrival weights
DIM W(8001), W1(n1), W2(n2)
pi = 4 * ATN(1)
INPUT " How many "; many
REM
REM Source samples, generated by the Box-Muller method
RANDOMIZE TIMER
sumX = 0: sumY = 0
FOR i = 1 TO n1
  a = RND
  aa = SQR(-2 * LOG(a))
  X(i) = m1 + s1 * aa * COS(2 * pi * RND)
  sumX = sumX + X(i)
NEXT i
FOR i = 1 TO n2
  a = RND
  aa = SQR(-2 * LOG(a))
  Y(i) = m2 + s2 * aa * COS(2 * pi * RND)
  sumY = sumY + Y(i)
NEXT i
REM Arrival weights W(j) = j / (n * (n + 1) / 2)
FOR i = 1 TO n1
  W1(i) = i / (.5 * n1 * (n1 + 1))
NEXT i
FOR i = 1 TO n2
  W2(i) = i / (.5 * n2 * (n2 + 1))
NEXT i
REM Sample means and their difference D
U = sumX / n1: V = sumY / n2
LOCATE 9, 33
PRINT USING "##.### "; U; V; U - V
REM
REM Intra-permutation repetitions
FOR j = 1 TO many
  LOCATE 10, 48: PRINT USING "#######"; many - j
  FOR ii = 1 TO n1: XX(ii) = X(ii): NEXT ii
  FOR ii = 1 TO n2: YY(ii) = Y(ii): NEXT ii
  sumXX = 0: Ex = 0: sumYY = 0: Ey = 0
  REM Draw the X items without replacement; 7777777 marks a used item
  FOR k = 1 TO n1 - 1
1   g = INT(RND * n1) + 1
    IF XX(g) = 7777777 THEN GOTO 1
    sumXX = sumXX + XX(g)
    Ex = Ex + W1(k) * XX(g)
    XX(g) = 7777777
  NEXT k
  REM The last item is whatever remains of the total
  remain = sumX - sumXX
  Ex = Ex + W1(n1) * remain
  REM The same for the Y items
  FOR k = 1 TO n2 - 1
2   g = INT(RND * n2) + 1
    IF YY(g) = 7777777 THEN GOTO 2
    sumYY = sumYY + YY(g)
    Ey = Ey + W2(k) * YY(g)
    YY(g) = 7777777
  NEXT k
  remain = sumY - sumYY
  Ey = Ey + W2(n2) * remain
  E = Ex - Ey
  REM Negative differences are not tallied; bins are 0.01 wide
  IF E < 0 THEN GOTO 11
  W = INT(100 * E) + 1
  W(W) = W(W) + 1
  md = md + E / many
11 NEXT j
REM
REM Empirical 2.5% and 97.5% bounds read from the histogram
c(1) = .025 * many
c(2) = .975 * many
FOR cc = 1 TO 2
  su = 0
  FOR t = 0 TO 8000
    su = su + W(t)
    IF su > c(cc) THEN GOTO 22
  NEXT t
22 tw(cc) = t / 100: rr(cc) = su / many
NEXT cc
LOCATE 10, 40: COLOR 14
PRINT USING " ##.### "; md
LOCATE 13, 20
PRINT USING " [ ##.## (#.###)"; tw(1); rr(1);
PRINT USING " ##.## (#.###) ]"; tw(2); rr(2)
COLOR 7
END
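For readers without a BASIC interpreter, the procedure above can be sketched in Python. This is a hedged port under stated assumptions, not the author's code: the function name `intra_permutation_ci`, the seed and the sample parameters are hypothetical. It also differs from the listing in three ways: it keeps negative differences, it reads the bounds from the sorted differences instead of 0.01-wide histogram bins, and it uses `random.gauss` in place of the Box-Muller lines.

```python
import random


def arrival_weights(size):
    """Weights W(j) = j / (size*(size+1)/2) for j = 1..size."""
    total = size * (size + 1) / 2
    return [j / total for j in range(1, size + 1)]


def intra_permutation_ci(x, y, repetitions=40000, level=0.95, rng=None):
    """Empirical bounds and centre of the arrival-weighted difference.

    Each repetition shuffles x and y separately (items never cross samples)
    and records Sum(Wx(j)*X(j)) - Sum(Wy(j)*Y(j)).
    """
    rng = rng or random.Random()
    wx, wy = arrival_weights(len(x)), arrival_weights(len(y))
    diffs = []
    for _ in range(repetitions):
        px = rng.sample(x, len(x))  # random drawing order within X
        py = rng.sample(y, len(y))  # random drawing order within Y
        ex = sum(w * v for w, v in zip(wx, px))
        ey = sum(w * v for w, v in zip(wy, py))
        diffs.append(ex - ey)
    diffs.sort()
    tail = (1 - level) / 2
    lower = diffs[int(tail * repetitions)]
    upper = diffs[min(int((1 - tail) * repetitions), repetitions - 1)]
    centre = sum(diffs) / repetitions
    return lower, centre, upper


# Hypothetical source samples; seed and parameters are illustration only.
rng = random.Random(12345)
x = [rng.gauss(10.0, 1.0) for _ in range(12)]
y = [rng.gauss(7.0, 3.0) for _ in range(15)]

lower, centre, upper = intra_permutation_ci(x, y, repetitions=4000, rng=rng)
observed = sum(x) / len(x) - sum(y) / len(y)
print(observed, lower, centre, upper)
```

The centre of the simulated differences should fall close to the observed difference of sample means, which is the agreement the text reports between the CI centres and D.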