Date: Nov 25, 2012 6:03 PM
Author: Luis A. Afonso
Subject: Behrens-Fisher by Intra-Permutations

Generalizing Fisher's Exact Permutation Method (FEPM): Application to the Behrens-Fisher Problem

Introduction

We intend to devise a procedure to solve the Behrens-Fisher problem (BFP), which consists in finding a confidence interval for the difference between the means of two independent normal samples, X~N(mu1, sigma1^2):m and Y~N(mu2, sigma2^2):n (the number after the colon is the sample size), by freely permuting the items of the samples. This feature is akin to Fisher's permutation method, but that method is restricted to distributions with the same dispersion. The present permutation method only allows switches among the items of the sample they belong to, never between the two samples. An important feature is that, for each pseudo-sample, the individual dispersion is kept unchanged.

The Intra-Permutation Method

The arrival coefficients W( )

Let X = X1, ..., Xm. Draw all m items at random, exhaustively and without replacement, and assign to each one the weight Wx(j) = j/(m*(m+1)/2), where j is the order in which the item is drawn. Do the same for Y = Y1, ..., Yn, with Wy(j) = j/(n*(n+1)/2). The weights of each sample sum to 1, because 1 + 2 + ... + m = m*(m+1)/2.
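As an illustration of the arrival weights, here is a minimal sketch in Python (not the QBasic of the listing at the end of this post). The function names `arrival_weights` and `weighted_arrival_sum` are made up for this example; the only thing taken from the text is the formula W(j) = j/(size*(size+1)/2).

```python
import random


def arrival_weights(size):
    """Weights W(j) = j / (size*(size+1)/2) for j = 1..size; they sum to 1."""
    total = size * (size + 1) / 2
    return [j / total for j in range(1, size + 1)]


def weighted_arrival_sum(sample, rng):
    """Draw every item once, in random order, and weight the j-th arrival by W(j)."""
    order = list(sample)
    rng.shuffle(order)  # a random draw order without replacement
    weights = arrival_weights(len(order))
    return sum(w * value for w, value in zip(weights, order))


if __name__ == "__main__":
    rng = random.Random(2012)  # fixed seed, only to make the demo repeatable
    print(arrival_weights(4))  # [0.1, 0.2, 0.3, 0.4]
    print(weighted_arrival_sum([3.0, 5.0, 8.0, 4.0], rng))
```

Because the weights are positive and sum to 1, each weighted sum is a weighted average of the sample and always lies between its smallest and largest item.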

Noting that

     mmX = E(Sum(Wx(j)*X(j))) = E(Xhat)

     mmY = E(Sum(Wy(j)*Y(j))) = E(Yhat)

where

     Xhat = (X1 + ... + Xm)/m

     Yhat = (Y1 + ... + Yn)/n

we view mmX - mmY as a quantity that can be used to sample the r.v. D = E(X) - E(Y), obtaining D* (the mean of the simulated differences) and the respective 95% CI (5% significance level), as shown below with 40000 repetitions.
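The procedure can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the program used for the results: the names `mean_over_all_orders` and `intra_permutation_interval` are hypothetical, the draw order comes from `random.shuffle`, and the bounds are read from the sorted differences rather than from a histogram. The first function checks, by listing every ordering of a small sample, that the weighted sum averages to the sample mean.

```python
import itertools
import random
import statistics


def arrival_weights(size):
    """Weights W(j) = j / (size*(size+1)/2), j = 1..size."""
    total = size * (size + 1) / 2
    return [j / total for j in range(1, size + 1)]


def mean_over_all_orders(sample):
    """Average of the weighted sum over every ordering (small samples only)."""
    weights = arrival_weights(len(sample))
    sums = [sum(w * v for w, v in zip(weights, order))
            for order in itertools.permutations(sample)]
    return statistics.fmean(sums)


def intra_permutation_interval(x, y, reps=40000, seed=None):
    """Return (D*, lower, upper) from reps within-sample permutations."""
    rng = random.Random(seed)
    wx, wy = arrival_weights(len(x)), arrival_weights(len(y))
    xs, ys = list(x), list(y)
    diffs = []
    for _ in range(reps):
        rng.shuffle(xs)  # items move only within X
        rng.shuffle(ys)  # items move only within Y
        ex = sum(w * v for w, v in zip(wx, xs))
        ey = sum(w * v for w, v in zip(wy, ys))
        diffs.append(ex - ey)
    diffs.sort()
    # Simple order-statistic bounds at 2.5% and 97.5% (an assumption of this sketch)
    lower = diffs[int(0.025 * reps)]
    upper = diffs[int(0.975 * reps) - 1]
    return statistics.fmean(diffs), lower, upper
```

With the two samples held in lists `x` and `y`, `intra_permutation_interval(x, y, reps=40000, seed=1)` returns the triple (D*, lower bound, upper bound).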

Results

X~N(4, 10^2):20, Y~N(0, 1^2):40

  Xhat     Yhat    D       D*      95% CI*        centre*
  4.099    0.093   4.007   4.000   [1.29, 6.77]   4.03
  6.630   -0.055   6.685   6.694   [4.48, 8.86]   6.67
  3.907   -0.020   3.927   3.902   [2.04, 5.79]   3.92

X~N(4, 10^2):40, Y~N(0, 1^2):20

  4.394    0.395   3.999   4.002   [2.35, 5.71]   4.03
  1.450   -0.409   1.859   1.859   [0.29, 4.83]   2.56**
  4.866   -0.255   5.121   5.129   [3.34, 6.96]   5.15

X~N(4, 10^2):100, Y~N(0, 1^2):50

  3.493   -0.113   3.606   3.605   [2.40, 4.83]   3.61
  2.963   -0.126   3.089   3.088   [2.04, 4.14]   3.09
  5.785    0.041   5.744   5.745   [4.58, 6.92]   5.75

X~N(5, 7^2):30, Y~N(1, 1^2):20

  4.278    1.262   3.015   3.019   [1.57, 4.48]   3.02
  6.279    0.509   5.771   5.780   [3.99, 7.67]   5.83
  2.633    0.692   1.941   1.936   [0.69, 3.24]   1.96

Comments

The pair of source samples from which the permutations are drawn has a difference of means denoted by D. The procedure allows us to obtain confidence intervals at the 5% significance level for the difference of means, whose centres agree well with D, although this is of course to be expected given the evaluation procedure. One interval (out of 12, marked **) doesn't follow this regularity, in a case where the variance ratio is 1/100.

The CI bounds are those given by the empirical distribution. Summing up, we feel that intra-permutations could be a far-reaching method if properly scrutinized.

Luis A. Afonso

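REM "BF-intra": intra-permutation CI for a difference of means
CLS
REM Population mean, standard deviation and sample size of X and of Y
INPUT " m1 , stdev1 , n1 "; m1, s1, n1
INPUT " m2 , stdev2 , n2 "; m2, s2, n2
REM Source samples X, Y and their working copies XX, YY
DIM X(n1), Y(n2), XX(n1), YY(n2)
REM W(): histogram of E in bins of width 0.01; W1(), W2(): arrival weights
DIM W(8001), W1(n1), W2(n2)
pi = 4 * ATN(1)
REM Number of permutation repetitions
INPUT " How many "; many
REM Seed the random number generator once
RANDOMIZE TIMER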

REM Draw the source samples by the Box-Muller transform
REM (1 - RND lies in (0, 1], so LOG never receives zero)
sumX = 0: sumY = 0
FOR i = 1 TO n1
  a = 1 - RND
  aa = SQR(-2 * LOG(a))
  X(i) = m1 + s1 * aa * COS(2 * pi * RND)
  sumX = sumX + X(i)
NEXT i
FOR i = 1 TO n2
  a = 1 - RND
  aa = SQR(-2 * LOG(a))
  Y(i) = m2 + s2 * aa * COS(2 * pi * RND)
  sumY = sumY + Y(i)
NEXT i
REM Arrival weights W(j) = j / (n * (n + 1) / 2); each set sums to 1
FOR i = 1 TO n1
  W1(i) = i / (.5 * n1 * (n1 + 1))
NEXT i
FOR i = 1 TO n2
  W2(i) = i / (.5 * n2 * (n2 + 1))
NEXT i
REM Sample means U, V and their difference D
U = sumX / n1: V = sumY / n2
LOCATE 9, 33
PRINT USING "##.### "; U; V; U - V

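REM Permutation loop: each pass draws the items of each sample in random order
FOR j = 1 TO many
  LOCATE 10, 48: PRINT USING "#######"; many - j
  FOR ii = 1 TO n1: XX(ii) = X(ii): NEXT ii
  FOR ii = 1 TO n2: YY(ii) = Y(ii): NEXT ii
  sumXX = 0: Ex = 0: sumYY = 0: Ey = 0
  REM Draw n1 - 1 items of X without replacement; 7777777 marks a used item
  FOR k = 1 TO n1 - 1
    DO
      g = INT(RND * n1) + 1
    LOOP WHILE XX(g) = 7777777
    sumXX = sumXX + XX(g)
    Ex = Ex + W1(k) * XX(g)
    XX(g) = 7777777
  NEXT k
  REM The last item of X is what remains of the sample total
  remain = sumX - sumXX
  Ex = Ex + W1(n1) * remain
  REM Same draw for Y
  FOR k = 1 TO n2 - 1
    DO
      g = INT(RND * n2) + 1
    LOOP WHILE YY(g) = 7777777
    sumYY = sumYY + YY(g)
    Ey = Ey + W2(k) * YY(g)
    YY(g) = 7777777
  NEXT k
  remain = sumY - sumYY
  Ey = Ey + W2(n2) * remain
  E = Ex - Ey
  REM A negative E is neither binned nor averaged (W() has no negative bins),
  REM while the quantile thresholds still count every repetition, so the
  REM bounds are reliable only when E is rarely negative
  IF E >= 0 THEN
    idx = INT(100 * E) + 1
    IF idx <= 8001 THEN W(idx) = W(idx) + 1
    md = md + E / many
  END IF
NEXT j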

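REM Cumulate the histogram up to 2.5% and 97.5% of the repetitions
c(1) = .025 * many
c(2) = .975 * many
FOR cc = 1 TO 2
  su = 0
  FOR t = 0 TO 8000
    su = su + W(t)
    IF su > c(cc) THEN EXIT FOR
  NEXT t
  REM tw(): bound read off the bin index; rr(): cumulative fraction reached
  tw(cc) = t / 100: rr(cc) = su / many
NEXT cc
REM Print the mean D*, then the interval with the attained fractions
LOCATE 10, 40: COLOR 14
PRINT USING " ##.### "; md
LOCATE 13, 20
PRINT USING " [ ##.## (#.###)"; tw(1); rr(1);
PRINT USING " ##.## (#.###) ]"; tw(2); rr(2)
COLOR 7
END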
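The last part of the listing reads the interval bounds off a histogram with bins of width 0.01. A minimal Python sketch of that lookup follows, written in Python rather than QBasic so it can be tried quickly. The name `histogram_quantile` and the parameters `width`, `low` and `bins` are hypothetical; the `low` offset is an addition of this sketch that lets negative values be binned, which the listing does not do.

```python
import math


def histogram_quantile(values, prob, width=0.01, low=0.0, bins=8000):
    """Upper edge of the first bin where the running count exceeds prob * len(values)."""
    counts = [0] * bins
    for value in values:
        idx = math.floor((value - low) / width)
        if 0 <= idx < bins:  # values outside the histogram range are not counted
            counts[idx] += 1
    target = prob * len(values)  # the threshold counts every value, binned or not
    running = 0
    for idx, count in enumerate(counts):
        running += count
        if running > target:
            return low + (idx + 1) * width
    return low + bins * width  # threshold never reached inside the range


if __name__ == "__main__":
    data = [i / 100 + 0.005 for i in range(100)]  # one value per bin, 0.005 .. 0.995
    print(histogram_quantile(data, 0.025), histogram_quantile(data, 0.975))
```

Choosing `low` below the smallest simulated difference keeps every value inside the histogram, so the running count and the threshold refer to the same set of repetitions.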