Topic: Two-samples difference: semi-amplitude vs. sizes
Replies: 0

 Luis A. Afonso Posts: 4,758 From: Lisbon (Portugal) Registered: 2/16/05
Two-samples difference: semi-amplitude vs. sizes
Posted: Aug 28, 2013 1:42 PM

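Based on the test statistic
___________t = [Xhat - Yhat - d] / (sqrt(ssd) * k)
where
Xhat, Yhat = sample means, d´ = Xhat - Yhat is the observed difference and d the difference of the Population means,
ssd = ssdX + ssdY, the sum of the squared deviations of each sample about its own mean,
k = sqrt((1/nX + 1/nY) / (nX + nY - 2)), with nX, nY the sample sizes,
we can obtain a CI (confidence interval) for the difference d of the Population means, provided the Populations are normal and independent and their standard deviations, sigma, are equal. In fact d is bounded, with probability 1 - alpha, by:
___________d´ + sqrt(ssd)*k*t(alpha/2) <= d <= d´ + sqrt(ssd)*k*t(1-alpha/2)
where t(p) is the p-quantile of a Student t distribution with nX + nY - 2 degrees of freedom (df). Since t(alpha/2) = -t(1-alpha/2), the interval is symmetric about d´, with semi-amplitude h = sqrt(ssd)*k*t(1-alpha/2).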
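As an illustration of the interval above, here is a minimal Python sketch. The function name pooled_ci, the argument t_crit and the toy data are assumptions made for this example, not part of the original routine; the t quantile is passed in by the caller (for instance read from a table) rather than computed.

```python
import math
from statistics import mean

def pooled_ci(x, y, t_crit):
    """CI for the difference of two means, equal-variance normal case.

    t_crit must be the (1 - alpha/2) quantile of Student's t with
    len(x) + len(y) - 2 df, supplied by the caller.
    Returns (lower, upper, h) where h is the semi-amplitude.
    """
    nx, ny = len(x), len(y)
    mx, my = mean(x), mean(y)
    # ssd = ssdX + ssdY: squared deviations about each sample's own mean
    ssd = sum((v - mx) ** 2 for v in x) + sum((v - my) ** 2 for v in y)
    k = math.sqrt((1 / nx + 1 / ny) / (nx + ny - 2))
    h = math.sqrt(ssd) * k * t_crit
    d_obs = mx - my
    return d_obs - h, d_obs + h, h

# Toy data (hypothetical); 2.306 is the tabulated t(0.975) for 8 df.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
lower, upper, h = pooled_ci(x, y, t_crit=2.306)
print(f"d' = {mean(x) - mean(y):.3f}, CI = [{lower:.3f}, {upper:.3f}], h = {h:.3f}")
```

Passing the quantile as an argument keeps the sketch free of third-party libraries; a statistics package could supply it instead.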
Program <GUNDER>
This routine aims to illustrate two facts (for a total of n = 30 items distributed between the two samples, both drawn with sigma = 1):
____1. The total sum of squared deviations, ssd = ssdX + ssdY, follows a Chi-squared distribution with n - 2 df.
____2. The CI semi-amplitude, h, for the difference d between the Population means is given by the product h = sqrt(ssd) * k * t(1-alpha/2, nX + nY - 2). The first factor is a sample variable, the second depends on nX and nY, and the last one is a constant as long as the total is fixed. Let nX + nY = 30. For different ways of distributing the 30 items between the two samples, we have:
__________________________
Values for nX + nY = 30: simulated 0.025 and 0.975 quantiles, with the cumulative frequency actually reached in parentheses
nX = nY = 15_________k = 0.06901
____ssd____16.80 (0.0250)____46.99 (0.9751)
____h_______6.03 (0.0254)____10.08 (0.9752)

nX = 10, nY = 20_____k = 0.07319
____ssd____16.80 (0.0251)____47.00 (0.9750)
____h_______6.39 (0.0251)____10.69 (0.9750)

nX = 5, nY = 25______k = 0.09258
____ssd____16.80 (0.0251)____46.98 (0.9751)
____h_______8.09 (0.0254)____13.52 (0.9751)

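It's evident that, in agreement with theory, the distribution of ssd doesn't vary with the chosen sizes; k, on the contrary, does. Two caveats on reading the table: the ssd quantiles 16.80 and 46.99 coincide with the 0.025 and 0.975 quantiles of a Chi-squared with 30 df (16.79 and 46.98), not 28 df (15.31 and 44.46), which is what results when the squares are summed about zero rather than about the sample means; and the h entries are on a scale ten times that of h = sqrt(ssd)*k*t (for example, the entry 6.03 corresponds to h = 0.603).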
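The two facts can be cross-checked with a short Python simulation. This is only a sketch under stated assumptions, not a port of the routine: the function name simulate, the 20,000 replications, the fixed seed and the hard-coded value 2.0484 for t(0.975, 28 df) are all choices made here. Squares are summed about each sample mean, so the ssd quantiles should land near the Chi-squared 28 df values (about 15.31 and 44.46) whatever the split.

```python
import math
import random

T_975_28DF = 2.0484  # assumed table value: t(0.975) with 28 df

def simulate(nx, ny, reps=20000, seed=1):
    """Return k, the (0.025, 0.975) quantiles of ssd, and those of h."""
    rng = random.Random(seed)
    k = math.sqrt((1 / nx + 1 / ny) / (nx + ny - 2))
    ssd_values = []
    for _ in range(reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(nx)]
        y = [rng.gauss(0.0, 1.0) for _ in range(ny)]
        mx = sum(x) / nx
        my = sum(y) / ny
        # Squared deviations about each sample's own mean: nx + ny - 2 df
        ssd = sum((v - mx) ** 2 for v in x) + sum((v - my) ** 2 for v in y)
        ssd_values.append(ssd)
    ssd_values.sort()
    lo = ssd_values[int(0.025 * reps)]
    hi = ssd_values[int(0.975 * reps)]
    # h is an increasing function of ssd, so its quantiles follow directly
    h_q = tuple(math.sqrt(q) * k * T_975_28DF for q in (lo, hi))
    return k, (lo, hi), h_q

for nx, ny in [(15, 15), (10, 20), (5, 25)]:
    k, ssd_q, h_q = simulate(nx, ny)
    print(f"nX={nx:2d} nY={ny:2d} k={k:.5f} "
          f"ssd=[{ssd_q[0]:.2f}, {ssd_q[1]:.2f}] h=[{h_q[0]:.3f}, {h_q[1]:.3f}]")
```

With this few replications the quantiles are only good to a few tenths for ssd; raising reps tightens them at the cost of run time.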
Note
With nX = nY = n = 15 or more, k = 1/sqrt(n(n-1)) is approximately 1/n (the exact k is larger by about 3.5% at n = 15, and by less for larger n), so with t = t(1-alpha/2) the CI can be written approximately as
___________d´ - sqrt(ssd)*t/n <= d <= d´ + sqrt(ssd)*t/n
The semi-amplitude h of the CI for the difference of the Population means is then h = (t/n)*sqrt(ssd), and it depends on the data only through ssd, that is, through the estimate ssd/(2n-2) of the common variance. For n = 15 and alpha = 0.05 we have t(0.975, 28 df) = 2.0484, so h = (2.0484/15)*sqrt(ssd) = 0.137*sqrt(ssd). Taking the 0.025 and 0.975 quantiles of the Chi-squared with 28 df, [15.3, 44.5], we finally get h = [0.53, 0.91].
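How good the 1/n approximation to k is for equal sample sizes can be checked directly. The sketch below uses helper names (k_exact, excess) invented for this example and simply compares the exact k with 1/n for a few values of n.

```python
import math

def k_exact(n):
    """Exact k for nX = nY = n, i.e. 1 / sqrt(n * (n - 1))."""
    return math.sqrt((1 / n + 1 / n) / (2 * n - 2))

def excess(n):
    """Relative amount by which the exact k exceeds the approximation 1/n."""
    return k_exact(n) * n - 1.0

for n in (15, 20, 30, 50):
    print(f"n={n:3d}  k={k_exact(n):.5f}  1/n={1 / n:.5f}  excess={100 * excess(n):.1f}%")
```

The excess equals sqrt(n/(n-1)) - 1, so it shrinks steadily as n grows.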
In terms of experiment planning, splitting the items fifty-fifty is the best strategy when we want the narrowest CI for the difference between the two means, because for a fixed total k is smallest when nX = nY. Calculating k for nX + nY = 30:
__nX-nY___k__________nX-nY___k__________nX-nY___k
__26-4____0.10150_____24-6____0.08626_____22-8____0.07802
__20-10___0.07319_____18-12___0.07043_____16-14___0.06916
__15-15___0.06901

__With a size ratio as large as 6.5 = 26/4, the CI is about 47% wider than with equal sample sizes (0.10150/0.06901 = 1.47).
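The k values listed above, and the widening of the CI relative to the equal split, can be recomputed with a few lines of Python. The names k_factor, TOTAL and table are assumptions of this sketch.

```python
import math

def k_factor(nx, ny):
    """k = sqrt((1/nX + 1/nY) / (nX + nY - 2))."""
    return math.sqrt((1 / nx + 1 / ny) / (nx + ny - 2))

TOTAL = 30
k_equal = k_factor(TOTAL // 2, TOTAL // 2)

# Same splits as in the post, from the most unbalanced to the equal one
table = {}
for nx in (26, 24, 22, 20, 18, 16, 15):
    ny = TOTAL - nx
    table[(nx, ny)] = k_factor(nx, ny)
    ratio = table[(nx, ny)] / k_equal
    print(f"{nx:2d}-{ny:<2d}  k={table[(nx, ny)]:.5f}  width vs 15-15: {ratio:.3f}")
```

Since h is proportional to k for a given ssd and t, the printed ratio is also the ratio of CI widths.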
Although these results are trivial, I dare say they have some illustrative worth.
Luis A. Afonso

REM "GUNDER"
CLS
DEFDBL A-Z
RANDOMIZE TIMER
PRINT "_________ <GUNDER> _________"
pi = 4 * ATN(1)
INPUT " nX , nY (nX+nY)=30 "; nX, nY
kappa = SQR((1 / nX + 1 / nY) / (nX + nY - 2))
PRINT USING " kappa = ##.##### "; kappa
t0 = 2.04841: REM t(0.975) for nX + nY - 2 = 28 df
all = 1000000
DIM x(nX), y(nY)
DIM W(8004)
DIM SM(8004)
FOR T = 1 TO all
LOCATE 10, 30
PRINT USING "########"; all - T
ssx = 0: ssy = 0: sx = 0: sy = 0
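REM Box-Muller standard normals; 1 - RND lies in (0, 1], so LOG never receives 0
FOR i = 1 TO nX
aa = SQR(-2 * LOG(1 - RND))
rr = RND
x(i) = aa * COS(2 * pi * rr)
sx = sx + x(i) / nX: REM running mean of X
ssx = ssx + x(i) * x(i): REM raw sum of squares of X
NEXT i
FOR i = 1 TO nY
aa = SQR(-2 * LOG(1 - RND))
rr = RND
y(i) = aa * COS(2 * pi * rr)
sy = sy + y(i) / nY: REM running mean of Y
ssy = ssy + y(i) * y(i): REM raw sum of squares of Y
NEXT i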
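REM subtract the mean correction with each sample's own size (n was never assigned)
ss1 = ssx - nX * sx * sx: REM squared deviations about the X mean, nX - 1 df
ss2 = ssy - nY * sy * sy: REM squared deviations about the Y mean, nY - 1 df
ss0 = ss1 + ss2: REM ssd, nX + nY - 2 df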
ss = INT(100 * ss0 + .5)
IF ss > 8000 THEN ss = 8000
W(ss) = W(ss) + 1 / all
semi = SQR(ss0) * t0 * kappa : REM semi-amplitude
SMI = INT(1000 * semi + .5)
SM(SMI) = SM(SMI) + 1 / all
NEXT T
REM
LOCATE 10, 20: PRINT " ssd=";
u(1) = .025: u(2) = 1 - u(1)
FOR uu = 1 TO 2
sum = 0
FOR ti = 0 TO 8000
sum = sum + W(ti)
IF sum > u(uu) THEN GOTO 7
NEXT ti
7 LOCATE 10, 10 + uu * 20
PRINT USING "##.## .#### "; ti / 100; sum;
NEXT uu
LOCATE 11, 10: PRINT "semi-amplitude= ";
FOR uu = 1 TO 2
sum = 0
FOR ti = 0 TO 8000
sum = sum + SM(ti)
IF sum > u(uu) THEN GOTO 8
NEXT ti
8 LOCATE 11, 10 + uu * 20
REM SM() is binned in thousandths (SMI = INT(1000 * semi + .5)), so divide by 1000
PRINT USING "##.### .#### "; ti / 1000; sum;
NEXT uu : END