Luis A. Afonso
Posts: 4,758
From: Lisbon (Portugal)
Registered: 2/16/05


Two-samples difference: semiamplitude vs. sizes
Posted: Aug 28, 2013 1:42 PM


Based on the test

    t = [Xhat - Yhat - d] / (sqrt(ssd) * k)

where
    Xhat, Yhat = sample mean values,
    d´ = Xhat - Yhat,
    ssd = ssdX + ssdY, the sum of squared deviations,
    k = sqrt((1/nX + 1/nY) / (nX + nY - 2)), with nX, nY the sample sizes,

we can obtain a CI (confidence interval) for the difference d of the Population means, provided the Populations are normal and independent and the standard deviations, sigma, are equal. In fact d is bounded, with probability 1 - alpha, by:

    d´ + sqrt(ssd)*k*t(alpha/2) <= d <= d´ + sqrt(ssd)*k*t(1 - alpha/2)

t( ) following a Student T distribution with nX + nY - 2 degrees of freedom (df).

Program <GUNDER>

This routine aims to illustrate two facts (for a total of n = 30 items distributed among the samples):

____1. The total sum of squared deviations, ssd = ssdX + ssdY, follows a Chi-squared distribution with n - 2 df (for sigma = 1, as in the simulation).
____2. The CI semiamplitude, h, for the difference d between the Population mean values can be given by the product
           h = sqrt(ssd) * k * t(1 - alpha/2, nX + nY - 2).
       The first factor is a sample variable, the second depends on nX and nY, and the last one is a constant as long as the total is fixed.

Let nX + nY = 30. For different ways of distributing the 30 items between the two samples, the simulated 2.5% and 97.5% quantiles (cumulative frequency in parentheses) are:

Values, nX + nY = 30

nX = nY = 15         k = 0.06901
    ssd    16.80 (0.0250)    46.99 (0.9751)
    h       0.603 (0.0254)    1.008 (0.9752)

nX = 10, nY = 20     k = 0.07319
    ssd    16.80 (0.0251)    47.00 (0.9750)
    h       0.639 (0.0251)    1.069 (0.9750)

nX = 5, nY = 25      k = 0.09258
    ssd    16.80 (0.0251)    46.98 (0.9751)
    h       0.809 (0.0254)    1.352 (0.9751)

The h values follow from the ssd quantiles, e.g. sqrt(16.80) * 0.06901 * 2.13125 = 0.603, where 2.13125 is the t value set in the routine.
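The interval described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (k_factor, sum_sq_dev, diff_ci) are hypothetical, the toy data are invented, and the t quantile is passed in by the caller because the Python standard library has no Student t quantile function.

```python
import math


def k_factor(n_x, n_y):
    """k = sqrt((1/nX + 1/nY) / (nX + nY - 2)), as defined in the post."""
    return math.sqrt((1.0 / n_x + 1.0 / n_y) / (n_x + n_y - 2))


def sum_sq_dev(sample):
    """Sum of squared deviations of a sample about its own mean."""
    mean = sum(sample) / len(sample)
    return sum((v - mean) ** 2 for v in sample)


def diff_ci(x, y, t_quantile):
    """Return (d_obs, h, lower, upper) for the difference of Population means.

    t_quantile is t(1 - alpha/2) with len(x) + len(y) - 2 df. It is supplied
    by the caller (an assumption of this sketch). Because the t distribution
    is symmetric, t(alpha/2) = -t(1 - alpha/2), so the interval is d_obs -/+ h.
    """
    d_obs = sum(x) / len(x) - sum(y) / len(y)
    ssd = sum_sq_dev(x) + sum_sq_dev(y)
    h = math.sqrt(ssd) * k_factor(len(x), len(y)) * t_quantile
    return d_obs, h, d_obs - h, d_obs + h


if __name__ == "__main__":
    # Toy data, invented for illustration; 2.776 is t(0.975) with 4 df.
    x = [1.0, 2.0, 3.0]
    y = [2.0, 4.0, 6.0]
    d_obs, h, lower, upper = diff_ci(x, y, 2.776)
    print(f"d' = {d_obs:.3f}, h = {h:.3f}, CI = [{lower:.3f}, {upper:.3f}]")
```

Keeping k in its own function makes it easy to see that only this factor changes when the same total is split differently between the two samples.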
It's evident that (in accordance with Theory) the quantity ssd doesn't vary with the chosen sizes; on the contrary, k does.

Note

With nX = nY = n = 15 or more, k = 1/sqrt(n*(n - 1)) is close to 1/n (error about 3.5% or less), so the CI can be written approximately, with t = t(1 - alpha/2), as

    d´ - sqrt(ssd)*t/n <= d <= d´ + sqrt(ssd)*t/n

h, the semiamplitude of the CI for the difference of the Population means, is then

    h = (t/n) * sqrt(ssd)

and depends exclusively on the common variance, estimated by either ssdX/(nX - 1) or ssdY/(nY - 1). With t = 2.131450 and n = 15 we have h = (2.131450/15) * sqrt(ssd) = 0.142 * sqrt(ssd). Going to the 0.025, 0.975 quantiles of the Chi2 with 28 df we get ssd in [15.3, 44.5] and finally h in [0.56, 0.95].

In terms of experiment planning, the fifty-fifty strategy is the best one when we intend to obtain the narrowest semi-interval of no significance between the two means.

Calculating k:

    nX    nY    k
    26     4    0.10150
    24     6    0.08626
    22     8    0.07802
    20    10    0.07319
    18    12    0.07043
    16    14    0.06916
    15    15    0.06901

Even with a sizes ratio as large as 6.5 = 26/4 the CI grows by about 47% relative to equal sample sizes (k = 0.10150 against 0.06901). Although these are trivial results, they could have some illustrative worth, I dare say.

Luis A. Afonso
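The k values listed above depend only on the sample sizes, so they can be recomputed directly. The short Python sketch below does that for a total of 30 items and also reports each k relative to the equal split. The names k_factor and k_table and the choice of 4 as the smallest sample are assumptions made for illustration.

```python
import math


def k_factor(n_x, n_y):
    """k = sqrt((1/nX + 1/nY) / (nX + nY - 2))."""
    return math.sqrt((1.0 / n_x + 1.0 / n_y) / (n_x + n_y - 2))


def k_table(total=30, smallest=4):
    """Rows (nX, nY, k, k / k_equal) for splits of an even total.

    The smaller sample runs from `smallest` up to total // 2.
    """
    k_equal = k_factor(total // 2, total - total // 2)
    rows = []
    for n_y in range(smallest, total // 2 + 1):
        n_x = total - n_y
        k = k_factor(n_x, n_y)
        rows.append((n_x, n_y, k, k / k_equal))
    return rows


if __name__ == "__main__":
    for n_x, n_y, k, ratio in k_table():
        print(f"{n_x:2d} {n_y:2d}  k = {k:.5f}  ratio to equal split = {ratio:.3f}")
```

Since h is proportional to k for a given ssd and t, the last column is also the relative widening of the CI for each unequal split.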
REM "GUNDER"
CLS
DEFDBL A-Z
RANDOMIZE TIMER
PRINT "_________ <GUNDER> _________"
pi = 4 * ATN(1)
INPUT " nX , nY (nX+nY)=30 "; nX, nY
kappa = SQR((1 / nX + 1 / nY) / (nX + nY - 2))
PRINT USING " kappa = ##.##### "; kappa
t0 = 2.13125: REM t(1-alpha/2) value used for the semiamplitude
all = 1000000
DIM x(nX), y(nY)
DIM W(8004): REM frequencies of ssd, bins of 0.01
DIM SM(8004): REM frequencies of the semiamplitude, bins of 0.001
FOR T = 1 TO all
  LOCATE 10, 30
  PRINT USING "########"; all - T
  ssx = 0: ssy = 0: sx = 0: sy = 0
  REM sample X: nX standard normal values (Box-Muller)
  FOR i = 1 TO nX
    aa = SQR(-2 * LOG(RND))
    rr = RND
    x(i) = aa * COS(2 * pi * rr)
    sx = sx + x(i) / nX
    ssx = ssx + x(i) * x(i)
  NEXT i
  REM sample Y: nY standard normal values (Box-Muller)
  FOR i = 1 TO nY
    aa = SQR(-2 * LOG(RND))
    rr = RND
    y(i) = aa * COS(2 * pi * rr)
    sy = sy + y(i) / nY
    ssy = ssy + y(i) * y(i)
  NEXT i
  REM sums of squared deviations about each sample mean
  ss1 = ssx - nX * sx * sx
  ss2 = ssy - nY * sy * sy: ss0 = ss1 + ss2
  ss = INT(100 * ss0 + .5)
  IF ss > 8000 THEN ss = 8000
  W(ss) = W(ss) + 1 / all
  semi = SQR(ss0) * t0 * kappa: REM semiamplitude
  SMI = INT(1000 * semi + .5)
  SM(SMI) = SM(SMI) + 1 / all
NEXT T
REM
LOCATE 10, 20: PRINT " ssd=";
u(1) = .025: u(2) = 1 - u(1)
REM 2.5% and 97.5% quantiles of ssd
FOR uu = 1 TO 2
  sum = 0
  FOR ti = 0 TO 8000
    sum = sum + W(ti)
    IF sum > u(uu) THEN GOTO 7
  NEXT ti
7 LOCATE 10, 10 + uu * 20
  PRINT USING "##.## .#### "; ti / 100; sum;
NEXT uu
LOCATE 11, 10: PRINT "semiamplitude= ";
REM 2.5% and 97.5% quantiles of the semiamplitude (bins are thousandths)
FOR uu = 1 TO 2
  sum = 0
  FOR ti = 0 TO 8000
    sum = sum + SM(ti)
    IF sum > u(uu) THEN GOTO 8
  NEXT ti
8 LOCATE 11, 10 + uu * 20
  PRINT USING "##.### .#### "; ti / 1000; sum;
NEXT uu: END
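For readers without a BASIC interpreter, the same experiment can be sketched in Python using only the standard library. This is not a line-by-line port. It makes three substitutions, all assumptions of the sketch: random.gauss replaces the Box-Muller step, the quantiles are read from the sorted simulated values rather than from frequency bins, and the default number of replications is much smaller than in the routine so that it runs in a few seconds. The function name simulate and its parameters are hypothetical.

```python
import math
import random


def simulate(n_x, n_y, t_quantile, reps=20000, seed=12345):
    """Monte Carlo 2.5% and 97.5% quantiles of ssd and of h = sqrt(ssd)*k*t.

    Both samples are drawn from a standard normal Population (sigma = 1).
    """
    rng = random.Random(seed)
    k = math.sqrt((1.0 / n_x + 1.0 / n_y) / (n_x + n_y - 2))
    ssd_values = []
    for _ in range(reps):
        ssd = 0.0
        for size in (n_x, n_y):
            sample = [rng.gauss(0.0, 1.0) for _ in range(size)]
            mean = sum(sample) / size
            # squared deviations about the sample mean (size - 1 df)
            ssd += sum((v - mean) ** 2 for v in sample)
        ssd_values.append(ssd)
    ssd_values.sort()
    lo = ssd_values[int(0.025 * reps)]
    hi = ssd_values[int(0.975 * reps)]
    # h is an increasing function of ssd, so its quantiles follow directly
    h_lo = math.sqrt(lo) * k * t_quantile
    h_hi = math.sqrt(hi) * k * t_quantile
    return (lo, hi), (h_lo, h_hi)


if __name__ == "__main__":
    # 2.13125 is the t value set in the routine above.
    for n_x, n_y in ((15, 15), (20, 10), (25, 5)):
        (lo, hi), (h_lo, h_hi) = simulate(n_x, n_y, 2.13125)
        print(f"nX={n_x:2d} nY={n_y:2d}  ssd: {lo:6.2f} {hi:6.2f}  h: {h_lo:.3f} {h_hi:.3f}")
```

With sums of squares taken about each sample mean, the ssd quantiles should land near the Chi2 28 df values quoted in the post, [15.3, 44.5], up to simulation noise, whatever the split between nX and nY.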



