

Luis A. Afonso
Posts:
4,617
From:
Lisbon (Portugal)
Registered:
2/16/05


Difference of Two Population means Reproducibility
Posted:
Dec 11, 2012 5:29 PM


Difference of Two Population means Reproducibility (Repeat detection suitability)
Introduction: Depending on the conditions, a NHST may or may not detect a difference between two normal Populations that share the same standard deviation (sigma) but have different mean values. Simulations (Box-Muller) were run to illustrate this well-known fact.
____n = common sample size: nX = nY
____0.5, 1.0, 1.5, 2.0 = values of muX, with muY = 0
____% = percentage of simulated sample pairs (out of 40,000) in which the difference of sample means, Xhat - Yhat, falls outside the interval [0, bound], on the right. In those cases the 95% interval for the difference does not include 0.
__bound = T(.975, df = nX+nY-2) multiplied by sqrt[ (ssdX+ssdY)/(nX+nY-2) * (1/nX+1/nY) ].
ssdX = sum of squared deviations of sample X, of size nX, simulated as if drawn from X~N(muX, sigmaX) (similarly for Y~N(muY, sigmaY)). The sample sizes were set to 10, 15, 20, 25, and 30, with T = 2.101, 2.048, 2.024, 2.011, and 2.002 respectively.
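As a rough illustration of the bound defined above, here is a minimal Python sketch (not the author's program, which is in BASIC). The function name detection_bound and the two small data lists are made up for the example; only the formula comes from the post.

```python
import math

def detection_bound(x, y, t_crit):
    """Return T * sqrt[(ssdX + ssdY)/(nX + nY - 2) * (1/nX + 1/nY)]."""
    n_x, n_y = len(x), len(y)
    mean_x = sum(x) / n_x
    mean_y = sum(y) / n_y
    # Sums of squared deviations about each sample mean
    ssd_x = sum((v - mean_x) ** 2 for v in x)
    ssd_y = sum((v - mean_y) ** 2 for v in y)
    pooled_var = (ssd_x + ssd_y) / (n_x + n_y - 2)
    return t_crit * math.sqrt(pooled_var * (1 / n_x + 1 / n_y))

# Hypothetical data, for illustration only (not taken from the post)
x = [1.0, 2.0, 3.0, 4.0]
y = [0.0, 1.0, 2.0, 3.0]
t_crit = 2.447  # T(.975, df = 4 + 4 - 2 = 6)

bound = detection_bound(x, y, t_crit)
diff = sum(x) / len(x) - sum(y) / len(y)
detected = diff > bound  # True only when Xhat - Yhat exceeds the bound
print(f"difference = {diff:.3f}, bound = {bound:.3f}, detected = {detected}")
```

With samples this small the bound is wide, so a mean difference of 1 is not flagged, which is the same effect the simulation quantifies.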
Results:
Percentage of 40,000 simulated sample pairs in which the difference was detected

sigmaX = sigmaY = 1
n (nX=nY)   T (df)        muX=0.5   muX=1.0   muX=1.5   muX=2.0
10          2.101 (18)    18.3%     56.4%     88.7%     98.9%
15          2.048 (28)    26.3%     75.8%     97.8%     100.0%
20          2.024 (38)    33.5%     86.9%     99.6%     100.0%
25          2.011 (48)    41.0%     93.5%     100.0%    100.0%
30          2.002 (58)    47.7%     96.8%     100.0%    100.0%

sigmaX = sigmaY = 2
n (nX=nY)   T (df)        muX=0.5   muX=1.0   muX=1.5   muX=2.0
30          2.002 (58)    15.3%     47.0%     81.2%     96.7%
Large sample sizes and small Population standard deviations allow small mean differences to be detected. In contrast, when the sizes are small and the standard deviations large, only large differences are detected almost every time: see the last two examples (as everyone is aware...).
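The role of sample size and standard deviation can be checked roughly with a normal approximation: the detection rate is about P(Z > T - delta / (sigma * sqrt(2/n))). This approximation is an assumption added here, not part of the original post; it ignores the sampling variability of the pooled standard deviation, so it only gives ballpark figures. The function name approx_detection_rate is hypothetical.

```python
import math
from statistics import NormalDist

def approx_detection_rate(delta, sigma, n, t_crit):
    """Normal approximation to the detection rate for equal sample sizes n.

    delta = muX - muY, sigma = common standard deviation,
    t_crit = T(.975, df = 2n - 2). Rough check only.
    """
    shift = delta / (sigma * math.sqrt(2.0 / n))  # standardized mean difference
    return 1.0 - NormalDist().cdf(t_crit - shift)

# n = 30, muX = 0.5: compare sigma = 1 with sigma = 2
rate_sigma1 = approx_detection_rate(0.5, 1.0, 30, 2.002)
rate_sigma2 = approx_detection_rate(0.5, 2.0, 30, 2.002)
print(f"sigma=1: {100 * rate_sigma1:.1f} %   sigma=2: {100 * rate_sigma2:.1f} %")
```

The two values come out at about 47% and 15%, close to the simulated 47.7% and 15.3% for n = 30. Doubling sigma halves the standardized difference, which is why the rate drops so sharply.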
Luis A. Afonso
REM "Tduo"
CLS
DEFDBL A-Z
pi = 4 * ATN(1)
PRINT
PRINT " *********** Tduo:********************"
PRINT " _________________________________________________ "
PRINT " _________________________________________________ "
PRINT
PRINT " X~N(muX,1):nX vs Y~N(0,1):nY difference of means "
PRINT " without previous variance selection by F test "
PRINT " AIM: To find delta significance frequency "
PRINT " d=(Xhat-Yhat)/s"
PRINT " s^2 = [(ssdX+ssdY)/(nX+nY-2)]*(1/nX+1/nY) "
PRINT " T(.975,nX+nY-2)*s < = Xhat-Yhat (right tail) "
PRINT " _________________________________________________ "
PRINT " _________________________________________________ "
PRINT
INPUT " nX, nY "; nX, nY
REM n = (1/nX + 1/nY) / df, so that s^2 = ssd * n
n = (1 / nX + 1 / nY) / (nX + nY - 2)
INPUT " T(.975,nX+nY-2) "; T
muY = 0
REM ali = number of simulated sample pairs per value of muX
INPUT " ALL "; ali
REM
FOR muX = .5 TO 2 STEP .5
  RANDOMIZE TIMER
  FOR j = 1 TO ali
    sX = 0: sxx = 0: sy = 0: syy = 0
    REM DATA
    REM Box-Muller draws for sample X; retry when RND is too close to 0
    FOR i = 1 TO nX
1     aa = RND
      IF aa < 1E-15 THEN GOTO 1
      aa = SQR(-2 * LOG(aa))
      bb = RND
      x = muX + aa * COS(2 * pi * bb)
      sX = sX + x / nX: sxx = sxx + x * x
    NEXT i
    REM Box-Muller draws for sample Y
    FOR i = 1 TO nY
2     aa = RND
      IF aa < 1E-15 THEN GOTO 2
      aa = SQR(-2 * LOG(aa))
      bb = RND
      y = muY + aa * COS(2 * pi * bb)
      sy = sy + y / nY: syy = syy + y * y
    NEXT i
    xhat = sX: yhat = sy
    ssdX = sxx - nX * xhat * xhat
    ssdY = syy - nY * yhat * yhat
    ssd = ssdX + ssdY
    REM R = bound; count the pair when Xhat-Yhat exceeds it
    R = T * SQR(ssd * n)
    dd = (xhat - yhat) - R
    IF dd > 0 THEN inn = inn + 1 / ali
  NEXT j
  PRINT USING "##.# ###.# % "; muX; inn * 100;
  inn = 0
NEXT muX
END
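For readers without a BASIC interpreter, the same simulation can be sketched in Python. This is a loose port, not the author's code: the function names, the fixed seed, and the reduced number of repetitions (4,000 instead of 40,000) are assumptions made so the example runs quickly and repeatably.

```python
import math
import random

def box_muller(rng, mu):
    """One N(mu, 1) draw by the Box-Muller transform (cosine branch only)."""
    u = rng.random()
    while u < 1e-15:  # avoid log(0), as the BASIC listing does
        u = rng.random()
    radius = math.sqrt(-2.0 * math.log(u))
    return mu + radius * math.cos(2.0 * math.pi * rng.random())

def detection_rate(mu_x, n_x, n_y, t_crit, reps, rng):
    """Fraction of simulated sample pairs with Xhat - Yhat above the bound."""
    factor = (1 / n_x + 1 / n_y) / (n_x + n_y - 2)
    hits = 0
    for _ in range(reps):
        x = [box_muller(rng, mu_x) for _ in range(n_x)]
        y = [box_muller(rng, 0.0) for _ in range(n_y)]  # muY = 0
        mean_x = sum(x) / n_x
        mean_y = sum(y) / n_y
        ssd = (sum((v - mean_x) ** 2 for v in x)
               + sum((v - mean_y) ** 2 for v in y))
        bound = t_crit * math.sqrt(ssd * factor)
        if (mean_x - mean_y) - bound > 0:
            hits += 1
    return hits / reps

rng = random.Random(12345)  # fixed seed: an assumption, for repeatability
rate = detection_rate(1.0, 10, 10, 2.101, 4000, rng)
print(f"muX = 1.0, n = 10: {100 * rate:.1f} %")
```

With fewer repetitions the estimate is noisier than the table above, but for muX = 1.0 and n = 10 it should land near the 56.4% reported there.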



