Topic: Difference of Two Population means Reproducibility
Replies: 0

 Luis A. Afonso Posts: 4,758 From: Lisbon (Portugal) Registered: 2/16/05
Difference of Two Population means Reproducibility
Posted: Dec 11, 2012 5:29 PM

Difference of Two Population means Reproducibility (Repeat detection suitability)

Introduction: whether a null hypothesis significance test (NHST) finds a difference between the means of two normal Populations with the same standard deviation, sigma, depends on the conditions: the sample sizes, sigma, and the size of the true difference. Simulations (Box-Muller) were run to illustrate this well-known fact.
____n = common sample size: nX = nY.
____muX = 0.5, 1.0, 1.5, 2.0; muY = 0.
____Reported: the % of simulated sample pairs (out of 40,000) in which the difference of sample means, Xhat - Yhat, lies outside the 95% interval [0, left bound], i.e. exceeds the left bound. The 95% confidence interval for muX - muY then does not include 0.
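
The Box-Muller step behind these simulations can be sketched in Python as follows. This is only a minimal illustration, not the program that produced the results: the function name `box_muller`, the seed, and the 20,000 draws are assumptions made for the example.

```python
import math
import random

def box_muller(mu, sigma, rng):
    """Draw one N(mu, sigma) value with the basic Box-Muller transform."""
    u1 = rng.random()
    while u1 < 1e-15:              # reject values too close to 0 to avoid log(0)
        u1 = rng.random()
    u2 = rng.random()
    z = math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)
    return mu + sigma * z

rng = random.Random(12345)         # arbitrary seed, used only for repeatability
sample = [box_muller(0.5, 1.0, rng) for _ in range(20000)]
mean = sum(sample) / len(sample)
var = sum((v - mean) ** 2 for v in sample) / (len(sample) - 1)
print(mean, var)                   # should be near 0.5 and 1.0
```

Only the cosine branch of the transform is used, so each normal value costs two uniform draws.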

__left bound = T(.975, df= nX+nY-2) multiplied by
sqrt[ (ssdX+ssdY)/(nX+ nY -2) * (1/nX+1/nY) ].

ssdX = sum of squared deviations from the sample mean for sample X, of size nX, simulated as if drawn from X~N(muX, sigmaX)
(similarly for Y~N(muY, sigmaY)).
The common sample sizes were set to 10, 15, 20, 25, 30, with T = 2.101, 2.048, 2.024, 2.011, 2.002 respectively.
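
As a hedged sketch of the formula above, the bound can be computed for a single pair of samples like this. The function names `critical_bound` and `detected`, and the tiny data set, are hypothetical and serve only to show the arithmetic; the value of T must be supplied from a table, as in the post.

```python
import math

def critical_bound(x, y, t_crit):
    """T * sqrt[(ssdX + ssdY)/(nX + nY - 2) * (1/nX + 1/nY)]."""
    nx, ny = len(x), len(y)
    mean_x = sum(x) / nx
    mean_y = sum(y) / ny
    ssd_x = sum((v - mean_x) ** 2 for v in x)   # sum of squared deviations, X
    ssd_y = sum((v - mean_y) ** 2 for v in y)   # sum of squared deviations, Y
    pooled_var = (ssd_x + ssd_y) / (nx + ny - 2)
    return t_crit * math.sqrt(pooled_var * (1.0 / nx + 1.0 / ny))

def detected(x, y, t_crit):
    """True when the difference of sample means exceeds the bound."""
    diff = sum(x) / len(x) - sum(y) / len(y)
    return diff > critical_bound(x, y, t_crit)

# Illustrative data only: nX = nY = 3, so df = 4 and T(.975, 4) = 2.776.
x = [1.0, 2.0, 3.0]
y = [0.0, 1.0, 2.0]
print(critical_bound(x, y, 2.776), detected(x, y, 2.776))
```

Here the pooled variance is 1 and the difference of means is 1, which is below the bound, so this pair would not count as a detection.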

Results:

Percentage of detections out of 40,000 simulated sample pairs, nX = nY = n

sigmaX = sigmaY = 1

   n    T (df)        muX=0.5   muX=1.0   muX=1.5   muX=2.0
  10    2.101 (18)     18.3%     56.4%     88.7%     98.9%
  15    2.048 (28)     26.3%     75.8%     97.8%    100.0%
  20    2.024 (38)     33.5%     86.9%     99.6%    100.0%
  25    2.011 (48)     41.0%     93.5%    100.0%    100.0%
  30    2.002 (58)     47.7%     96.8%    100.0%    100.0%

sigmaX = sigmaY = 2

   n    T (df)        muX=0.5   muX=1.0   muX=1.5   muX=2.0
  30    2.002 (58)     15.3%     47.0%     81.2%     96.7%

Large sample sizes and small Population standard deviations allow small mean differences to be detected; in contrast, when the sizes are small and the standard deviation is large, only large differences are detected almost every time: compare the last two examples (n=30 with sigma=1 and with sigma=2).
(As everyone is aware...)
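
The dependence on n and sigma can also be seen from a rough large-sample approximation, sketched below. This is not the method used in the post: it swaps Student's t for the normal distribution and treats sigma as known, so it is expected to run slightly higher than the simulated percentages, especially for small n. The names `normal_cdf` and `approx_detection_rate` are assumptions for the example.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def approx_detection_rate(delta, sigma, n, z_crit=1.959964):
    """Approximate P(Xhat - Yhat > bound) for nX = nY = n.

    Normal approximation: the standard error of the difference of
    means is sigma * sqrt(2/n), and z_crit replaces T(.975, df).
    """
    se = sigma * math.sqrt(2.0 / n)
    return normal_cdf(delta / se - z_crit)

for n in (10, 15, 20, 25, 30):
    row = [approx_detection_rate(d, 1.0, n) for d in (0.5, 1.0, 1.5, 2.0)]
    print(n, " ".join(f"{100 * p:6.1f}" for p in row))
```

In this approximation only the ratio delta/sigma and n matter, which is why doubling sigma has the same effect as halving the mean difference.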

Luis A. Afonso

REM "Tduo"
CLS
DEFDBL A-Z
pi = 4 * ATN(1)
PRINT
PRINT " *********** Tduo:********************"
PRINT " _________________________________________________ "
PRINT " _________________________________________________ "
PRINT
PRINT " X~N(muX,1):nX vs Y~N(0,1):nY difference of means "
PRINT " without previous variance selection by F test "
PRINT " AIM: To find delta significance frequency "
PRINT " d=(Xhat-Yhat)/s"
PRINT " s^2 = [(ssdX+ssdY)/(nX+nY-2)]*(1/nX+1/nY) "
PRINT " T(.975,nX+nY-2)*s < = Xhat-Yhat (right tail) "
PRINT " _________________________________________________ "
PRINT " _________________________________________________ "
PRINT
INPUT " nX, nY "; nX, nY
REM n is the factor that multiplies ssd to give s^2
n = (1 / nX + 1 / nY) / (nX + nY - 2)
INPUT " T(.975,nX+nY-2) "; T
muY = 0
INPUT " ALL "; ali
REM loop over the true mean of X (muY stays 0); ali = number of simulated sample pairs
FOR muX = .5 TO 2 STEP .5
RANDOMIZE TIMER
FOR j = 1 TO ali
sX = 0: sxx = 0: sy = 0: syy = 0
REM DATA: draw each sample by Box-Muller (sigma = 1), accumulating the mean and the sum of squares
FOR i = 1 TO nX
1 aa = RND
IF aa < 1E-15 THEN GOTO 1
aa = SQR(-2 * LOG(aa))
bb = RND
x = muX + aa * COS(2 * pi * bb)
sX = sX + x / nX: sxx = sxx + x * x
NEXT i
FOR i = 1 TO nY
2 aa = RND
IF aa < 1E-15 THEN GOTO 2
aa = SQR(-2 * LOG(aa))
bb = RND
y = muY + aa * COS(2 * pi * bb)
sy = sy + y / nY: syy = syy + y * y
NEXT i
xhat = sX: yhat = sy
ssdX = sxx - nX * xhat * xhat
ssdY = syy - nY * yhat * yhat
ssd = ssdX + ssdY
REM R = T*s; count the pair when xhat - yhat exceeds R
R = T * SQR(ssd * n)
dd = (xhat - yhat) - R
IF dd > 0 THEN inn = inn + 1 / ali
NEXT j
PRINT USING "##.# ###.# % "; muX; inn * 100;
inn = 0
NEXT muX
END
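
For readers without a BASIC interpreter, the same simulation might be sketched in Python as below. This is an approximate port under stated assumptions, not the author's program: it uses `random.gauss` from the standard library instead of the hand-written Box-Muller step, and the function name `detection_rate`, the seed, and the 2,000 repetitions (instead of 40,000) are choices made only to keep the example quick.

```python
import math
import random

def detection_rate(mu_x, n, t_crit, reps, rng, mu_y=0.0, sigma=1.0):
    """Fraction of simulated sample pairs with xhat - yhat > t_crit * s."""
    factor = (1.0 / n + 1.0 / n) / (n + n - 2)   # plays the role of n in Tduo
    hits = 0
    for _ in range(reps):
        x = [rng.gauss(mu_x, sigma) for _ in range(n)]
        y = [rng.gauss(mu_y, sigma) for _ in range(n)]
        xhat = sum(x) / n
        yhat = sum(y) / n
        ssd = sum((v - xhat) ** 2 for v in x) + sum((v - yhat) ** 2 for v in y)
        bound = t_crit * math.sqrt(ssd * factor)
        if xhat - yhat > bound:
            hits += 1
    return hits / reps

rng = random.Random(2012)                        # arbitrary seed
for mu_x in (0.5, 1.0, 1.5, 2.0):
    rate = detection_rate(mu_x, n=10, t_crit=2.101, reps=2000, rng=rng)
    print(f"{mu_x:4.1f} {100 * rate:6.1f} %")
```

With so few repetitions the percentages will fluctuate by a point or two around the values reported for n=10; raising `reps` narrows that spread.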

© The Math Forum at NCTM 1994-2018. All Rights Reserved.