Few people are comfortable with finding out that two normal samples originated from populations whose mean values are not significantly different. Note that this designation is far preferable to "equality of means", because exact equality, if it happens at all, is very rare, and even when it occurs we are unable to state that it really exists. In fact we are more interested in estimating the difference d = muX - muY.

Formally, the confidence interval W = [u, v] for the difference of population means allows the following statements, each with probability 1 - alpha: if W contains zero, we cannot assure d > 0; if v < d, the population difference does not reach d; if u > d, the difference is at least equal to d. (In the last two cases d stands for a given reference value.)

We are dealing with the difference of two normal samples with the same variance: X = N(a, sigma) of size n and Y = N(0, sigma) of size m. The standard textbook statistic for the test is

    t = (xbar - ybar - d) / (Sp * sqrt(1/n + 1/m))
    Sp = sqrt[((n-1)*sX2 + (m-1)*sY2) / (n+m-2)]

The 95% confidence interval for d is

    u = (xbar - ybar) + t(.025, n + m - 2) * Sp * sqrt(1/n + 1/m)
    v = (xbar - ybar) + t(.975, n + m - 2) * Sp * sqrt(1/n + 1/m)

where, since (n-1)*sX2 = ssdX and (m-1)*sY2 = ssdY,

    Sp^2 * (1/n + 1/m) = [(ssdX + ssdY) / (n + m - 2)] * (1/n + 1/m)

Now we can separate the total sum of squared deviations, ssd = ssdX + ssdY, a sample quantity such that ssd/sigma^2 follows a Chi-square distribution with n + m - 2 degrees of freedom, from the constant

    C(n, m) = (1/n + 1/m) / (n + m - 2)

With no loss of generality let m = k*n, k <= 1, so that Y is the smaller sample (or of equal size). Let us focus on

    f(n; k) = (1/n + 1/(k*n)) / (n + k*n - 2) = [(1/n) * (1 + 1/k)] / [n * (1 + k - 2/n)]
    f(n; k=1) = C(n, n) = (1/n^2) * (1 / (1 - 1/n))

the latter holding when both samples are of size n.
Then

    t = (xbar - ybar - d) / [sqrt(ssd) * (1/n) * sqrt(1 / (1 - 1/n))]

and the 95% confidence interval (5% significance level) for the difference of population means is [u, v], where the bounds are respectively

    u = (xbar - ybar) + t(.025, n + n - 2) * (1/n) * sqrt(ssd) * sqrt(1 / (1 - 1/n))
    v = (xbar - ybar) + t(.975, n + n - 2) * (1/n) * sqrt(ssd) * sqrt(1 / (1 - 1/n))

The last factor, sqrt(1 / (1 - 1/n)), converges quickly to 1 as n grows, and the Student critical values converge to z(.025) = -1.96 and z(.975) = 1.96.
__________
Results (routine <POOL>)
TABLE: Fractions of samples whose C.I. contains muX, and whose ssd lies inside its C.I., when Student t and Chi-square critical values (95%) are used as the boundaries (40,000 samples of each size).