Luis A. Afonso
Posts: 4,725
From: Lisbon (Portugal)
Registered: 2/16/05


Two Normal samples difference
Posted: Oct 12, 2013 12:36 PM


The underlying theory (pooled variance) can be found at: http://www.stat.yale.edu/Courses/199798/101/meancomp.htm
Sample comparison
Few people are content merely to find out that two Normal samples originated from Populations whose mean values are not significantly different. Note that this designation is far preferable to "equal means", because exact equality, if it happens at all, is very rare, and even when it occurs we are unable to state that it really exists. In fact we are more interested in estimating the difference d = muX - muY.

Formally, through the Confidence Interval W = [u, v] for the difference of Population means we can find, with probability 1 - alpha:
__ W contains zero ___ we cannot assure that the difference is > 0,
__ v < d ___ the Population difference does not reach d,
__ u > d ___ the difference is at least equal to d.

We are dealing with the difference of two Normal samples with the same variance:
X ~ N(a, sigma), size n
Y ~ N(0, sigma), size m

The standard textbook formula for the test is
t = (xbar - ybar - d) / [Sp * sqrt(1/n + 1/m)]
Sp = sqrt[((n-1)*sX^2 + (m-1)*sY^2) / (n + m - 2)]

The 95% Confidence Interval for d is
u = (xbar - ybar) + t(.025, n + m - 2) * Sp * sqrt(1/n + 1/m)
v = (xbar - ybar) + t(.975, n + m - 2) * Sp * sqrt(1/n + 1/m)
where, with ssdX = (n-1)*sX^2 and ssdY = (m-1)*sY^2,
Sp^2 * (1/n + 1/m) = [(ssdX + ssdY) / (n + m - 2)] * [1/n + 1/m]

Now we can separate the total sum of squared differences, ssd = ssdX + ssdY, a sample variable such that ssd/sigma^2 follows a Chi-square with n + m - 2 degrees of freedom, from the constant
C(n, m) = (1/n + 1/m) / (n + m - 2)

Without loss of generality let m = k*n, k <= 1, so that Y is the smaller sample. Let us focus on
f(n; k) = (1/n + 1/(k*n)) / (n + k*n - 2) = [(1/n) * (1 + 1/k)] / [n * (1 + k - 2/n)]
f(n; k=1) = C(n, n) = (1/n^2) * (1/(1 - 1/n))
the latter holding when both samples are of size n. Then
t = (xbar - ybar - d) / [sqrt(ssd) * (1/n) * sqrt(1/(1 - 1/n))]
and the Confidence Interval, 5% significance, for the difference of Population means is [u, v], where the bounds are respectively
u = (xbar - ybar) + t(.025, n + n - 2) * (1/n) * sqrt(ssd) * sqrt(1/(1 - 1/n))
v = (xbar - ybar) + t(.975, n + n - 2) * (1/n) * sqrt(ssd) * sqrt(1/(1 - 1/n))
The latter factor converges quickly to 1 as n grows, and the Student critical values converge to z(.025) = -1.96, z(.975) = 1.96.
__________
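To make the formulas above concrete, here is a minimal Python sketch of the pooled-variance interval written in terms of ssd and the constant C(n, m). It is only an illustration, not the routine used for the results below. The names c_const, pooled_interval and t_crit are hypothetical, and the Student critical value t(.975, n + m - 2) is assumed to be read from a table and passed in by the caller.

```python
import math

def c_const(n, m):
    """Constant C(n, m) = (1/n + 1/m) / (n + m - 2)."""
    return (1 / n + 1 / m) / (n + m - 2)

def pooled_interval(x, y, t_crit):
    """Return (u, v), the pooled-variance C.I. for muX - muY.

    t_crit is the upper Student critical value t(.975, n + m - 2) for a
    95% interval; it is supplied by the caller (assumption of this sketch).
    """
    n, m = len(x), len(y)
    xbar = sum(x) / n
    ybar = sum(y) / m
    # Sums of squared differences about each sample mean.
    ssd_x = sum((xi - xbar) ** 2 for xi in x)
    ssd_y = sum((yi - ybar) ** 2 for yi in y)
    ssd = ssd_x + ssd_y
    # Sp^2 * (1/n + 1/m) equals ssd * C(n, m).
    half_width = t_crit * math.sqrt(ssd * c_const(n, m))
    diff = xbar - ybar
    return diff - half_width, diff + half_width
```

For example, pooled_interval([1, 2, 3, 4, 5], [2, 3, 4, 5, 6], 2.306) uses t(.975, 8) = 2.306; here ssd = 20 and C(5, 5) = 0.05, so the half-width is 2.306 and the interval is about [-3.306, 1.306].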
Results (routine <POOL>)
TABLE: Fractions of samples in which muX, and ssd, fall inside the C.I.'s whose boundaries are given by the t (Student) and Chi-square 95% critical values, respectively (40,000 samples for each size).
n_______25______30______40______50______70______100
t_______0.9516__0.9502__0.9500__0.9515__0.9505__0.9497
Chi2____0.9495__0.9518__0.9520__0.9497__0.9488__0.9498
As expected, the fractions are very close to the nominal value 0.95.
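A check of this kind can be sketched in a few lines of Python. The following is not the <POOL> routine, which is not shown here; it is a hypothetical simulation, assuming two equal-sized samples X ~ N(a, sigma) and Y ~ N(0, sigma), that counts how often the t-based interval contains the true difference a. The function name coverage, its parameters and the seed are assumptions of this sketch, and the critical value must again be supplied from a Student table.

```python
import math
import random

def coverage(n, t_crit, a=0.0, sigma=1.0, reps=40_000, seed=1):
    """Fraction of simulated sample pairs whose 95% C.I. contains a.

    n      : common size of both samples (case k = 1)
    t_crit : t(.975, 2n - 2), supplied by the caller (assumption)
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x = [rng.gauss(a, sigma) for _ in range(n)]
        y = [rng.gauss(0.0, sigma) for _ in range(n)]
        xbar = sum(x) / n
        ybar = sum(y) / n
        ssd = (sum((xi - xbar) ** 2 for xi in x)
               + sum((yi - ybar) ** 2 for yi in y))
        # Half-width: t * (1/n) * sqrt(ssd) * sqrt(1 / (1 - 1/n)).
        half = t_crit * math.sqrt(ssd) / n * math.sqrt(1 / (1 - 1 / n))
        # The true difference muX - muY is a, since muY = 0.
        if abs((xbar - ybar) - a) <= half:
            hits += 1
    return hits / reps
```

Calling coverage(25, 2.0106), with t(.975, 48) = 2.0106, should return a fraction near 0.95; the exact digits depend on the random generator and seed, so they will not match the table entry for entry.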
Luis A. Afonso



