Topic: Two Normal samples difference
Replies: 2   Last Post: Oct 13, 2013 2:13 PM

 Luis A. Afonso Posts: 4,758 From: Lisbon (Portugal) Registered: 2/16/05
Two Normal samples difference
Posted: Oct 12, 2013 12:36 PM

The underlying theory (pooled variance) can be found at:
http://www.stat.yale.edu/Courses/1997-98/101/meancomp.htm

Sample comparison

Few people are comfortable concluding that two Normal samples originated from Populations whose mean values are not significantly different.
Note that this wording is far preferable to "equal means": exact equality, if it occurs at all, is very rare, and even when it does occur we are unable to state that it really exists.
In fact we are more interested in estimating the difference d = muX - muY. Formally, from the Confidence Interval W = [u, v] for the difference of the Population means we can state, for a reference value d0:
__ W contains zero ___ we cannot assure d > 0,
__ v < d0 ___ the Population difference does not reach d0,
__ u > d0 ___ the difference is at least equal to d0,
__ with probability 1 - alpha.
We are dealing with the difference of two Normal samples with the same variance, X = N(a, sigma) of size n and Y = N(0, sigma) of size m. The standard textbook formula for the test is
t = (xbar - ybar - d) / [Sp * sqrt(1/n + 1/m)]
Sp = sqrt[((n-1)*sX^2 + (m-1)*sY^2) / (n+m-2)]
The 95% Confidence Interval for d is
u = (xbar - ybar) + t(.025, n + m - 2) * Sp * sqrt(1/n + 1/m)
v = (xbar - ybar) + t(.975, n + m - 2) * Sp * sqrt(1/n + 1/m)
where
Sp^2 * (1/n + 1/m) = [(ssdX + ssdY)/(n + m - 2)] * [1/n + 1/m]
with ssdX = (n-1)*sX^2 and ssdY = (m-1)*sY^2 the sums of squared differences from each sample mean.
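As an illustration only, the interval can be sketched in Python with the standard library. The function name pooled_ci and the argument t_crit are hypothetical; the caller is assumed to supply the upper Student critical value t(.975, n + m - 2) from a table, since the standard library has no Student quantile function.

```python
import math
import statistics


def pooled_ci(x, y, t_crit):
    """Two-sided confidence interval for muX - muY under a common variance.

    t_crit is the upper Student critical value, e.g. t(.975, n + m - 2),
    supplied by the caller (an assumption of this sketch).
    """
    n, m = len(x), len(y)
    diff = statistics.mean(x) - statistics.mean(y)
    # Sums of squared differences about each sample mean.
    ssd_x = (n - 1) * statistics.variance(x)
    ssd_y = (m - 1) * statistics.variance(y)
    # Squared standard error: ssd * (1/n + 1/m) / (n + m - 2).
    se = math.sqrt((ssd_x + ssd_y) * (1 / n + 1 / m) / (n + m - 2))
    return diff - t_crit * se, diff + t_crit * se
```

For two samples of size 5, for instance, one would call pooled_ci(x, y, 2.306), where 2.306 is the tabulated t(.975, 8).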
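Now we can separate the total sum of squared differences,
ssd = ssdX + ssdY, a sample variable such that ssd/sigma^2 follows a Chi-square distribution with n + m - 2 degrees of freedom, from the constant C(n, m) = (1/n + 1/m)/(n + m - 2): the squared standard error of xbar - ybar is ssd * C(n, m).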
Without loss of generality let m = k*n, k <= 1, so that Y is the smaller sample.
Let us focus on
f(n; k) = (1/n + 1/(k*n)) /( n + k*n - 2) =
[(1/n) * (1 + 1/k)] / [n* (1 + k - 2/n)]
f(n; k=1) = C(n, n) = (1/n^2) * (1/(1 - 1/n)) when both samples are of size n.
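A quick numerical check of the algebra above can be sketched as follows. The function names f, f_factored and c_equal are hypothetical, and the (n, k) pairs are arbitrary illustrative values.

```python
import math


def f(n, k):
    """C(n, m) with m = k*n, written directly from its definition."""
    return (1 / n + 1 / (k * n)) / (n + k * n - 2)


def f_factored(n, k):
    """The same quantity in the factored form."""
    return ((1 / n) * (1 + 1 / k)) / (n * (1 + k - 2 / n))


def c_equal(n):
    """Equal sample sizes (k = 1): (1/n^2) * 1/(1 - 1/n)."""
    return (1 / n**2) * (1 / (1 - 1 / n))


# The two forms of f agree for several (n, k) pairs.
for n, k in [(25, 1.0), (40, 0.5), (100, 0.2)]:
    assert math.isclose(f(n, k), f_factored(n, k))
# With k = 1 both reduce to C(n, n).
for n in (25, 30, 40, 50, 70, 100):
    assert math.isclose(f(n, 1), c_equal(n))
```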
Then the interval is
(xbar - ybar) +/- t(.975, n + n - 2) * sqrt(ssd) * (1/n) * sqrt(1/(1 - 1/n))
and the 95% Confidence Interval (5% significance level) for the difference of the Population means is [u, v], where the bounds are respectively
u = (xbar - ybar) + t(.025, n + n - 2) * (1/n) * sqrt(ssd) * sqrt(1/(1 - 1/n))
v = (xbar - ybar) + t(.975, n + n - 2) * (1/n) * sqrt(ssd) * sqrt(1/(1 - 1/n)).
The last factor, sqrt(1/(1 - 1/n)), converges quickly to 1 as n grows, and the Student critical values converge to z(.025) = -1.96 and z(.975) = 1.96.
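How fast the factor approaches 1 can be tabulated with a short sketch. The helper name last_factor is hypothetical, and the sample sizes are simply those used in the results further on; the Normal quantile comes from the standard library's NormalDist.

```python
import math
from statistics import NormalDist


def last_factor(n):
    """sqrt(1 / (1 - 1/n)), the factor that tends to 1 as n grows."""
    return math.sqrt(1 / (1 - 1 / n))


z_hi = NormalDist().inv_cdf(0.975)  # limiting value of t(.975, 2n - 2)

for n in (25, 30, 40, 50, 70, 100):
    print(n, round(last_factor(n), 4))
print("z(.975) =", round(z_hi, 2))
```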
__________

Results (routine <POOL>)

TABLE: Fractions of samples whose 95% intervals, with Student t and Chi-square critical values as boundaries, contain muX and ssd respectively (40,000 samples for each size).

        n=25     30       40       50       70       100
t       0.9516   0.9502   0.9500   0.9515   0.9505   0.9497
Chi2    0.9495   0.9518   0.9520   0.9497   0.9488   0.9498

As expected, the fractions are very close to the nominal 0.95 value.
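A simulation of this kind can be sketched as below for the t row only. This is not the <POOL> routine itself, whose code is not shown here: the function name coverage, the values a = 1, sigma = 1, the seed, and the reduced number of replications (4000 rather than 40,000) are all assumptions made for illustration, and the critical value t(.975, 48), about 2.0106, is taken from a table.

```python
import math
import random
import statistics


def coverage(n, t_crit, a=1.0, sigma=1.0, reps=4000, seed=0):
    """Fraction of simulated t intervals that contain muX - muY = a.

    X ~ N(a, sigma) and Y ~ N(0, sigma), both of size n; t_crit is
    t(.975, 2n - 2) taken from a table (an assumption of this sketch).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x = [rng.gauss(a, sigma) for _ in range(n)]
        y = [rng.gauss(0.0, sigma) for _ in range(n)]
        diff = statistics.mean(x) - statistics.mean(y)
        ssd = (n - 1) * (statistics.variance(x) + statistics.variance(y))
        # Equal sizes: half-width = t * sqrt(ssd) * (1/n) * sqrt(1/(1 - 1/n)).
        half = t_crit * math.sqrt(ssd) / n * math.sqrt(1 / (1 - 1 / n))
        if diff - half <= a <= diff + half:
            hits += 1
    return hits / reps


print(coverage(25, t_crit=2.0106))  # t(.975, 48) is about 2.0106
```

The printed fraction should fall near 0.95, with more sampling noise than in the table because fewer replications are used.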

Luis A. Afonso
