The Math Forum

Topic: Two Normal samples difference
Replies: 2   Last Post: Oct 13, 2013 2:13 PM

Luis A. Afonso

Posts: 4,758
From: LIsbon (Portugal)
Registered: 2/16/05
Posted: Oct 12, 2013 12:36 PM

Two Normal samples difference

The theory behind it can be found at (Pooling variance):

Sample comparison

Few people are comfortable concluding that two normal samples originated from populations whose mean values are not significantly different.
Note that this wording is far preferable to "equality of means": exact equality is very rare, and even when it occurs we cannot establish that it really holds.
In fact we are rather more interested in estimating the difference muX - muY. Formally, the Confidence Interval W = [u, v] for the difference of the population means lets us state, with probability 1 - alpha, that:
__ W contains zero ___ we cannot assure that the difference is greater than 0,
__ v < d ___ the population difference does not reach the value d,
__ u > d ___ the difference is at least equal to d.
We are dealing with the difference of two Normal samples with the same variance, X ~ N(a, sigma) of size n and Y ~ N(0, sigma) of size m. The standard textbook formula for the test statistic is
t = (xbar - ybar - d) / [Sp * sqrt(1/n + 1/m)]
Sp = sqrt[((n-1)*sX^2 + (m-1)*sY^2) / (n+m-2)]
The 95% Confidence Interval for d has bounds
u = (xbar - ybar) + t(.025, n + m - 2) * Sp * sqrt(1/n + 1/m)
v = (xbar - ybar) + t(.975, n + m - 2) * Sp * sqrt(1/n + 1/m)
where, writing ssdX = (n-1)*sX^2 and ssdY = (m-1)*sY^2 for the sums of squared deviations,
Sp^2 * (1/n + 1/m) = [(ssdX + ssdY)/(n + m - 2)] * [1/n + 1/m]
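The pooled-variance interval can be sketched in a few lines of Python. This is only an illustration, not the routine used for the results in this post: the function name pooled_ci, the sample data, and the choice to pass the Student critical value as an argument (here about 2.228 for 10 degrees of freedom, read from a standard t table) are all assumptions made for the example.

```python
import math
from statistics import mean, variance


def pooled_ci(x, y, t_crit):
    """Pooled-variance confidence interval [u, v] for muX - muY.

    t_crit is the upper Student quantile t(1 - alpha/2, n + m - 2),
    supplied by the caller (for example from a t table).
    """
    n, m = len(x), len(y)
    # Pooled standard deviation Sp from the two sample variances.
    sp = math.sqrt(((n - 1) * variance(x) + (m - 1) * variance(y)) / (n + m - 2))
    # Standard error of xbar - ybar under the equal-variance assumption.
    se = sp * math.sqrt(1 / n + 1 / m)
    diff = mean(x) - mean(y)
    return diff - t_crit * se, diff + t_crit * se


# Hypothetical data with n = m = 6, hence 10 degrees of freedom.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
u, v = pooled_ci(x, y, t_crit=2.228)
print(f"95% C.I. for muX - muY: [{u:.3f}, {v:.3f}]")
```

For these made-up samples the interval contains zero, so the data do not allow us to assure that the difference of means is greater than 0.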
Now we can separate the total sum of squared deviations,
ssd = ssdX + ssdY, a sample quantity such that ssd/sigma^2 follows a Chi-square distribution with n + m - 2 degrees of freedom, from the constant C(n, m) = (1/n + 1/m)/(n + m - 2).
Without loss of generality let m = k*n, k <= 1, so that Y is the smaller sample.
Let us focus on
f(n; k) = (1/n + 1/(k*n)) / (n + k*n - 2) =
[(1/n) * (1 + 1/k)] / [n * (1 + k - 2/n)]
f(n; k=1) = C(n, n) = (1/n^2) * (1/(1 - 1/n)) when both samples are of size n.
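A quick numerical check of this algebra, as a sketch only: the function names c_const, f_factored and c_equal are hypothetical, chosen for the example. It compares the definition of C(n, m) with the factored form for m = k*n and with the equal-size expression, which simplifies further to 1/(n*(n-1)).

```python
import math


def c_const(n, m):
    """C(n, m) = (1/n + 1/m) / (n + m - 2)."""
    return (1 / n + 1 / m) / (n + m - 2)


def f_factored(n, k):
    """Factored form [(1/n)(1 + 1/k)] / [n (1 + k - 2/n)], with m = k*n."""
    return ((1 / n) * (1 + 1 / k)) / (n * (1 + k - 2 / n))


def c_equal(n):
    """Equal-size case: (1/n^2) * 1/(1 - 1/n)."""
    return (1 / n**2) * (1 / (1 - 1 / n))


for n in (25, 30, 40, 50, 70, 100):
    # Equal sizes: the definition and the closed form must agree.
    assert math.isclose(c_const(n, n), c_equal(n), rel_tol=1e-12)
    # k = 0.5 means m = n/2; only even n keeps m an integer.
    if n % 2 == 0:
        assert math.isclose(c_const(n, n // 2), f_factored(n, 0.5), rel_tol=1e-12)
    print(n, c_equal(n))
```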
In that case
t = (xbar - ybar - d) / [sqrt(ssd) * (1/n) * sqrt(1/(1 - 1/n))]
and the 95% Confidence Interval for the difference of the population means is [u, v], where the bounds are respectively
u = (xbar - ybar) + t(.025, n + n - 2) * (1/n) * sqrt(ssd) * sqrt(1/(1 - 1/n))
v = (xbar - ybar) + t(.975, n + n - 2) * (1/n) * sqrt(ssd) * sqrt(1/(1 - 1/n))
The last factor converges quickly to 1 as n grows, and the Student quantiles converge to z(.025) = -1.96, z(.975) = 1.96.
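To see how fast both quantities approach their limits, here is a small standard-library sketch. The helper names (t_pdf, t_cdf, t_quantile) are hypothetical; the Student quantile is obtained by numerically integrating the density with Simpson's rule and bisecting on the result, which is an assumption of this example and not necessarily how the values in this post were computed. Only upper quantiles (p >= 0.5) are handled; the lower one follows by symmetry.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=400):
    """CDF for x >= 0 via composite Simpson integration from 0 to x."""
    if x == 0:
        return 0.5
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_quantile(p, df):
    """Upper quantile (0.5 <= p < 1) by bisection, assuming it lies in [0, 10]."""
    lo, hi = 0.0, 10.0
    for _ in range(40):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


z = NormalDist().inv_cdf(0.975)
for n in (25, 30, 40, 50, 70, 100):
    factor = math.sqrt(1 / (1 - 1 / n))   # the last factor in u and v
    t_hi = t_quantile(0.975, 2 * n - 2)   # t(.975, n + n - 2)
    print(f"n={n:3d}  factor={factor:.4f}  t(.975)={t_hi:.4f}  z(.975)={z:.4f}")
```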

Results (routine <POOL>)

TABLE: Fractions of samples for which muX and ssd fall inside their C.I. when Student t and Chi-square critical values (95%) are used as the boundaries (40,000 samples for each size).

n______25______30______40______50______70______100
Chi2___0.9495__0.9518__0.9520__0.9497__0.9488__0.9498

As expected, the fractions are very close to the nominal 0.95 value.
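A coverage experiment of this kind can be sketched as follows. This is not the <POOL> routine, whose code is not shown here; it is a minimal Monte Carlo check of the Student interval for the difference of means with equal sample sizes. The function name coverage, the values a = 1 and sigma = 1, the seed, and the reduced 4,000 replications are assumptions for the example. The size n = 31 is chosen only because it gives 60 degrees of freedom, where standard t tables list t(.975, 60) as about 2.000.

```python
import math
import random


def coverage(n, t_crit, a=1.0, sigma=1.0, reps=4000, seed=12345):
    """Fraction of simulated sample pairs whose C.I. contains the true difference a."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x = [rng.gauss(a, sigma) for _ in range(n)]    # X ~ N(a, sigma)
        y = [rng.gauss(0.0, sigma) for _ in range(n)]  # Y ~ N(0, sigma)
        xbar = sum(x) / n
        ybar = sum(y) / n
        # Total sum of squared deviations, ssd = ssdX + ssdY.
        ssd = sum((xi - xbar) ** 2 for xi in x) + sum((yi - ybar) ** 2 for yi in y)
        # Half-width: t * (1/n) * sqrt(ssd) * sqrt(1/(1 - 1/n)).
        half = t_crit * math.sqrt(ssd) / n * math.sqrt(1 / (1 - 1 / n))
        diff = xbar - ybar
        if diff - half <= a <= diff + half:
            hits += 1
    return hits / reps


print(coverage(n=31, t_crit=2.000))
```

With only a few thousand replications the estimated fraction fluctuates by several thousandths around 0.95; more replications narrow that spread.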

Luis A. Afonso

© The Math Forum at NCTM 1994-2017. All Rights Reserved.