The central limit theorem says that the suitably normalized sum of i.i.d. random variables with finite variance approaches a normal distribution as the number of terms grows.
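If it helps to see this numerically, here is a minimal sketch using only the Python standard library. The choice of Uniform(0,1) summands and the helper names (standardized_sum, fraction_within) are my own assumptions for illustration, not anything from the textbook problem. It standardizes a sum of n draws and counts how often it lands inside the two-sided 90% band of a standard normal, whose exact cutoff is about 1.645.

```python
import random
import statistics

def standardized_sum(n, rng):
    """Sum n Uniform(0,1) draws, then centre and scale to mean 0, sd 1."""
    total = sum(rng.random() for _ in range(n))
    mean = n * 0.5                # each draw has mean 1/2
    sd = (n / 12.0) ** 0.5        # each draw has variance 1/12
    return (total - mean) / sd

def fraction_within(n, cutoff, trials=20000, seed=0):
    """Share of standardized sums with absolute value at most `cutoff`."""
    rng = random.Random(seed)
    hits = sum(abs(standardized_sum(n, rng)) <= cutoff for _ in range(trials))
    return hits / trials

# Two-sided 90% cutoff of the standard normal (about 1.645).
cutoff = statistics.NormalDist().inv_cdf(0.95)

for n in (1, 2, 30):
    print(n, round(fraction_within(n, cutoff), 3))
```

For a single uniform draw the fraction sits noticeably above 0.90; as n grows it should settle near 0.90, which is what the normal approximation predicts.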
"Joel Daniels" <email@example.com> wrote in message news:firstname.lastname@example.org... > Greetings All, > > I am taking an introductory statistics course and I am struggling with > the concept of hypothesis testing. I understand all the steps and can > blindly perform the task, but I do not understand what exactly the > result means and why anybody would care to find it. > > I have a myriad of questions, but I think I found one concrete way to > address my confusion: > > I'll start with a textbook-style problem: > > A 1-value sample "x" is taken a random variable "z" that follows a > normal distribution with standard deviation 1, but unknown expectation > value (mean). > > Test (with a 90% confidence level) the null hypothesis > H_0: mean of z = 0 > against the alternative hypothesis that > H_A: mean of z <> 0 > > To solve this I need to define a "critical zone," such that the > probability that a random sample of z falls in this critical zone is > 10%. For some reason that I do not fully understand, I can further > stipulate that the "appropriate" critical zone is divided equally > between the two extremes. > So the critical zone is roughly > > (-infinity,-2) U (2, infinity) > > Next, I check to see if my random sample x falls within the critical > zone. If it does then I reject the null hypothesis, otherwise I fail > to reject the null hypothesis. > > Great, but why this choice of critical zone? I understand why 10% of > the area under the p.d.f of "z" must lie in the critical zone, but I do > not understand why this zone should be equally distributed between both > extremes. I know that this "rule" depends on the distribution. For > example imagine that I repeated the problem above, but with a p.d.f for > z that looks like a capital "M" with each vertical bar of the "M" 1 > unit away from the mean. Formally this distribution is: > > For z < mean - 1 ....... p.d.f(z)=0 > For mean -1 <= z <= mean + 1 ....... p.d.f(z)=abs(z-mean) > For mean + 1 < z ..... 
p.d.f(z)=0 > > In this case, I know that my sample value "x" CANNOT be 0, if the mean > of z is 0. In other words, if I see a sample value of exactly 0, then > I know 100% for sure that my null hypothesis is false. Logically, this > implies that I should be suspicious if I see a value of 0.00012698... > Yet if my critical zone is the area near the extremes of my function, > then I cannot reject my null hypothesis on a sample value of > 0.00012698, even at the 10% level! > > I _think_ that the truth is that you want the critical zone to be some > zone such that for all "a" within the zone, and for all "b" outside if > it this inequality holds true: > > p.d.f (a) <= p.d.f(b) > > This leaves me stuck if I have a uniform distribution... but I am OK > with that, because I can't envision how a standard hypothesis on a > uniform distribution could tell you anything interesting anyway. > > I came up with this "rule" for the critical zone when the alternative > hypothesis is > > H_A: mean of z <> 0 > > based on intuition. Is it right? If so, can anyone help me prove it, > or even write a simulation that shows its validity? Every time I try, > I get stuck because I don't have any idea what a hypothesis test really > says in the first place, so I can't prove that methodology X gives my > value Y, when I don't know what the definition of Y is. > > Any help would be greatly appreciated. > > Thanks, > Joel Daniels > >
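On the request for a simulation: below is a rough sketch for the "M"-shaped density from the question. It is not a proof of the proposed rule. All function names are hypothetical, and the alternative mean of 0.5 is an arbitrary value I picked for illustration. Both candidate critical zones are built to have probability 0.10 when the mean really is 0: under H_0, P(|x| <= c) = c^2, so the tails zone is |x| > sqrt(0.9) and the low-density zone is |x| < sqrt(0.1) (plus |x| > 1, which cannot occur under H_0). The simulation then counts how often each zone rejects, first when H_0 is true and then when the mean is 0.5.

```python
import random

def draw_m(mean, rng):
    """One draw from the 'M' density abs(z - mean) on [mean - 1, mean + 1]."""
    magnitude = rng.random() ** 0.5          # inverse CDF: P(|W| <= w) = w**2
    sign = 1 if rng.random() < 0.5 else -1   # left or right half, equally likely
    return mean + sign * magnitude

def reject_tails(x):
    """Critical zone at the extremes: P(|x| > sqrt(0.9)) = 0.10 under H_0."""
    return abs(x) > 0.9 ** 0.5

def reject_low_density(x):
    """Critical zone where the H_0 density is lowest:
    P(|x| < sqrt(0.1)) = 0.10 under H_0, plus |x| > 1 (probability 0 under H_0)."""
    return abs(x) < 0.1 ** 0.5 or abs(x) > 1.0

def rejection_rate(rule, true_mean, trials=100000, seed=1):
    """Fraction of single-sample tests that reject H_0: mean = 0."""
    rng = random.Random(seed)
    return sum(rule(draw_m(true_mean, rng)) for _ in range(trials)) / trials

# true_mean = 0.0 is H_0 itself; 0.5 is an assumed alternative for illustration.
for true_mean in (0.0, 0.5):
    print(true_mean,
          round(rejection_rate(reject_tails, true_mean), 3),
          round(rejection_rate(reject_low_density, true_mean), 3))
```

When the true mean is 0, both rules should reject about 10% of the time; that false-rejection rate is all the "10% level" promises. When the true mean is 0.5, the two rules reject at different rates, and for this particular density and alternative the low-density zone rejects more often. That difference in rejection rate under the alternative is the criterion by which one critical zone is preferred over another.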