Reef Fish wrote:
> firstname.lastname@example.org wrote:
>> Hello all,
>>
>> I have two simple basic questions:
>>
>> 1- How do I determine if a coin is fair or not. Assuming for example, I
>> get 5500 heads out of 10000 coin tosses.
>
> You can never be sure from such an experiment whether the coin is
> fair or not.
>
> What the statistician does is to ask the question that IF the coin is
> fair (so that the probability of each toss is .5 for heads) what's the
> probability of observing something AT LEAST AS EXTREME as what
> was actually observed in the sample. This is called the p-value
> of the test, and if the p-value is sufficiently small (to be determined
> by the statistician as to how small is sufficiently small) then the
> statistician would reject the hypothesis that the coin is fair, and
> conclude (statistically) that the coin is NOT a fair one.
>
>> 2- How do I determine the probability distribution of given data
>> points?
>
> It's actually the probability distribution of the "test statistic",
> which is used to determine whether the coin is fair or not.
>
> The test statistic could be "n" heads (out of N trials). That statistic
> would follow a binomial distribution (or a Bernoulli Process with
> p = 1/2) under the hypothesis of a fair coin.
>
> For samples as large as your 10,000, the test statistic n/N or the
> sample proportion can be well approximated by a normal distribution,
> so that you can use the standard normal deviate Z
>
> Z = (n/N - .5)/sqrt(.5*.5/10000)
>
> for the test. When n/N = .55, Z = .05(100/.5) = 10 which is
> "astronomically large" for a Z. Hence, you can conclude with
> virtual certainty that the coin is NOT a fair coin.
>
> -- Bob.

I'll just add that, in my experience, students sometimes find the chi-square goodness-of-fit test easier to understand for a problem like this. It is equivalent to the z-test Bob showed: the chi-square test statistic equals z squared.
The chi-square test is based on the discrepancies between observed and expected counts, where the expected counts come from some hypothesis such as "H0: The coin is fair". Under that (null) hypothesis, the expected counts of Heads and Tails are both 5000 (10,000 tosses x 0.5). The observed counts are 5500 and 4500.
To get the test statistic, compute the Observed minus Expected difference for each cell, square it, and then divide by Expected. Sum those values across all cells.
X^2 = Sum[(O-E)^2/E] = (5500-5000)^2/5000 + (4500-5000)^2/5000 = 50 + 50 = 100 = z^2
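
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the computation above, using only the counts from this thread. The variable names are my own and purely illustrative; the z formula is the one Bob gave.

```python
import math

# Observed counts from the original question: 5500 heads in 10,000 tosses.
observed = {"heads": 5500, "tails": 4500}
n_tosses = sum(observed.values())

# Under H0 (fair coin), each face is expected in half of the tosses.
p_null = 0.5
expected = {face: n_tosses * p_null for face in observed}

# Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E over the cells.
chi_square = sum(
    (observed[face] - expected[face]) ** 2 / expected[face]
    for face in observed
)

# z statistic for the sample proportion of heads, as in the quoted post.
p_hat = observed["heads"] / n_tosses
z = (p_hat - p_null) / math.sqrt(p_null * (1 - p_null) / n_tosses)

print("chi-square:", chi_square)
print("z:", z, "z squared:", z ** 2)
```

Up to floating-point rounding, z squared should match the chi-square statistic.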
Because there are 2 categories, the p-value is obtained from a chi-square distribution with df = 2 - 1 = 1. Note that if you square a standard normal variable, the result follows the chi-square distribution with df=1. So the p-value from the chi-square goodness-of-fit test will be identical to the two-sided p-value from the z-test.
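
A rough sketch of that equivalence in Python, using only the standard library. The helper names are hypothetical, and the chi-square tail formula used here is a closed form that holds only for df=1 (it follows directly from the squared-normal relationship); for other df you would need a statistics package.

```python
import math

def two_sided_z_p_value(z):
    """Two-sided p-value for a standard normal deviate z."""
    return math.erfc(abs(z) / math.sqrt(2))

def chi_square_df1_p_value(x):
    """Upper-tail p-value of a chi-square variable with df=1 only."""
    return math.erfc(math.sqrt(x / 2))

# Test statistics from the coin example in this thread.
z = 10.0
chi_square = 100.0

p_z = two_sided_z_p_value(z)
p_chi = chi_square_df1_p_value(chi_square)

print("two-sided z-test p-value:", p_z)
print("chi-square (df=1) p-value:", p_chi)
```

Both p-values come out on the order of 1e-23, which is what Bob meant by "astronomically large" for a Z of 10.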
-- Bruce Weaver email@example.com www.angelfire.com/wv/bwhomedir