>On Apr 22, 2:42 am, Lynne Vickson <clvick...@gmail.com> wrote:
>> On Apr 19, 4:36 pm, "analys...@hotmail.com" <analys...@hotmail.com>
>> wrote:
>>
>> > Although it seems elementary, I am not aware that standard textbooks
>> > treat this problem.
>>
>> > There is a universal set U of N distinct objects. A fixed subset S of
>> > n distinct objects is chosen from it (0 < n < N).
>>
>> > Another subset T of m (0 < m < N) distinct objects is then chosen from
>> > U. The question is what is the probability distribution of the
>> > cardinality of S intersection T. N may be considered to be infinity,
>> > although m/N and n/N are not vanishingly small.
>>
>> If N is finite and the choice of the m objects comprising T is
>> "random", the cardinality of the intersection has a hypergeometric
>> distribution. (The hypergeometric distribution gives the probability
>> of k type 1 objects when m objects are chosen without replacement
>> from a population of N1 type 1 and N2 type 2 objects; in your
>> problem, N1 = n, N2 = N-n and you are asking how many objects in the
>> random set T are type 1.) If N is "infinite" but n/N is nonzero, and
>> if you pick a FINITE number m of objects, you now have the binomial
>> limit of the hypergeometric, so the cardinality of T intersect S has
>> the binomial distribution with parameters m and p = n/N.
>>
>> RGV
>
>Thanks. I am looking at a contingency tables problem. Let x(i,j) =
>observed count in row i and column j. r(i) = row sum of row i and
>c(j) = column sum of column j and G = grand total count. Typically
>r(i).c(j)/G is compared to x(i,j) to test for interaction between rows
>and columns. There seems to be an implied "binomial approximation"
>here and now it's clear exactly what's going on.
>
>I have another question: Any particular cell may over- or under-
>perform with respect to the expected value under the null hypothesis
>of no interaction. Are there one-sided tests for particular cells and
>groups of cells to test for over/under performance?
As I read it, your first question is whether there is a test for the single cells of a contingency table.
A chi-squared variate with k d.f. is the sum of k independent 1 d.f. chi-squared variates. For a contingency table, the cells are not independent, but they are chi-squared-like. Under the Poisson derivation of a contingency table test (that's just one way to derive it), the variance of each cell is equal to its Expected Value.
The usual test is the sum of chi-squared-like contributions from the individual cells, using Expected and Observed: X^2 = sum [ (O-E)^2/E ].
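To make the formula concrete, here is a minimal Python sketch that computes the expected counts r(i).c(j)/G and the Pearson X^2 from them. The function names and the counts in the table are made up for illustration; nothing here comes from a particular library.

```python
def expected_counts(table):
    """Expected counts r(i)*c(j)/G under the null of no interaction."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    grand = sum(row_sums)
    return [[r * c / grand for c in col_sums] for r in row_sums]


def pearson_x2(table):
    """Sum of (O-E)^2/E over every cell of the table."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Made-up counts, for illustration only.
table = [[10, 20], [30, 40]]
expected = expected_counts(table)
x2 = pearson_x2(table)
```

For an R x C table, X^2 would be referred to a chi-squared distribution with (R-1)(C-1) d.f.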
Especially for a table that is large, both across and down, where the d.f. approaches the number of cells, each cell can be regarded as a 1 d.f. chi-squared. But a 1 d.f. chi-squared is simply a standard normal variate, z, squared. So you can take the individual cell contribution as, approximately, z = (O-E)/sqrt(E), and use the sign of z for a one-tailed test: positive for over-performance, negative for under-performance.
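A rough sketch of that cell-wise z and its one-tailed p-values, using only the Python standard library. The function names and the example numbers are hypothetical, and the normal approximation is only as good as the large-table argument above.

```python
import math


def cell_z(observed, expected):
    """Approximate z for one cell: (O-E)/sqrt(E)."""
    return (observed - expected) / math.sqrt(expected)


def upper_tail_p(z):
    """P(Z >= z) for a standard normal Z: test for over-performance."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def lower_tail_p(z):
    """P(Z <= z) for a standard normal Z: test for under-performance."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


# Made-up cell: observed 30 where 20 were expected.
z = cell_z(30, 20)
p_over = upper_tail_p(z)
p_under = lower_tail_p(z)
```

If many cells are screened this way, some allowance for multiple testing is presumably needed, since the cell z's are not independent.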
I have only ever used that in a very casual way. I think that there are slightly different versions available.
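One such version that I can name is the adjusted (standardized) residual, which divides O-E by sqrt(E(1 - r/G)(1 - c/G)) instead of sqrt(E), to allow for the margins being fixed. Whether that is the variant meant above is my assumption. A sketch, with a hypothetical function name and made-up counts:

```python
import math


def adjusted_residuals(table):
    """(O-E)/sqrt(E*(1-r/G)*(1-c/G)) for every cell of the table."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    grand = sum(row_sums)
    result = []
    for i, row in enumerate(table):
        out_row = []
        for j, o in enumerate(row):
            e = row_sums[i] * col_sums[j] / grand
            var = e * (1 - row_sums[i] / grand) * (1 - col_sums[j] / grand)
            out_row.append((o - e) / math.sqrt(var))
        result.append(out_row)
    return result


# Made-up counts, for illustration only.
table = [[10, 20], [30, 40]]
adj = adjusted_residuals(table)
```

In a 2x2 table every adjusted residual squared equals the overall Pearson X^2, which is a handy check on the arithmetic.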
For "several cells" -- I wonder what you are going after. It is possible to "partition" the contingency table into several separate tests. When the tests are construed as the likelihood-ratio test, rather than the Pearson test, tests can be devised that are independent and additive. This can be useful for testing "linear trend" and so on.
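As an illustration of the additivity, here is a sketch that splits a 2x3 table into two 1 d.f. pieces (column 1 vs. column 2, then columns 1+2 pooled vs. column 3) and compares the likelihood-ratio statistic G^2 = 2 sum [ O ln(O/E) ] of the pieces with that of the whole table. The particular split, the function name, and the counts are my own choices for the example.

```python
import math


def g2(table):
    """Likelihood-ratio statistic 2*sum(O*ln(O/E)) for a two-way table."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    grand = sum(row_sums)
    total = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            if o > 0:  # an empty cell contributes nothing
                e = row_sums[i] * col_sums[j] / grand
                total += o * math.log(o / e)
    return 2.0 * total


# Made-up 2x3 table, for illustration only.
table = [[10, 20, 30], [20, 20, 20]]

# Piece 1: column 1 vs. column 2 (1 d.f.).
first_two = [row[:2] for row in table]
# Piece 2: columns 1+2 pooled vs. column 3 (1 d.f.).
pooled_vs_third = [[row[0] + row[1], row[2]] for row in table]

g2_total = g2(table)  # 2 d.f.
g2_parts = g2(first_two) + g2(pooled_vs_third)
```

The two pieces add up to the 2 d.f. total exactly for G^2; the corresponding Pearson pieces generally add up only approximately.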