

Paul
Posts:
162
Registered:
12/7/09


Re: The probability of getting zero counts
Posted:
Oct 16, 2010 11:06 AM


On Oct 15, 1:04 am, Richard Wright <richwrigREM...@tig.com.au> wrote:
> Consider a series of squares randomly placed over a research area.
>
> The number of stones within each square is counted.
>
> Ten out of 12 squares contain stones, in varying numbers.
>
> Two of the 12 squares contain zero stones.
>
> Person A argues that the fact there are two squares with zero stones
> shows that the distribution is especially patchy in those spots.
> Person B, by contrast, argues that one must expect to get some zero
> counts, given the variability in numbers seen in the squares that have
> stones.
>
> How might one choose between these two claims? It seems to me that we
> must ask the following question.
>
> Given the variability of numbers in the 10 squares that have stones,
> what is the probability of obtaining two squares with zero stones if
> we assume the variability is uniform over the area sampled?
>
> To my mind this question relates to confidence intervals, but I can't
> see how to harness them.
>
> I would welcome any suggestions about how one might approach this
> question of statistical inference.

The problem with Person B's argument is that the number of stones in a "typical" square might have a distribution whose support excludes zero (for instance, U{1,...,N}, where N is the largest number of stones observed in any square). It will always be possible to conjure up a distribution with strictly positive support that reasonably accounts for the variability in the nonzero observations.
If I were B, I might try the following. First, fit a discrete distribution to the nonzero portion of the sample. If the stones are not too thick upon the ground, I'd try a Poisson distribution first, recognizing that if the support of the true distribution contains zero then I'm biasing the parameter(s) of my fitted distribution by excluding the zero observations. (Dropping the zeros inflates the fitted mean and so understates the chance of an empty square, which makes the test more conservative from B's point of view.) Now compute the probability of two or more empty squares out of 12 using the fitted distribution. If that probability is above my threshold for Type I error, I do not reject the hypothesis that the two empty squares share the same distribution as the 10 nonempty squares.
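For anyone who wants to try the procedure above, here is a rough sketch in Python, not a definitive implementation. The original question gives no actual counts, so the `counts` list is purely hypothetical, and the function name `prob_at_least_k_empty` is made up for illustration. It also assumes the squares are independent and that the Poisson mean is estimated naively as the average of the nonzero counts.

```python
import math

def prob_at_least_k_empty(nonzero_counts, n_squares, k=2):
    """Fit a Poisson to the nonzero counts, then return
    (fitted mean, P(one square is empty), P(at least k of n_squares are empty)).
    """
    # Naive fit: the Poisson mean is the average of the nonzero counts.
    # Excluding the zeros biases this estimate upward.
    lam = sum(nonzero_counts) / len(nonzero_counts)

    # Probability that a single square is empty under the fitted Poisson.
    p0 = math.exp(-lam)

    # The number of empty squares is Binomial(n_squares, p0) if squares are
    # independent; sum the probabilities of fewer than k empties.
    p_fewer = sum(
        math.comb(n_squares, j) * p0**j * (1 - p0) ** (n_squares - j)
        for j in range(k)
    )
    return lam, p0, 1.0 - p_fewer

# HYPOTHETICAL data: stone counts in the 10 nonempty squares.
counts = [1, 3, 2, 4, 1, 2, 5, 3, 2, 2]

lam, p0, p_value = prob_at_least_k_empty(counts, n_squares=12, k=2)
print(f"fitted mean = {lam:.3f}")
print(f"P(a square is empty) = {p0:.4f}")
print(f"P(two or more empty out of 12) = {p_value:.4f}")
```

Compare the last number with whatever Type I error threshold you have chosen. Replacing the naive mean with a zero-truncated Poisson estimate would remove the bias mentioned above, at the cost of a slightly more involved fit.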
/Paul



