Math Forum » Discussions » sci.math.* » sci.stat.math.independent

Topic: A "plausible range" for a random variable
Replies: 9   Last Post: Jun 11, 2013 7:42 PM

Richard Ulrich

Re: A "plausible range" for a random variable
Posted: Jun 8, 2013 4:03 PM

On Sat, 8 Jun 2013 05:36:04 -0700 (PDT), Sergio Rossi
<deltaquattro@gmail.com> wrote:

>> 1)
>> Selecting Lower and Upper limits, L and U, as you notice, is
>> usually done symmetrically (if not one-tailed). That is done,
>> mainly, for lack of any other good reason to outweigh the
>> equal emphasis on each end.
>>
>> Statistical estimation theory occasionally looks at the
>> "narrowest" CI. That is *the* important characteristic
>> of one-tailed tests, determining UMP (Uniformly Most
>> Powerful). Because tails can be asymmetrical, no two-tailed
>> test is UMP.
>>
>> Decision theory would suggest that you apply a loss-function
>> to determine what degree of asymmetry might apply -- I
>> was intrigued, long ago, by the suggestion that the "power" of
>> standard research might be improved by splitting the conventional
>> 5% into 4% at the "expected" end and 1% at the other end,
>> for a gain in general power without losing the right to report
>> stronger effects in the opposite direction. I read that at least
>> 30 years ago, so you can see that the idea never caught on.
>>
>> 2)
>> A parametric approach to L and U for Extreme Values is not
>> going to be at all efficient. What is used for estimation is what
>> your bootstrapping would converge to: the CI for L
>> (or U) based on rank order in the original sample.
>>
>> Poisson consideration gives a good approximation for small
>> proportions. This is applied for your N=2000, 2 1/2%, as follows.
>>
>> Rank 50 is the point estimate of L. The +/- 2SD range for Poisson
>> can be estimated as ( Square(Sqrt(L) - 1), Square(Sqrt(L) + 1) )
>>
>> The square root of 50 is about 7.07; the square of 6.07 is
>> about 37, and the square of 8.07 is about 65. That gives
>> (approximately) the CI for L=50 as (37, 65).
>>
>> --
>> Rich Ulrich

>
>Great! Thanks a lot, Rich, that's precisely what I needed. You mention that this Poisson approximation is valid for small proportions: what about U, i.e., the 97.5-th percentile? This is a "big" proportion, I mean, it's close to 1. Can I still claim that the +/- 2SD CI for U=1950 is given by [(sqrt(U)-1)^2, (sqrt(U)+1)^2]? Or do I need another formula?
>Also, which is the formula for a general C.I., like 90% or 99% or 99.9%, etc.? When using the normal approximation, one substitutes the value 1.96 with the corresponding percentile of the normal distribution, z_{1/2+alpha/2} where alpha={.90, .95, .99,...} etc. I'm not familiar with this Poisson approximation, however, so I don't know how to proceed in this case.
>
>Thanks again,


Consider the 97.5-th percentile to be a 2.5-th percentile
when you count from the other end. So when you take
rank 50 as a point estimate with CI of (37, 65), you have
the symmetrical values at the other end of the N values:
rank (N-50) with CI (N-65, N-37). For N=2000, that is
rank 1950 with CI (1935, 1963).
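
The mirroring can be written out as a small Python sketch. The function name `mirror_rank_interval` is made up for illustration; it only re-expresses ranks counted from the bottom as ranks counted from the top of N sorted values.

```python
def mirror_rank_interval(n, lo_rank, hi_rank):
    """Map a rank interval counted from the bottom of n sorted
    values to the matching interval counted from the top."""
    return (n - hi_rank, n - lo_rank)

n = 2000
print(n - 50)                           # point estimate for U
print(mirror_rank_interval(n, 37, 65))  # rank interval for U
```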

The Poisson is the distribution observed in random counts.
When a proportion is small, you can use Poisson as a good
approximation to what you would get for the Binomial, which
is the more general case of "counts out of a total".
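
As a rough check on how close the two are for N=2000 and a 2.5% proportion, the sketch below compares the standard deviation of the count under each model. It uses only the textbook variance formulas (n*p*(1-p) for the Binomial, the mean itself for the Poisson); the variable names are illustrative.

```python
import math

n, p = 2000, 0.025
mean = n * p                              # expected count in the tail

binomial_sd = math.sqrt(n * p * (1 - p))  # exact SD of the count
poisson_sd = math.sqrt(mean)              # Poisson approximation

print(round(binomial_sd, 2), round(poisson_sd, 2))
```

The Poisson SD is slightly larger, so the Poisson-based interval is, if anything, a little wide for this proportion.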

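The theory for the CI is that you can estimate the variance
of a transformation g(X) from the derivative of g (the "delta
method"): Var[g(X)] is approximately g'(m)^2 * Var[X], where
m is the mean of X. For a Poisson count, Var[X] = m; for
g = sqrt, g'(m) = 1/(2*sqrt(m)). So Var[sqrt(X)] is about
m/(4m) = 1/4, and the standard deviation of the sqrt(Poisson)
= 1/2 (approximately), whatever the mean.
And the distribution of the sqrt(Poisson) is very close to
normal, once the counts are above a few.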
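
The claim that the SD of sqrt(Poisson) is close to 1/2 can be checked numerically. The sketch below is one way to do it, not part of the original argument: it sums the Poisson probabilities directly, and the function name `sqrt_poisson_sd` and the cutoff `k_max` are choices made here for illustration.

```python
import math

def sqrt_poisson_sd(mean, k_max=1000):
    """SD of sqrt(X) for X ~ Poisson(mean), by summing the pmf.
    Assumes mean is small enough that exp(-mean) does not
    underflow (roughly mean < 700) and k_max is far in the tail."""
    p = math.exp(-mean)          # P(X = 0)
    e_sqrt = 0.0                 # running E[sqrt(X)]
    for k in range(1, k_max + 1):
        p *= mean / k            # P(X = k) from P(X = k-1)
        e_sqrt += math.sqrt(k) * p
    # Var[sqrt(X)] = E[X] - (E[sqrt(X)])^2, and E[X] = mean
    return math.sqrt(mean - e_sqrt ** 2)

for m in (5, 20, 50):
    print(m, round(sqrt_poisson_sd(m), 3))
```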

Thus, the +/- 2SD range around the sqrt(Poisson) is the
95% CI, to a pretty good approximation. With SD = 1/2, that
range is +/- 1. Then you square the endpoints to get the
(slightly asymmetrical) CI for the original count. You are
typically going to round the final results to integers, since
this is the CI of "counts".
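
A minimal sketch of the whole recipe, assuming the goal is an interval for the 2.5-th percentile of a sample of 2000 values. The helper name `poisson_rank_ci` is hypothetical, and the sample is synthetic (seeded normal draws), used only to show how the rank interval turns into an interval of data values.

```python
import math
import random

def poisson_rank_ci(rank, multiplier=2.0):
    """Rank interval from +/- multiplier * (SD = 1/2) on the
    square-root scale, squared back and rounded to integers."""
    half_width = multiplier * 0.5
    root = math.sqrt(rank)
    return round((root - half_width) ** 2), round((root + half_width) ** 2)

# Synthetic sample for illustration only.
random.seed(0)
sample = sorted(random.gauss(0.0, 1.0) for _ in range(2000))

lo_rank, hi_rank = poisson_rank_ci(50)
point = sample[50 - 1]                   # rank 50, 1-based
interval = (sample[lo_rank - 1], sample[hi_rank - 1])
print(lo_rank, hi_rank)
print(point, interval)
```

The interval for the percentile is read off as the order statistics at the two ranks, so no distributional form is assumed for the data themselves.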

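When you want some CI other than 95%, two-tailed, you
multiply the SD=1/2 by some multiplier other than 2.0:
the normal percentile you would use with the normal
approximation (about 1.645 for 90%, 2.576 for 99%).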
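
For an arbitrary two-tailed level, one reading of this is to take the multiplier from the standard normal quantile, as in the sketch below. The function name `poisson_rank_ci_level` is invented for illustration, and the mapping from level to multiplier is an assumption based on the normal approximation on the square-root scale. Note that 95% gives 1.96 rather than the rounded 2.0.

```python
import math
from statistics import NormalDist

def poisson_rank_ci_level(rank, level=0.95):
    """Unrounded rank interval for a two-tailed confidence level.
    Assumes rank is large enough that sqrt(rank) exceeds the
    half-width, so the lower endpoint is meaningful."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # e.g. 1.96 for 95%
    half_width = 0.5 * z                         # SD = 1/2 on sqrt scale
    root = math.sqrt(rank)
    return (root - half_width) ** 2, (root + half_width) ** 2

for level in (0.90, 0.95, 0.99):
    lo, hi = poisson_rank_ci_level(50, level)
    print(level, round(lo, 1), round(hi, 1))
```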


--
Rich Ulrich



