Topic: Adjustment of alpha to sample size (Prof Rubin, Prof Koopman et al., could you help this amateur again?)
Replies: 2   Last Post: Mar 4, 2013 5:31 PM

 Gaj Vidmar Posts: 21 Registered: 12/13/04
Re: Adjustment of alpha to sample size (Prof Rubin, Prof Koopman et al., could you help this amateur again?)
Posted: Mar 4, 2013 5:31 PM

Dear Professor Rubin,

I will try to study your paper and your explanation, or at least urge my
more mathematically apt and Bayesian colleagues to do so (so that they can
benefit from your ideas and explain things to me).

In my current paper, I will make sure that the wording does not in any way
imply that the alpha adjustment is Bayesian, just that it is related to a
Bayesian line of thought.

By the way, though I've got the skills and the habit to search for
references far and wide, I haven't come across anything in outlier detection
that would resemble my approach. So if you can give me a hint where to look
for that 19th century proposal that you mention, I will be extra grateful
and honoured.

Best regards,
Gaj Vidmar

"Herman Rubin" wrote in news:slrnkj1vtf.9uk.hrubin@skew.stat.purdue.edu ...

On 2013-02-28, Gaj Vidmar <gaj.vidmar@guest.arnes.si> wrote:
> size.

> I'm using it (broadly speaking, to roughly clarify) in the field of SPC/SQC
> ("eclectic", "pragmatic", i.e., anything goes that allegedly works :)
> My sort-of-control-chart involves a prediction interval from regression
> (through the origin, if anyone remembers or wants to search for the original
> post), where I adjust the confidence level (i.e., 1-alpha) so that there is
> always one half of a point to be expected outside the limits
> (maybe bizarre or silly, maybe not - let's leave that aside). So my formula is

> (1-alpha) = (n-0.5)/n
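
Read as a rule for alpha, the quoted formula gives alpha = 0.5/n, so the expected number of points outside the limits is n * alpha = 0.5 for every n. A minimal Python sketch of that arithmetic follows. The function names and the `expected_outside` parameter are illustrative assumptions, not anything taken from the post.

```python
def adjusted_alpha(n, expected_outside=0.5):
    """Alpha chosen so that n * alpha equals expected_outside.

    With the default of 0.5 this is the quoted rule (1 - alpha) = (n - 0.5) / n.
    The parameter name is hypothetical; the post only uses the value 0.5.
    """
    if n <= expected_outside:
        raise ValueError("n must exceed expected_outside so that alpha < 1")
    return expected_outside / n


def confidence_level(n, expected_outside=0.5):
    """Confidence level 1 - alpha for a sample of size n."""
    return 1.0 - adjusted_alpha(n, expected_outside)


if __name__ == "__main__":
    for n in (5, 10, 20, 50, 100):
        alpha = adjusted_alpha(n)
        # n * alpha stays at 0.5 regardless of n, by construction.
        print(n, alpha, confidence_level(n), n * alpha)
```

Under this rule alpha shrinks like 1/n, which is the property the rest of the thread compares against Prof Rubin's rate.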

This was proposed in the 19th century to test for outliers.
The argument was not in any way Bayesian.

> At first, I had called this approach (half seriously) idiot's FDR, but
> convinced me that albeit not entirely unrelated (as a concept),
> FDR is always and only about multiple tests (to put it simply).
> Fortunately, I've found three Bayesian references that
> (at least my blockheadness guesses so)
> argue for this type of thinking (listed below; the closest to something I
> can at least partly understand is #3). Should be enough, especially because
> Bayesian is even more "in" than FDR (although I'm even more clueless about
> it, but the referees don't know that, and I'm skilled at conning scientific
> journal readers :o)

> However, in the newsgroup, Prof Herman Rubin had written that "The level
> should decrease with increasing sample size. In low dimensional problems,
> with the cost of incorrect acceptance going as the k-th power of the error,
> the rate at which the level decreases should be about 1/n^((d+k)/2)."

> So it would be wonderful if one of you wizards came up with some "simple
> algebra" (as it's so often said in the literature when the opposite is true
> for mere mortals) that relates Prof Rubin's formula to mine!

> I can guess that "the level" refers to alpha and that my problem is
> low dimensional (1D). "Incorrect acceptance" most probably means
> "incorrect acceptance of the null hypothesis".
> And let's say I take k to be 2 (by analogy with the notorious Taguchi,
> or because quadratic loss sounds familiar to many people in many fields).
> So far so good; but what does d stand for??

I suggest you read my paper with Sethuraman, "Bayes risk efficiency",
in _Sankhya_ 1965, pp. 325-346. Here d is the number of degrees
of freedom for the alternative, such as testing a p-dimensional
null in a (p+d)-dimensional parameter space, and k is the exponent
in the Type II loss. One is not restricted to the Bayes procedure,
but a prior Bayes risk minimization for a given test procedure is
considered. The rate can have factors whose logarithm is o(ln(n)),
and convergence is not great.
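
To make the roles of d and k concrete, here is a small Python sketch of the power-law part of the rate. It assumes the quoted rate is to be read as 1/n^((d+k)/2), which is an interpretation of the garbled parenthesization in the quote, and it ignores the constants and the slowly varying factors mentioned above. The function names are hypothetical.

```python
def rate_exponent(d, k):
    """Exponent (d + k) / 2, assuming the rate reads 1 / n**((d + k) / 2).

    d: degrees of freedom for the alternative.
    k: exponent in the Type II loss.
    """
    return (d + k) / 2.0


def level_order(n, d, k):
    """Order of magnitude of the level, up to unspecified constants."""
    return n ** (-rate_exponent(d, k))


if __name__ == "__main__":
    # d = 1, k = 2: the one-dimensional, quadratic-loss guess from the question.
    # d = 2, k = 0: two degrees of freedom with a constant Type II loss,
    #               taking "constant loss" to mean k = 0 (an assumption).
    for d, k in ((1, 2), (2, 0)):
        print(d, k, rate_exponent(d, k), level_order(100, d, k))
```

On this reading, d = 2 with k = 0 gives an exponent of 1, that is, a level of order 1/n, while d = 1 with k = 2 gives a faster decrease of order n^(-3/2).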

A simple example is testing that the mean of a normal distribution
with identity covariance matrix in 2 dimensions is 0, with a
constant Type II loss and constant prior density for the alternative;
do not worry about the infinite integral. The probability of
rejecting if the mean square of the observations exceeds 2K is
exp(-nK), and the integrated probability of a Type II error is
a multiple of K. So the prior Bayes risk is A exp(-nK) + BK,
and this is minimized if exp(-nK) = B/(nA). So in this case the
p-value should be B/(nA). Other cases do not come out in nice
closed form.
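
The minimization in this example can be checked numerically. Setting the derivative of A exp(-nK) + BK to zero gives -nA exp(-nK) + B = 0, hence exp(-nK) = B/(nA) and K = ln(nA/B)/n. The Python sketch below compares that closed form with a plain ternary search. The values of A and B, the search interval, and all function names are assumptions chosen only for illustration.

```python
import math


def bayes_risk(K, n, A, B):
    """Prior Bayes risk A*exp(-n*K) + B*K from the example."""
    return A * math.exp(-n * K) + B * K


def optimal_K(n, A, B):
    """Closed-form minimizer; needs n*A > B so that K is positive."""
    if n * A <= B:
        raise ValueError("closed form assumes n*A > B")
    return math.log(n * A / B) / n


def optimal_level(n, A, B):
    """Rejection probability exp(-n*K) at the minimizer, i.e. B/(n*A)."""
    return B / (n * A)


def numeric_minimizer(n, A, B, iters=200):
    """Ternary search on [0, 50/n]; valid because the risk is convex in K.

    The upper bound 50/n is an arbitrary illustrative choice.
    """
    lo, hi = 0.0, 50.0 / n
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if bayes_risk(m1, n, A, B) < bayes_risk(m2, n, A, B):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    A, B = 1.0, 0.5  # hypothetical loss constants
    for n in (10, 100, 1000):
        print(n, optimal_K(n, A, B), numeric_minimizer(n, A, B),
              optimal_level(n, A, B))
```

With B/A = 0.5 the optimal level B/(nA) equals 0.5/n, so for that particular (assumed) ratio of constants it has the same form as the (n-0.5)/n rule quoted earlier in the thread.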

> Fortunately (or un-, from a broader perspective) my paper (currently under
> review) should get accepted even without such elegant justification. And all
> I can do to return the favour is an acknowledgement. (Needless to say,
> co-authorship is not something that the two Profs I mention in the title --
> and other newsgroup "heavyweights", especially the retired ones -- want or
> need, anyway).

> So, as usual, thanks in advance for any help.

> Gaj Vidmar

> References:
> 1. Seidenfeld T, Schervish MJ, Kadane JB. Decisions without ordering.
> In "Acting and Reflecting", Sieg W (ed). Kluwer: Dordrecht, 1990, 143-170.
> 2. Berry S, Viele K. Adjusting the alpha-level for sample size.
> Carnegie Mellon University Department of Statistics Technical Report 1995; 635.
> http://www.stat.cmu.edu/tr/tr635/tr635.ps
> 3. Berry S, Viele K. A note on hypothesis testing with random sample sizes
> and its relationship to Bayes factors. Journal of Data Science 2008; 6(1): 75-87.

--
This address is for information only. I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Department of Statistics, Purdue University
hrubin@stat.purdue.edu Phone: (765)494-6054 FAX: (765)494-0558

Date Subject Author
2/28/13 Gaj Vidmar
3/1/13 Herman Rubin
3/4/13 Gaj Vidmar