Topic: different priors (flat, uniform, etc)
Replies: 33   Last Post: Nov 5, 2006 7:50 AM

 Anon. Posts: 379 Registered: 6/2/05
Re: different priors (flat, uniform, etc)
Posted: Nov 5, 2006 7:50 AM

Herman Rubin wrote:
> In article <_u%0h.40667$nQ2.12889@reader1.news.jippii.net>,
> Anon. <bob.ohara@NOSPAMhelsinki.fi> wrote:

>> Reef Fish wrote:
>>> David Winsemius wrote:
>>>> "Reef Fish" <large_nassua_grouper@yahoo.com> wrote in

>
>>>>> For Bayesian Inference on the parameter p of a Binomial distribution
>>>>> or a Bernoulli Process, the beta distribution is a member of the
>>>>> conjugate prior family -- meaning both the prior AND posterior
>>>>> belong to the same distribution family -- Beta.

>
>>>>> The uniform distribution on (0,1) is a Beta distribution with
>>>>> parameters (1,1) and is an INFORMATIVE prior.

>>>> Can we hear a bit more about how Beta(1,1) is an informative prior for a
>>>> binomial problem?

>
>>> It CHANGES the likelihood function to form the posterior distribution.
>
>> But what does this mean? I guess you could mean something similar to
>> the way Fisher treated likelihood: he waved his Fiducial wand, and the
>> conditioning magically reversed. Of course, the Bayesian version does
>> this formally.

>
>> The problem with this interpretation is that any prior will have the
>> same effect, so there would be no such thing as a non-informative prior.
>> As non-informative priors are defined and discussed in the
>> literature, they do exist.

>
> Is there such a thing as a non-informative prior? I see no
> justification for such, and good reasons not to use such.
>

I take a descriptive approach to definitions, so there is such a thing
as a non-informative prior, simply because people use the term. Whether
they should is a matter that could be discussed endlessly, and it
certainly wasn't my aim to take a firm stand either way in this thread.

> For some problems, invariant priors are used, with the best
> invariant prior being the right invariant Haar measure for
> the transformation group. Priors should be looked upon as
> weight functions, rather than belief, and hence can have an
> infinite integral. The usual argument given for invariant
> priors is that if one has a location problem, it matters not
> where the origin is located, or if one has a scale problem,
> the units do not matter.
>
> Now it is correct that the same results should be obtained
> if the units are inches or meters, but this does not mean
> that the inference should be the same if the numbers given
> are the same. There are invariant problems in which there
> are priors giving uniformly better results than invariant
> priors, and these are not "unusual"; estimating the
> covariance matrix of a multivariate normal is there already.
>

>> Non-informative priors are generally defined as priors which only add a
>> small amount of information, as compared to the likelihood. How does
>> the beta(1,1) shape up?

>
> What does this mean? If the sample size is large enough,
> and the dimension is small enough, and the prior is "smooth",
> it makes essentially no difference.
>

Indeed: but of course that isn't always the case, and I was trying to
pin down a specific comment by Reef Fish.
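
As a rough illustration of the sample-size point, here is a minimal Python sketch comparing posterior means under three beta priors for a small and a large sample. The data (3 successes in 10 trials, 3000 in 10000) and the beta(5,5) prior are assumptions chosen for illustration; they do not come from the thread. The only formula used is the standard posterior mean of a beta posterior, (n + alpha) / (N + alpha + beta).

```python
def posterior_mean(n, N, alpha, beta):
    """Mean of the Beta(n + alpha, N - n + beta) posterior."""
    return (n + alpha) / (N + alpha + beta)

# Illustrative priors (assumed): two "vague" ones and one mildly informative.
priors = {
    "beta(1,1)": (1.0, 1.0),
    "beta(.5,.5)": (0.5, 0.5),
    "beta(5,5)": (5.0, 5.0),
}

def spread(n, N):
    """Largest minus smallest posterior mean across the priors above."""
    means = [posterior_mean(n, N, a, b) for a, b in priors.values()]
    return max(means) - min(means)

spread_small = spread(3, 10)        # small sample: the priors visibly disagree
spread_large = spread(3000, 10000)  # large sample: the priors nearly agree

print(f"N=10:    spread of posterior means = {spread_small:.4f}")
print(f"N=10000: spread of posterior means = {spread_large:.6f}")
```

With the same observed proportion, the disagreement between priors shrinks as N grows, which is the sense in which a smooth prior "makes essentially no difference" for large samples in a low-dimensional problem.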

>> For the binomial with n successes in N trials, the likelihood (up to a
>> normalising constant) is:
>
>> L(p | n) = p^n (1-p)^(N-n)
>
>> The pdf of a beta distribution is:
>
>> P(p) = K p^(alpha-1) (1-p)^(beta-1)
>
>> (where K is a normalising constant) so the posterior is
>
>> P(p | n) = K_p p^(n+alpha-1) (1-p)^(N-n+beta-1)
>
>> For a beta(1,1), this becomes:
>
>> P(p | n) = K_p p^n (1-p)^(N-n)
>
>> i.e. algebraically the same as the likelihood. In other words, it
>> doesn't add any information to the likelihood. This is pretty much
>> definitive of a "non-informative prior".
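
The algebra above can be checked numerically. The following is a minimal sketch, assuming n = 7 successes in N = 20 trials (made-up numbers, not from the thread). It evaluates the beta posterior that results from a beta(1,1) prior and divides it by the unnormalised likelihood at several values of p. If the two are "algebraically the same", the ratio should be the same constant K_p everywhere.

```python
import math

def beta_pdf(p, a, b):
    """Density of Beta(a, b) at p, for 0 < p < 1 and a, b > 0."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(p) + (b - 1) * math.log(1 - p))

def posterior_params(n, N, alpha, beta):
    """Conjugate update: Beta(alpha, beta) prior -> Beta(n + alpha, N - n + beta)."""
    return n + alpha, N - n + beta

def likelihood(p, n, N):
    """Binomial likelihood up to a normalising constant."""
    return p**n * (1 - p) ** (N - n)

n, N = 7, 20  # assumed data, for illustration only
a_post, b_post = posterior_params(n, N, 1.0, 1.0)

grid = [0.1, 0.3, 0.5, 0.7, 0.9]
ratios = [beta_pdf(p, a_post, b_post) / likelihood(p, n, N) for p in grid]

for p, r in zip(grid, ratios):
    print(f"p={p:.1f}  posterior / likelihood = {r:.6g}")
```

Repeating the check with another prior, for example `posterior_params(n, N, 0.5, 0.5)`, gives a ratio that varies with p, because the prior then contributes its own factor in p.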

>
> So should one use a beta(1,1) or a beta(.5,.5) or a beta(0,0)?
> This latter would use the density 1/(p - p^2), which is the
> Fisher information for a single trial. This and its square root have
> been suggested, and in the case of an invariant problem, will
> automatically give an invariant procedure, which may be quite
> bad throughout the parameter space.
>

So, the invariant approach may not be the best in all cases. I guess
almost any "non-informative", "vague", "objective" approach to
developing priors will break down in some circumstances.
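
One simple kind of breakdown can be sketched in a few lines of Python. This is not the decision-theoretic point Herman makes about invariant procedures; it only uses the fact that a Beta(a, b) density is proper when both a > 0 and b > 0. The sample size N = 10 and the success counts are assumptions for illustration.

```python
def posterior_params(n, N, alpha, beta):
    """Beta posterior parameters after n successes in N trials."""
    return n + alpha, N - n + beta

def is_proper(a, b):
    """A Beta(a, b) density has a finite integral only if a > 0 and b > 0."""
    return a > 0 and b > 0

priors = {
    "beta(1,1)": (1.0, 1.0),
    "beta(.5,.5)": (0.5, 0.5),
    "beta(0,0)": (0.0, 0.0),  # improper prior with density 1/(p - p^2)
}

N = 10  # assumed sample size
for n in (0, 3, 10):
    for name, (alpha, beta) in priors.items():
        a, b = posterior_params(n, N, alpha, beta)
        status = "proper" if is_proper(a, b) else "improper"
        print(f"n={n:2d}  {name:12s} -> Beta({a}, {b})  {status}")
```

Under the beta(0,0) prior the posterior is a proper distribution only when both successes and failures have been observed; with n = 0 or n = N it cannot be normalised.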

Bob

--
Bob O'Hara
Department of Mathematics and Statistics
P.O. Box 68 (Gustaf Hällströmin katu 2b)
FIN-00014 University of Helsinki
Finland

Telephone: +358-9-191 51479
Mobile: +358 50 599 0540
Fax: +358-9-191 51400
WWW: http://www.RNI.Helsinki.FI/~boh/
Journal of Negative Results - EEB: www.jnr-eeb.org
