I noticed the sloppiness as well. Nevertheless, philosophically I don't understand what "actual pre-knowledge" and "infinite pre-knowledge" mean. Could you elaborate on that? Does it make a difference whether my hypotheses come from a constrained set or from the set of all computable distributions?

On Monday, February 18, 2013 3:16:59 PM UTC+1, David Jones wrote:
> "Cagdas Ozgenc" wrote in message
> news:firstname.lastname@example.org...
>
> Hello,
>
> I am confused with the usage of Bayes with model selection.
>
> I frequently see the following notation:
>
> P(H | D) = P(D | H)*P(H) / P(D) where H is hypothesis and D is data.
>
> It's Bayes rule. What I don't understand is the following. If in reality D ~
> N(m,v) and my hypothesis is that D ~ (m',v) where m is different from m' and
> if all hypothesis are equally likely
>
> P(D) = sum P(D|H)*P(H)dH is not equal to true P(D), or is it?
>
> =======================================================================
>
> The standard notation is sloppy notation. If you use "K" to represent what
> is known before observing data "D", then
>
> P(H | D,K) = P(D | H,K)*P(H|K) / P(D|K)
>
> and then go on as you were, you get
>
> P(D |K) = sum P(D|H,K)*P(H|K) dH
>
> ... which at least illustrates your concern.
>
> "True P(D)" can be thought of as P(D | infinite pre-knowledge), while Bayes'
> Rule requires P(D |K)=P(D |actual pre-knowledge).
>
> David Jones
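
To make the quoted point concrete, here is a minimal numerical sketch of P(D|K) = sum P(D|H,K)*P(H|K) for a small hypothesis set. The specific numbers are my own assumptions for illustration and do not come from the thread: true mean m = 0, shared variance v = 1, and two equally likely hypotheses m' = -1 and m' = +1. The function names are hypothetical helpers, not part of any library.

```python
import math


def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def marginal_density(x, hypotheses, prior, var):
    """P(D=x | K) = sum over H of P(D=x | H, K) * P(H | K).

    Each hypothesis H is a candidate mean; `prior` holds P(H | K).
    """
    return sum(p * normal_pdf(x, m, var) for m, p in zip(hypotheses, prior))


# Assumed setup (illustrative only, not from the original posts).
TRUE_MEAN = 0.0           # the "m" of the data-generating N(m, v)
VAR = 1.0                 # the shared variance "v"
HYPOTHESES = [-1.0, 1.0]  # candidate means m', neither equal to m
PRIOR = [0.5, 0.5]        # "all hypotheses are equally likely"

true_at_zero = normal_pdf(0.0, TRUE_MEAN, VAR)
marginal_at_zero = marginal_density(0.0, HYPOTHESES, PRIOR, VAR)

print("true density at D=0:     ", true_at_zero)
print("P(D=0 | K) from the prior:", marginal_at_zero)
```

Under these assumptions the two values differ, yet P(D|K) is still a proper density: it is the prediction that follows from what K says about the hypotheses, not a claim about the data-generating distribution. If the hypothesis set contained the true mean and the prior put all of its mass there, the two would coincide, which is one way to read "infinite pre-knowledge".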