Re: Probabilities always >= 0 and <= 1?
Posted: May 8, 2012 10:01 AM
> On May 8, 10:35 am, FFMG <spambuc...@myoddweb.com> wrote:
> > Hi,
> >
> > I was looking at a site,
> > (http://bionicspirit.com/blog/2012/02/09/howto-build-naive-bayes-class...),
> > basically talking about a Naive Bayes Classifier.
> >
> > But in some cases the formula gives me probabilities greater than 1.
> > How is it possible?
> >
> > // Total of 18 documents.
> > // * 9 documents out of a total of 18 are spam messages
> > // * 3 documents out of those 18 contain the word "naughty"
> > // * 3 documents containing the word "naughty" have been marked as spam
> > // * 3 documents out of the total contain the word "money"
> > // * 3 emails out of those have been marked as spam
> >
> > P(spam|naughty,money) = P(naughty|spam) * P(money|spam) * P(spam)
> >                         ------------------------------------------
> >                                   P(naughty) * P(money)
> >
> > P(spam|naughty,money) = 3/9 * 3/9 * 9/18  = 2
> >                         ----------------
> >                           3/18 * 3/18
> >
> > But how can a probability be outside of 0 and 1?
> > Must I always force the numbers to be between 0 and 1 and accept
> > that in some cases they will fall outside the range?
> >
> > Many thanks for suggestions as to where I might have gone wrong.
> >
> > Regards,
> >
> > FFMG
>
> You have made an incorrect independence assumption. As both "naughty"
> and "money" are only present in "spam" documents, which form half of
> the total number of documents, they are dependent variables. But, you
> calculate p(e) as p(money) * p(naughty) which is assuming that the
> variables are independent. Hence your problem.
You're joking, right? Making the kind of mistake you describe could not result in a probability greater than one - he's just multiplying numbers that are all less than one. Could the mistake be in the calculation 3/9*3/9*9/18=2? hmmmmm...

FFMG, if your question was sincere, the first thing to address is how to multiply fractions: a/b * c/d = ac/bd. So 3/9 * 3/9 * 9/18 = 9/81 * 9/18 = 81/(81*18) = 1/18.
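
For anyone who wants to check the arithmetic by machine, here is a minimal Python sketch using the standard-library `fractions` module. The variable names are mine, not from the thread; the counts are the ones given in FFMG's post. It evaluates the product of the three fractions and, separately, the full quotient as it is written in the quoted post.

```python
from fractions import Fraction

# Counts taken from the quoted post; the variable names are illustrative only.
p_naughty_given_spam = Fraction(3, 9)   # 3 of the 9 spam documents contain "naughty"
p_money_given_spam = Fraction(3, 9)     # 3 of the 9 spam documents contain "money"
p_spam = Fraction(9, 18)                # 9 of the 18 documents are spam
p_naughty = Fraction(3, 18)             # 3 of the 18 documents contain "naughty"
p_money = Fraction(3, 18)               # 3 of the 18 documents contain "money"

# Product of the three fractions above the line in the quoted formula.
numerator = p_naughty_given_spam * p_money_given_spam * p_spam

# Product of the two fractions below the line.
denominator = p_naughty * p_money

# The full expression as written in the quoted post.
quotient = numerator / denominator

print(numerator)    # 1/18
print(denominator)  # 1/36
print(quotient)     # 2
```

The product of the three fractions on its own is 1/18. The value 2 in the quoted post is what comes out once that product is divided by 3/18 * 3/18 = 1/36.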