Date: Aug 9, 2005 8:38 AM
Author: marquito
Subject: the math in classify.m
Hi!

In my thesis I'm using the Matlab 'classify' function with linear discrimination to classify data. To see what the math behind it is, I looked into the code and found this:

=====================================================

% Pooled estimate of covariance
[Q,R] = qr(training - gmeans(gindex,:), 0);
R = R / sqrt(n - ngroups); % SigmaHat = R'*R
s = svd(R);
if any(s <= eps^(3/4)*max(s))
    error('The pooled covariance matrix of TRAINING must be positive definite.');
end

% MVN relative log posterior density, by group, for each sample
for k = 1:ngroups
    A = (sample - repmat(gmeans(k,:), mm, 1)) / R;
    D(:,k) = log(prior(k)) - .5*sum(A .* A, 2);
end

======================================================

I don't know exactly what is going on there. I expected to see something like a multivariate Gaussian density, like:

p(x|class) = 1/sqrt((2*pi)^d * |Sigma|) * exp(-1/2 * (x-mu)' * Sigma^-1 * (x-mu))

or something similar. Could somebody verify this, or explain what kind of magic the programmer used (why a QR decomposition?)

I'm also very interested in how 'D' is calculated, since I use this value to show the distance of a sample to the different classes. Shouldn't it be the probability density of x for each class?
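
To make the question concrete, here is a small sketch of the check I have in mind. It is my own test code, not part of classify.m: the toy data and the names X, g, x0, Xc and SigmaHat are made up for illustration. It computes the quadratic term once through the triangular factor R, the way the snippet above seems to do it, and once through an explicitly formed pooled covariance matrix.

=====================================================
% Toy data (invented for illustration): two groups in 2-D
X  = [1.0 2.1; 1.9 3.2; 3.1 3.9; 4.0 5.2; 5.1 5.8; 6.0 7.1];
g  = [1; 1; 1; 2; 2; 2];      % group index of each training row
x0 = [3 4];                   % a single sample to classify
[n, d]  = size(X);
ngroups = 2;

% Group means, then training data centered on its own group mean
gmeans = zeros(ngroups, d);
for k = 1:ngroups
    gmeans(k,:) = mean(X(g == k, :), 1);
end
Xc = X - gmeans(g, :);

% Route 1: triangular factor from the economy-size QR, as in the snippet
[Q, R] = qr(Xc, 0);
R = R / sqrt(n - ngroups);    % now R'*R equals the pooled covariance

% Route 2: pooled covariance formed explicitly
SigmaHat = (Xc' * Xc) / (n - ngroups);

for k = 1:ngroups
    dx = x0 - gmeans(k,:);
    A  = dx / R;                         % solves A*R = dx
    q_qr       = sum(A .* A, 2);         % dx * inv(R'*R) * dx'
    q_explicit = dx / SigmaHat * dx';    % dx * inv(SigmaHat) * dx'
    fprintf('group %d: %g (via R) vs %g (explicit)\n', k, q_qr, q_explicit);
end
=====================================================

If I read the code correctly, the two numbers should agree up to rounding, which would mean that D is log(prior) minus half of a squared Mahalanobis-type distance rather than a full density value. I would appreciate confirmation of that reading.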

Thanks in advance!