    %% Display the results
    clc;
    disp('______________________________________ Results ______________________________________________________');
    disp(' ');
    % %g is used instead of %d because the error rates are fractions, not integers
    disp(sprintf('Resubstitution Error of LDA (Training Error calculated by Matlab built-in): %g', ldaResubErr));
    disp(sprintf('Resubstitution Error of LDA (Training Error calculated manually): %g', ldaResubErr2));
    disp(' ');
    disp('Confusion Matrix:');
    disp(ldaResubCM)
    disp(sprintf('Cross Validation Error of LDA (Leave One Out): %g', ldaCVErr));
    disp(' ');
    disp('______________________________________________________________________________________________________');

I. My first question is how to do feature selection, for example with forward or backward selection, or with t-test based methods.
I have seen that MATLAB has the `sequentialfs` function, but I am not sure how to incorporate it into my code.
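For reference, here is a minimal sketch of how `sequentialfs` might be wrapped around `classify`. It assumes the `fisheriris` sample data that ships with the Statistics Toolbox as a stand-in for my 11 features; `critFun`, `cvp` and `selectedIdx` are names made up for the illustration.

```matlab
% Sketch: forward feature selection wrapped around classify (LDA).
% fisheriris is only stand-in data; replace X and y with your own.
load fisheriris                         % meas: 150x4 features, species: labels
X = meas;
y = grp2idx(species);                   % numeric class labels 1..3

% Criterion: number of misclassified test samples for one train/test split.
% sequentialfs sums this over the folds and divides by the number of test samples.
critFun = @(Xtrain, ytrain, Xtest, ytest) ...
    sum(ytest ~= classify(Xtest, Xtrain, ytrain, 'linear'));

rng(1);                                 % reproducible partition
cvp = cvpartition(y, 'KFold', 10);      % stratified 10-fold partition
[inmodel, history] = sequentialfs(critFun, X, y, ...
    'cv', cvp, 'direction', 'forward');

selectedIdx = find(inmodel);            % column indices of the kept features
disp(selectedIdx);
disp(history.Crit);                     % criterion value after each added feature
```

Setting `'direction'` to `'backward'` should give backward elimination with the same criterion function.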
II. How do I use the MATLAB `classify` function to classify with more than 2 features? Should PCA be performed first? For example, we currently have 11 features; should we run PCA to produce 2 or 3 PCs and then run the classification? (I expect to write a loop that adds the features one by one as a forward feature selection, not just run PCA as a dimensionality reduction.)
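As far as I can tell, `classify` accepts any number of feature columns, so PCA would be optional rather than required. A minimal sketch, again assuming the `fisheriris` sample data (4 features) in place of my own feature matrix:

```matlab
% Sketch: LDA via classify on all feature columns at once, no PCA step.
load fisheriris
X = meas;                               % 150 samples x 4 features
y = species;                            % cell array of class names

% Classify the training data itself to get the resubstitution result.
[predicted, resubErr, posterior] = classify(X, X, y, 'linear');

cm = confusionmat(y, predicted);        % rows: true class, columns: predicted
fprintf('Resubstitution error: %.4f\n', resubErr);
disp(cm);
```

Restricting `X` to a subset of columns, for example `X(:, [1 3])`, would be the building block for a hand-written forward selection loop.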
III. I have also tried to run a ROC analysis. I referred to the webpage [enter link description here], which has an implementation of a simple LDA method that produces the linear scores of the LDA. We can then use `perfcurve` to get the ROC curve.
IIIa. However, I am not sure how to use the `classify` function with `perfcurve` to get the ROC curve.
IIIb. Also, how can the ROC analysis be combined with cross-validation?
IIIc. Once we have `OPTROCPT`, the optimal operating point of the ROC curve, how can we use the corresponding cut-off to produce a better classification?

    % Calculate linear discriminant coefficients
    ldaCoefficients = LDA(featureSelcted, groundTruthNumericalLable);

    % Calculate linear scores for the training data
    % (the column of ones needs one row per observation, i.e. per row of featureSelcted)
    ldaLinearScores = [ones(size(featureSelcted,1),1) featureSelcted] * ldaCoefficients';

    % Calculate class probabilities
    classProbabilities = exp(ldaLinearScores) ./ repmat(sum(exp(ldaLinearScores),2),[1 2]);

    % Fit probabilities for scores
    figure,
    [FPR, TPR, Thr, AUC, OPTROCPT] = perfcurve(groundTruthNumericalLable(:,1), classProbabilities(:,1), 0);
    plot(FPR, TPR, 'or-')
    xlabel('False positive rate (FPR, 1-Specificity)');
    ylabel('True positive rate (TPR, Sensitivity)')
    title('ROC for classification by LDA')
    grid on;

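One possible way to combine `classify`, cross-validation and `perfcurve` is to collect the out-of-fold posterior probabilities (the third output of `classify`) and feed them to `perfcurve` as scores. The sketch below assumes a two-class subset of the `fisheriris` sample data instead of my own data, and it assumes that the posterior columns follow the sorted class labels (0, then 1); `cvScores` and `optThr` are illustrative names.

```matlab
% Sketch: cross-validated ROC from classify posteriors, then apply the cut-off.
load fisheriris
keep = 51:150;                          % versicolor vs. virginica only
X = meas(keep, :);
y = double(strcmp(species(keep), 'virginica'));   % 1 = positive class

rng(1);                                 % reproducible partition
cvp = cvpartition(y, 'KFold', 10);
cvScores = zeros(size(y));              % out-of-fold posterior of class 1
for k = 1:cvp.NumTestSets
    trIdx = training(cvp, k);
    teIdx = test(cvp, k);
    [~, ~, post] = classify(X(teIdx,:), X(trIdx,:), y(trIdx), 'linear');
    cvScores(teIdx) = post(:, 2);       % assumed column order: class 0, class 1
end

[FPR, TPR, Thr, AUC, OPTROCPT] = perfcurve(y, cvScores, 1);

% Threshold that belongs to the optimal operating point [FPR TPR].
optThr = Thr(FPR == OPTROCPT(1) & TPR == OPTROCPT(2));
optThr = optThr(1);

% Re-label with the cut-off instead of the default 0.5 posterior rule.
predicted = double(cvScores >= optThr);
fprintf('CV AUC = %.3f, cut-off = %.3f, error = %.3f\n', ...
    AUC, optThr, mean(predicted ~= y));
```

Because the cut-off is chosen on the same out-of-fold scores it is evaluated on, the printed error is likely to be somewhat optimistic; a nested split would be needed for an unbiased estimate.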
IV. Currently, I compute the training and cross-validation errors with the `classify` and `crossval` functions. How can I get those values in a single summary using `classperf`?
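From what I understand, `classperf` (Bioinformatics Toolbox) keeps a running tally that is updated once per test fold. A minimal leave-one-out sketch, assuming the `fisheriris` sample data as a stand-in for my features and labels:

```matlab
% Sketch: leave-one-out LDA with the results accumulated in a classperf object.
load fisheriris
y = species;
n = numel(y);
cp = classperf(y);                      % initialise with the true labels

for i = 1:n
    testIdx = false(n, 1);
    testIdx(i) = true;                  % hold out a single observation
    predicted = classify(meas(testIdx,:), meas(~testIdx,:), y(~testIdx), 'linear');
    classperf(cp, predicted, testIdx);  % update the running summary
end

fprintf('LOO error rate: %.4f, correct rate: %.4f\n', cp.ErrorRate, cp.CorrectRate);
disp(cp.CountingMatrix);                % confusion counts accumulated over all folds
```

Displaying `cp` without a semicolon should list the remaining summary fields, such as `Sensitivity` and `Specificity`.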
V. If anyone knows a good tutorial on using the MATLAB Statistics Toolbox for machine learning tasks, with a full worked example, please tell me.
Some of the MATLAB Help examples are really confusing to me because they are presented in pieces, and I am a novice in machine learning. Sorry if some of my questions are not well posed. Thanks very much for your help.