Date: Feb 7, 2013 6:45 AM
Author: Greg Heath
Subject: Re: ANN_Error Goal

"Suresh" wrote in message <kerf2e$sps$1@newscl01ah.mathworks.com>...
> How to decide the value of Error goal while training a neural network for different pattern recognition problems ?

Unfortunately, there is no analytic relationship between the discontinuous classification error rate Nerr/N and continuous error (target-output). Therefore, classifiers are usually trained to minimize the continuous mean-squared-error even though low classification error rate is the ultimate goal.

Consequently, the same rule is used for regression with O-dimensional targets and for classification with O = c classes, where the columns of the target matrix are columns of the c-dimensional unit matrix eye(c). In each case the data provides Neq equations

Neq = N*O

to estimate Nw unknown weights. The resulting number of estimation degrees of freedom is

Ndof = Neq-Nw
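
As a small illustration of the bookkeeping above, here is a sketch for a classification case. The class labels and the value of Nw are made up for the example; only the eye(c) construction and the two formulas come from the text.

```matlab
% Hypothetical 3-class problem with N = 6 labeled patterns.
labels = [1 3 2 2 1 3];   % class index of each pattern (made-up data)
c      = 3;               % number of classes
E      = eye(c);          % c-dimensional unit matrix
target = E(:,labels);     % O-by-N target matrix, one unit column per pattern

[O,N] = size(target);     % O = c = 3, N = 6
Neq   = N*O;              % number of training equations
Nw    = 11;               % hypothetical weight count, for illustration only
Ndof  = Neq - Nw;         % estimation degrees of freedom
```

Each column of target contains a single 1 in the row of the true class, so O equals c automatically.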

The NAIVE MODEL assumes that the output is a constant equal to the mean of the
target values

y00 = repmat(mean(target,2),1,N); % O x N, each row is a constant
Nw00 = O % size(y00,1)
Ndof00 = Neq-O % (N-1)*O

The resulting biased mse is

MSE00 = sumsqr(target-y00)/Neq
MSE00 = mean(var(target',1)) % Proof for reader

The corresponding unbiased mse that is "a"djusted for the loss of degrees of freedom caused by using the same data to estimate the Nw00 weights is

MSE00a = sumsqr(target-y00)/Ndof00
MSE00a = mean(var(target')) % Proof for reader
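
The two "proof for reader" identities can be checked numerically. This is only a sketch on made-up targets; it assumes target is O-by-N and uses sum(sum(...)) as a base-MATLAB stand-in for the toolbox function sumsqr.

```matlab
% Made-up O-by-N target matrix (O = 2 outputs, N = 5 patterns).
target = [1 2 3 4 5; 2 0 1 3 4];
[O,N]  = size(target);
Neq    = N*O;

% Naive model: each output is the constant mean of its target row.
y00    = repmat(mean(target,2),1,N);
Ndof00 = Neq - O;                         % = (N-1)*O

SSE00  = sum(sum((target-y00).^2));       % stand-in for sumsqr(target-y00)
MSE00  = SSE00/Neq;                       % biased
MSE00a = SSE00/Ndof00;                    % adjusted for the O estimated means

% Differences between the direct and the var-based expressions.
checkBiased   = abs(MSE00  - mean(var(target',1)));
checkAdjusted = abs(MSE00a - mean(var(target')));
```

Both differences should be zero to within roundoff, because var(target',1) divides each row's sum of squares by N and var(target') divides by N-1.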

For more sophisticated models, the goal is to account for as much of the target data variance as possible. This is quantified by the normalized quantities

NMSE = MSE/MSE00
and
NMSEa = MSEa/MSE00a

with the ultimate design goal of 0 and a 100% representation of the target data variance.

Statisticians use the R-squared quantities (http://en.wikipedia.org/wiki/Coefficient_of_determination)

R2 = 1 - NMSE
R2a = 1 - NMSEa

with the ultimate design goal of 1 and a 100% representation of the target data variance.
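
Here is a minimal sketch of how the normalized quantities might be computed for a given model output. The targets, the outputs y, and the weight count Nw are all made up for illustration; they are not from any real design.

```matlab
target = [1 2 3 4 5 6];               % made-up 1-by-N targets
y      = [1.1 1.9 3.2 3.8 5.1 5.9];   % made-up model outputs
Nw     = 2;                           % hypothetical number of estimated weights

[O,N]  = size(target);
Neq    = N*O;
Ndof   = Neq - Nw;

SSE    = sum(sum((target-y).^2));     % sum of squared errors
MSE    = SSE/Neq;                     % biased
MSEa   = SSE/Ndof;                    % adjusted for Nw estimated weights

MSE00  = mean(var(target',1));        % naive-model reference, biased
MSE00a = mean(var(target'));          % naive-model reference, adjusted

NMSE   = MSE/MSE00;     R2  = 1 - NMSE;
NMSEa  = MSEa/MSE00a;   R2a = 1 - NMSEa;
```

With these numbers R2a comes out slightly below R2, which is the expected effect of the degree-of-freedom adjustment whenever Nw > O.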

I use the more practical design goal of R2a = 0.99 and a 99% unbiased representation of the unbiased target data variance. This yields

===========================================
MSEgoal = 0.01*Ndof*MSE00a/Neq % Proof for reader
===========================================
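
One way to see where the MSEgoal formula comes from is to write the two steps as comments and then run a round trip. The sizes and the value of MSE00a below are hypothetical.

```matlab
% Made-up sizes: N = 100 patterns, O = 2 outputs, Nw = 20 weights.
N = 100; O = 2; Nw = 20;
Neq    = N*O;
Ndof   = Neq - Nw;
MSE00a = 4.0;                     % hypothetical adjusted reference MSE

% Step 1: R2a = 1 - MSEa/MSE00a = 0.99   =>  MSEa = 0.01*MSE00a
% Step 2: MSEa = SSE/Ndof, MSE = SSE/Neq =>  MSE  = MSEa*Ndof/Neq
MSEgoal = 0.01*Ndof*MSE00a/Neq;

% Round trip: a net whose training MSE just reaches MSEgoal has R2a = 0.99.
SSE  = MSEgoal*Neq;
MSEa = SSE/Ndof;
R2a  = 1 - MSEa/MSE00a;
```

Note that the formula only gives a positive goal when Ndof > 0, i.e. when there are more training equations than weights.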

For a feedforward NEURAL NET model with I-H-O node topology,

Nw = (I+1)*H+(H+1)*O

net.trainParam.goal = MSEgoal;
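
Putting the pieces together, a sketch along these lines could be used. The sizes, labels and the choice of fitnet are assumptions for illustration; the toolbox part is guarded so the arithmetic still runs without the Neural Network Toolbox. Since the goal is compared with the net's performance function, the sketch sets net.performFcn to 'mse' explicitly.

```matlab
% Made-up problem: I = 4 inputs, H = 5 hidden nodes, c = 3 classes, N = 150.
I = 4; H = 5; c = 3; N = 150;
labels = repmat(1:c,1,N/c);            % made-up balanced class labels
E      = eye(c);
target = E(:,labels);                  % O-by-N unit-column targets
O      = size(target,1);

Nw   = (I+1)*H + (H+1)*O;              % 25 + 18 = 43 weights
Neq  = N*O;                            % 450 training equations
Ndof = Neq - Nw;                       % 407 degrees of freedom
if Ndof <= 0
    error('Nw >= Neq: reduce H or use more training data.');
end

MSE00a  = mean(var(target'));          % adjusted naive-model reference
MSEgoal = 0.01*Ndof*MSE00a/Neq;        % goal for R2a = 0.99

% Toolbox part (skipped if fitnet is not on the path).
if exist('fitnet','file') == 2
    net = fitnet(H);
    net.performFcn = 'mse';            % goal is in units of mean squared error
    net.trainParam.goal = MSEgoal;
    % [net,tr] = train(net,x,target);  % x would be the I-by-N input matrix
end
```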

Hope this helps.

Greg

P.S. The DOF corrections are only for the training data used to directly estimate
the weights. The corrections are not needed for nontraining validation and test data.