


Re: ga optimization of nn weights
Posted: Feb 6, 2013 3:03 PM


"Greg Heath" <heath@alumni.brown.edu> wrote in message <keu6p0$nij$1@newscl01ah.mathworks.com>...
> "Greg Heath" <heath@alumni.brown.edu> wrote in message <keu1no$3ru$1@newscl01ah.mathworks.com>...
> > % Subject: ga optimization of nn weights
> > % From: Syed Umar Amin <syed.umar.amin@gmail.com>
> > % Sent: Feb 4, 2013 11:17:22 AM
> >
> > getting error Index exceeds matrix dimensions.
> > pls help
>
> I don't have the GA Toolbox. However, this may help
>
> close all, clear all, clc;
> load cancer_dataset
> whos
> % Name               Size      Bytes    Class     Attributes
> %
> % cancerInputs       9x699     50328    double
> % cancerTargets      2x699     11184    double
>
> Inputs = cancerInputs;
> Targets = cancerTargets;
> [ I N ] = size(Inputs) % [ 9 699 ]
> [ O N ] = size(Targets) % [ 2 699 ]
> minmaxin = minmax(Inputs(:)') % [ 0.1 1 ]
> minmaxtar = minmax(Targets(:)') % [ 0 1 ]
>
> % The 2 outputs are not independent since they sum to 1. If this
> % constraint is enforced during training, by using only one output (e.g.,
> % using 'logsig' ), the number of independent training equations is
> % N*O/2 . If the constraint is not enforced, it may be N*O (I think!).
>
> targets = Targets(1,:);
> [ O N ] = size(targets)
> Neq = N*O % 699 No. of independent equations
> [ inputs, muin, stdin ] = zscore(Inputs')'; % Better for 'tansig' hidden activation
>
> % For a feedforward MLP with an I-H-O (Input-Hidden-Output)
> % node topology the number of unknown weights is
> % Nw = (I+1)*H+(H+1)*O
> % Hidden node upper bound (Neq > Nw)
>
> Hub = -1+ceil((Neq-O)/(I+O+1)) % 63
>
> % To mitigate noise and measurement error, try to choose H
> % so that Neq >> Nw
>
> H = 10 % MATLAB default (Also try smaller values)
> net = patternnet(H); % For classification
> net.layers{2}.transferFcn = 'logsig'; % For classification
rng(0) % Initialize the RNG so the run can be duplicated
> net = configure(net, inputs, targets);
> h = @(x) mse_test(x, net, inputs, targets);
> ga_opts = gaoptimset('TolFun', 1e2,'display','iter');
> [x_ga_opt, err_ga] = ga(h, Nw, ga_opts);
>
> function mse_calc = mse_test(x, net, inputs, targets)
> % 'x' contains the weights and biases vector
> % in row vector form as passed to it by the
> % genetic algorithm. This must be transposed
> % when being set as the weights and biases
> % vector for the network.
> %
> % To set the weights and biases vector to the
> % one given as input
> net = setwb(net, x);
> % To evaluate the outputs based on the given
> % weights and biases vector
> y = net(inputs);
> % Calculating the mean squared error
> % >mse_calc = sum((y-targets).^2)/length(y);
>
> % Better to normalize by the average target variance
>
> mse_calc = sum((y-targets).^2)/mean(var(targets',1));
> end
>
> Since there is no overfitting mitigation using a regularization goal
>
> help msereg
> doc msereg
>
> or validation stopping (using training set to minimize goal but
> stopping when validation set error is minimized)
>
> then you must try to minimize H provided validation errors are
> acceptable.
>
> Given the optimal value for H, you still should mitigate configure's
> random choice of initial weights by designing 10 or more candidate
> nets. Then choose the best candidate.
>
> Then and only then predict the generalization error on unseen
> data by using the test set (to be completely unbiased, it is only used
> ONCE. If you want to get more candidates, you should repartition the
> data trn/val/tst and design new nets).

Perhaps designing many more candidates before using the test set ONCE would be a smarter choice.

> You may wish to obtain summary statistics of the MSE using the
> 10 or more weight initialization trials for Hopt.

Hope this helps.

Greg
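
The weight-count bound in the quoted code can be checked with plain arithmetic. This is a minimal sketch, assuming the sizes reported in the post (I = 9, N = 699, and O = 1 after keeping only Targets(1,:)). The helper name Nw_of and the variables Nw_default, Nw_at_Hub and Nw_past are made up here for illustration. The bound is rederived from the condition Neq > Nw rather than taken from any toolbox function, and no toolbox is needed to run it.

```matlab
% Worked check of the hidden-node upper bound (plain arithmetic only).
% Sizes are those reported in the post for cancer_dataset with one output row.
I   = 9;       % input dimension
O   = 1;       % output dimension after keeping Targets(1,:)
N   = 699;     % number of examples
Neq = N*O;     % number of training equations

% Unknown weights of an I-H-O net with H hidden nodes
% (Nw_of is a hypothetical helper name, not part of the original post)
Nw_of = @(H) (I+1)*H + (H+1)*O;

% Nw_of(H) = (I+O+1)*H + O, so Nw_of(H) < Neq  <=>  H < (Neq-O)/(I+O+1)
Hub = -1 + ceil((Neq-O)/(I+O+1));   % largest integer H with Nw_of(H) < Neq

Nw_default = Nw_of(10);      % weight count at the H = 10 used in the post
Nw_at_Hub  = Nw_of(Hub);     % still below Neq
Nw_past    = Nw_of(Hub+1);   % one more hidden node exceeds Neq

fprintf('Hub = %d, Nw(10) = %d, Nw(Hub) = %d, Nw(Hub+1) = %d\n', ...
        Hub, Nw_default, Nw_at_Hub, Nw_past);
```

With these sizes the check gives Hub = 63, matching the value noted in the quoted code. H = 10 needs 111 weights against 699 equations, H = 63 needs 694, and H = 64 needs 705, which is why smaller H values leave far more equations than unknowns.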


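The suggestion to normalize the error by the average target variance can also be tried without a toolbox. The sketch below assumes a small made-up 0/1 target vector (hypothetical, not taken from cancer_dataset) and compares two reference models. It divides the mean squared error by the variance, whereas the quoted mse_test divides the summed squared error. The two differ only by the constant factor N, so they should rank weight vectors identically.

```matlab
% Toy check of the variance-normalized MSE (no toolbox needed).
% The target vector is hypothetical and chosen only for illustration.
targets = [0 1 1 0 1];                  % 1 x N row of 0/1 targets
N = length(targets);

% Reference model: always predict the target mean
y_const = mean(targets) * ones(1, N);

mse_const  = sum((y_const - targets).^2) / N;   % plain mean squared error
var_tgt    = mean(var(targets', 1));            % biased (divide-by-N) target variance
nmse_const = mse_const / var_tgt;               % close to 1 for the constant model

% A perfect model reproduces the targets exactly
y_perfect    = targets;
nmse_perfect = (sum((y_perfect - targets).^2) / N) / var_tgt;

fprintf('NMSE constant = %.4f, NMSE perfect = %.4f\n', nmse_const, nmse_perfect);
```

On this scale a value near 1 means the net does no better than always predicting the target mean, and a value near 0 means almost all of the target variance is modeled. That makes the trial-to-trial MSE statistics easier to interpret than raw values.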

