


Re: Degree of freedom in Neural Networks
Posted: Nov 27, 2013 11:40 PM


Thread Subject: Degree of freedom in Neural Networks
From: Florian
Date: 23 Nov, 2013 12:09:10

Hi Neural Network people,

I have a question about some posts on the Newsreader concerning the degrees of freedom for adjusting the weights. The following condition for adjusting the weights properly is given: Neq > Nw, with

Neq = N*O = N            % Number of training equations
Nw = (I+1)*H + (H+1)*O   % Number of unknown weights to estimate

Nw in this case applies to an I-H-O structure. Does the rule Neq > Nw also apply to structures with more hidden layers? I am not asking about the rule for calculating Nw for a multilayered network. Is it possible for one training equation to adjust weights properly in different layers, or do we need a separate training equation for each weight in the network, as in a single-layer structure (or cascaded structure weights, ...)?

I am asking because I only have a very small number of training equations but a large number of input features, so perhaps I can use more layers to use the small number of training equations more efficiently.

=============================================

Consider two sets of equations for weights wi (i=1:2), where the A and b coefficients are the result of noisy measurements that may also contain interference and measurement error:
I. Neq < Nw
A11*w1+A12*w2 = b1
II. Neq > Nw
A11*w1+A12*w2 = b1
A21*w1+A22*w2 = b2
A31*w1+A32*w2 = b3
In general, system I has an infinite number of solutions and, considering the uncertainty in {A,b} , solutions to I will not be robust.
On the other hand, system II has no solutions in general. However, there are approximate solutions that minimize ||A*w-b||^2 and that tend to be more robust as the number of degrees of freedom Ndof = Neq-Nw increases.
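The contrast between systems I and II can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the helper names `least_squares_2` and `residual_sq` and the numeric values of A and b are made up for the example, and the normal equations (A'*A)*w = A'*b are just one way to obtain the least-squares solution.

```python
def least_squares_2(A, b):
    """Minimize ||A*w - b||^2 for two unknowns via the normal equations."""
    # Entries of the 2x2 matrix A'*A and the vector A'*b
    s11 = sum(r[0] * r[0] for r in A)
    s12 = sum(r[0] * r[1] for r in A)
    s22 = sum(r[1] * r[1] for r in A)
    t1 = sum(r[0] * bi for r, bi in zip(A, b))
    t2 = sum(r[1] * bi for r, bi in zip(A, b))
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        # Happens for system I: one equation cannot pin down two weights
        raise ValueError("A'*A is singular: no unique least-squares solution")
    # Cramer's rule for the 2x2 system
    return (s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det


def residual_sq(A, b, w):
    """Return ||A*w - b||^2."""
    return sum((r[0] * w[0] + r[1] * w[1] - bi) ** 2 for r, bi in zip(A, b))


# System II (hypothetical numbers): Neq = 3 > Nw = 2, slightly inconsistent
A2 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b2 = [1.0, 2.0, 3.1]
w_ls = least_squares_2(A2, b2)   # approximately (1.0333, 2.0333)
err = residual_sq(A2, b2, w_ls)  # small but nonzero: no exact solution exists

# System I (hypothetical numbers): Neq = 1 < Nw = 2
try:
    least_squares_2([[1.0, 2.0]], [5.0])
    system_one_unique = True
except ValueError:
    system_one_unique = False
```

With three equations the fit is unique and the residual is small but not zero; with one equation the same routine reports that no unique solution exists.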
For real-world problems, the second scenario is preferable. However, if the condition Neq >> Nw is not satisfied, there are additional measures that can be taken to ensure a practical, robust solution. In the neural net scenario, four such measures are:
1. Reducing the number of weights by reducing the number of inputs.
2. Reducing the number of weights by pruning hidden nodes.
3. Improving the generalization of iterative solutions via validation stopping.
4. Improving the generalization of iterative solutions via regularization.
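As a rough illustration of measure 4, the sketch below applies a ridge (weight-decay) penalty to the underdetermined system I. The post does not say which regularizer is meant, so the penalty lam*||w||^2, the helper name `ridge_2`, and the numbers are all assumptions chosen for the example.

```python
def ridge_2(A, b, lam):
    """Minimize ||A*w - b||^2 + lam*||w||^2 for two unknowns w1, w2."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    # Entries of A'*A + lam*I and of A'*b
    m11 = sum(r[0] * r[0] for r in A) + lam
    m12 = sum(r[0] * r[1] for r in A)
    m22 = sum(r[1] * r[1] for r in A) + lam
    t1 = sum(r[0] * bi for r, bi in zip(A, b))
    t2 = sum(r[1] * bi for r, bi in zip(A, b))
    det = m11 * m22 - m12 * m12  # strictly positive whenever lam > 0
    return (m22 * t1 - m12 * t2) / det, (m11 * t2 - m12 * t1) / det


# System I (hypothetical numbers): one equation, two weights
A1 = [[1.0, 2.0]]
b1 = [5.0]
w_ridge = ridge_2(A1, b1, lam=0.1)  # approximately (0.980, 1.961)
```

Without the penalty this system has infinitely many exact solutions. With it, a single small-norm solution is selected, at the cost of no longer fitting the equation exactly.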
Contrary to your statement, there is not one equation for every weight. The number of training equations is Ntrneq = Ntrn*O, and in general each equation involves weights from every layer.
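The bookkeeping for an I-H-O net can be sketched as follows, using the formulas quoted in the question. The function names `count_dof` and `max_hidden` and the example sizes are hypothetical. The bound on H is simply the condition Neq > Nw rearranged to H < (N*O - O)/(I + O + 1).

```python
def count_dof(N, I, H, O):
    """Return (Neq, Nw, Ndof) for an I-H-O network trained on N samples."""
    Neq = N * O                     # number of training equations
    Nw = (I + 1) * H + (H + 1) * O  # number of unknown weights (with biases)
    return Neq, Nw, Neq - Nw


def max_hidden(N, I, O):
    """Largest number of hidden nodes H for which Neq > Nw still holds."""
    H = 0
    while count_dof(N, I, H + 1, O)[2] > 0:
        H += 1
    return H


# Hypothetical sizes: 100 training samples, 10 inputs, 1 output
dof_ok = count_dof(100, 10, 8, 1)   # (100, 97, 3):   Neq > Nw
dof_bad = count_dof(100, 10, 9, 1)  # (100, 109, -9): Neq < Nw
h_max = max_hidden(100, 10, 1)      # 8
```

Note that Neq > Nw is only the minimum requirement here. The post's preferred condition is Neq >> Nw, so in practice H would be chosen well below this bound, or combined with the remedies listed above.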
I have listed four remedies to your problem.
Any more questions?
Hope this helps.
Greg



