The Math Forum

Math Forum » Discussions » Software » comp.soft-sys.matlab

Topic: Degree of freedom in Neural Networks
Replies: 3   Last Post: Dec 12, 2013 5:40 AM

Greg Heath

Posts: 6,387
Registered: 12/7/04
Re: Degree of freedom in Neural Networks
Posted: Nov 27, 2013 11:40 PM

Thread Subject: Degree of freedom in Neural Networks
From: Florian Date: 23 Nov, 2013 12:09:10
Hi Neural Network people,
I have a question about some posts on the Newsreader concerning the
degrees of freedom for adjusting the weights. The following condition for
adjusting the weights properly is given: Neq > Nw, with

Neq = N*O = N % Number of training equations
Nw = (I+1)*H + (H+1)*O % Number of unknown weights to estimate

Nw in this case applies to an I-H-O structure. Does the rule Neq > Nw also
apply to structures with more hidden layers? I am not asking about the
rule for calculating Nw for a multilayered network. Is it possible for
one training equation to adjust weights properly in different layers, or do
we need a separate training equation for each weight in the network, as in
a single-layer structure (or cascaded structure weights, ...)? I am asking
because I only have a very small number of training equations but a large
number of input features, so perhaps I can use more layers to use the small
number of training equations more efficiently.

Consider two sets of equations for weights wi (i=1:2), where the A and b
coefficients are the result of noisy measurements that may also contain
interference and measurement error:

I. Neq < Nw

A11*w1+A12*w2 = b1

II. Neq > Nw

A11*w1+A12*w2 = b1
A21*w1+A22*w2 = b2
A31*w1+A32*w2 = b3

In general, system I has an infinite number of solutions and, considering
the uncertainty in {A,b}, solutions to I will not be robust.
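
To make the infinite number of solutions concrete, system I can be solved for w2 in terms of w1. This is only a worked illustration and assumes A12 is nonzero, which the post does not state:

```latex
w_2 = \frac{b_1 - A_{11}\, w_1}{A_{12}}, \qquad w_1 \in \mathbb{R} \ \text{arbitrary}
```

Every choice of w1 gives an exact solution, so the single equation cannot single out one weight vector.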

On the other hand, system II has no exact solution in general. However, the
approximate solutions that minimize ||A*w-b||^2 tend to become more robust
as the number of degrees of freedom Ndof = Neq-Nw increases.
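
As a rough sketch of what minimizing ||A*w-b||^2 looks like for system II, the following solves the 2x2 normal equations in plain Python. The function names and the numbers in A and b are made up for illustration and do not come from the thread; Python is used here only to keep the sketch self-contained.

```python
def least_squares_2(A, b):
    """Minimize ||A*w - b||^2 for two unknowns via the normal equations."""
    # Entries of the symmetric 2x2 matrix A'A and of the vector A'b.
    s11 = sum(r[0] * r[0] for r in A)
    s12 = sum(r[0] * r[1] for r in A)
    s22 = sum(r[1] * r[1] for r in A)
    t1 = sum(r[0] * bi for r, bi in zip(A, b))
    t2 = sum(r[1] * bi for r, bi in zip(A, b))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("columns of A are linearly dependent")
    # Cramer's rule on (A'A) w = A'b.
    w1 = (s22 * t1 - s12 * t2) / det
    w2 = (s11 * t2 - s12 * t1) / det
    return w1, w2


def residual_norm_sq(A, b, w):
    """Return ||A*w - b||^2."""
    return sum((r[0] * w[0] + r[1] * w[1] - bi) ** 2 for r, bi in zip(A, b))


# Illustrative data only: Neq = 3 equations, Nw = 2 weights, Ndof = 1.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [1.0, 2.0, 4.0]
w = least_squares_2(A, b)            # approximately (1.333, 2.333)
err = residual_norm_sq(A, b, w)      # nonzero: no exact solution exists
```

The three equations here are inconsistent, so the residual cannot be driven to zero; the least-squares weights are the compromise across all three.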

For real-world problems, the second scenario is preferable. However, if the
condition Neq >> Nw is not satisfied, there are additional measures that can be
taken to ensure a practical, robust solution. In the neural net scenario, four
such measures are:

1. Reducing the number of weights by reducing the number of inputs.
2. Reducing the number of weights by pruning hidden nodes.
3. Improving the generalization of iterative solutions via validation stopping.
4. Improving the generalization of iterative solutions via regularization.
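
Measure 4 can be illustrated on the linear analogue, system I, where Neq < Nw. The sketch below minimizes (a.w - b1)^2 + lam*||w||^2 for a single equation; it is not a neural network training routine, and the function name, the penalty weight lam, and the numbers are assumptions chosen for illustration.

```python
def ridge_single_equation(a, b1, lam):
    """Minimize (a.w - b1)^2 + lam*||w||^2 for one equation in len(a) unknowns.

    Setting the gradient to zero shows the minimizer is a multiple of a:
    w = a * b1 / (a.a + lam).
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    scale = b1 / (sum(ai * ai for ai in a) + lam)
    return [ai * scale for ai in a]


# Illustrative data only: one equation, two unknown weights.
a = [3.0, 4.0]
b1 = 5.0
w_reg = ridge_single_equation(a, b1, lam=0.01)
# w_reg is close to (0.6, 0.8), the smallest-norm exact solution of 3*w1 + 4*w2 = 5.
```

With the penalty term, the underdetermined system has a single minimizer instead of an infinite family, at the cost of a small bias that grows with lam.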

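Contrary to your statement, there is not one equation for every weight.
The number of training equations is Ntrneq = Ntrn*O, one per training case
and output, and each equation generally involves weights from every layer.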
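
The counts used in this thread can be tabulated with a few lines of code. This is a minimal sketch built from the formulas quoted above (Neq = N*O and Nw = (I+1)*H + (H+1)*O); the function names and the example sizes are hypothetical, and the bound on H is just the condition Neq > Nw solved for H.

```python
def num_weights(I, H, O):
    """Nw = (I+1)*H + (H+1)*O for an I-H-O network with bias weights."""
    return (I + 1) * H + (H + 1) * O


def degrees_of_freedom(N, I, H, O):
    """Ndof = Neq - Nw, with Neq = N*O training equations."""
    return N * O - num_weights(I, H, O)


def max_hidden_nodes(N, I, O):
    """Largest integer H with Neq > Nw; a result below 1 means no H qualifies.

    Nw = (I+O+1)*H + O, so Neq > Nw is equivalent to
    H <= floor((Neq - O - 1) / (I + O + 1)) for integer counts.
    """
    return (N * O - O - 1) // (I + O + 1)


# Illustrative sizes only: few training cases, many input features.
N, I, O = 20, 50, 1
ndof_one_hidden = degrees_of_freedom(N, I, 1, O)  # negative: more weights than equations
h_max = max_hidden_nodes(N, I, O)                 # below 1: even H = 1 violates Neq > Nw
```

In a case like this the count alone already indicates that reducing the number of inputs or regularizing is needed before the condition Neq > Nw can hold.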

I have listed four remedies to your problem.

Any more questions?

Hope this helps.

