The Math Forum


Math Forum » Discussions » Software » comp.soft-sys.matlab


Topic: Problem with 1-step ahead prediction in neural network
Replies: 9   Last Post: Oct 23, 2013 6:23 AM

Greg Heath

Re: Problem with 1-step ahead prediction in neural network
Posted: Oct 19, 2013 7:25 PM

"phuong" wrote in message <l3uis0$mut$>...
> "Greg Heath" <> wrote in message <l3t37j$bkp$>...
> > "phuong" wrote in message <l3s2ha$4qv$>...
> > > Hi everybody,
> > > I am having trouble with 1-step-ahead prediction using a neural network.
> > > When I retrain the network with fixed parameters, I get different weights (IW, LW, b) each time.
> > > I know the reason is the random initial weights. But why can we trust the 1-step prediction if it changes with every training run? Maybe the network has not converged, because when it converges there should be only one solution (or approximate solution). So has the network converged?
> > > As a result, the test predictions for 100 new points vary from run to run, and sometimes the differences are very large.
> > > Please help me fix these problems.
> > > Thank you very much.
> > > Phuong

> >
> > The only problem is your assumption that there is only one solution. For any I-H-O network configuration with tansig hidden nodes there are (2^H)*H! - 1 other nets that are equivalent. For the default value of H = 10, there are (2^10)*factorial(10) = 3,715,891,200 equivalent nets (counting the original).
> > 1. There are H! equivalent nets that differ only in the order of the hidden nodes.
> > 2. Since tansig is an odd function, for each of those orderings there are 2^H equivalent nets that differ only by the polarity of the weights (input weights, bias, and output weight) connected to each of the H hidden nodes: two choices per node.
> >
> > To make things worse, there can be local minima that are not global minima. The corresponding solutions range from excellent to very poor. Finally, there are other reasons why minimization searches fail (e.g., reaching the maximum mu in trainlm).
> >
> > That is why I now use Ntrials = max(10,30/Ntst) random weight initializations for each candidate value of H.
> >
> > Hope this helps.
> >
> > Greg

> Sorry, I don't understand your approach. As I understand it, you train the network Ntrials times, correct? What comes next: do you compute the mean of the results, or something else? Please explain in more detail.
> One more thing: I agree that there are H! nets, but I think the weight set is the same and only the order changes, right? If that is right, I think the MSE does not change.

There are several main points:

1. No I-H-O net with H > 1 tansig hidden units is unique. Any such I-H-O net has the same input-output function as at least (2^H)*H! - 1 other equivalent nets.
2. Therefore, given a set of design data, there is no set of weights that is "the" optimum solution.
3. Given a trial value for H and a random set of initial weights, there is no guarantee that the subsequent training will optimize the training objective function. Even if H is acceptable, training may converge to a local minimum that is not global, so results range from excellent to very poor. In addition, training may be aborted because the maximum mu or the maximum number of epochs is reached.
4. Therefore, to have a high probability of obtaining an acceptable solution, my recommendation is to design Ntrials nets for each candidate value of H.
5. What you do with the Ntrials*numH resulting designs depends on your personal preference. For example:
a. The acceptable solution (e.g., R2a >= 0.99) with the smallest H?
b. The 10 best solutions, combined in an ensemble or committee net?
c. Statistical characterization (e.g., min, median, mean, stdv, max) of performance estimates on unseen data?
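
The multi-start strategy in points 3 to 5 can be sketched in a few lines. This is only an illustration under stated assumptions: it is plain Python rather than MATLAB, it uses simple batch gradient descent instead of trainlm, and the toy target sin(pi*x), the candidate H values, the learning rate, the epoch count, and all function names (train_once, r_squared, smallest_acceptable_h) are made up for the example. It uses ordinary R^2 on held-out points, not the adjusted R2a.

```python
import math
import random
import statistics


def train_once(h, xs, ys, rng, epochs=200, lr=0.05):
    """Train one 1-h-1 tanh net by batch gradient descent from random initial weights."""
    w = [rng.uniform(-1, 1) for _ in range(h)]  # input weights
    b = [rng.uniform(-1, 1) for _ in range(h)]  # hidden biases
    v = [rng.uniform(-1, 1) for _ in range(h)]  # output weights
    c = 0.0                                     # output bias
    n = len(xs)
    for _ in range(epochs):
        gw, gb, gv, gc = [0.0] * h, [0.0] * h, [0.0] * h, 0.0
        for x, y in zip(xs, ys):
            hid = [math.tanh(w[j] * x + b[j]) for j in range(h)]
            err = sum(v[j] * hid[j] for j in range(h)) + c - y
            gc += err
            for j in range(h):
                back = err * v[j] * (1.0 - hid[j] ** 2)  # tanh'(z) = 1 - tanh(z)^2
                gv[j] += err * hid[j]
                gw[j] += back * x
                gb[j] += back
        for j in range(h):
            w[j] -= lr * gw[j] / n
            b[j] -= lr * gb[j] / n
            v[j] -= lr * gv[j] / n
        c -= lr * gc / n
    return w, b, v, c


def r_squared(net, xs, ys):
    """Ordinary R^2 of the net's predictions on (xs, ys)."""
    w, b, v, c = net
    preds = [sum(vj * math.tanh(wj * x + bj) for wj, bj, vj in zip(w, b, v)) + c
             for x in xs]
    mean_y = statistics.mean(ys)
    sse = sum((p - y) ** 2 for p, y in zip(preds, ys))
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - sse / sst


def smallest_acceptable_h(results, threshold):
    """Option (a): smallest H for which at least one trial reaches the threshold."""
    for h in sorted(results):
        if max(results[h]) >= threshold:
            return h
    return None


# Toy data (an assumption for illustration): y = sin(pi*x), separate train/test grids.
x_trn = [-1.0 + 0.1 * i for i in range(21)]
x_tst = [-0.95 + 0.1 * i for i in range(20)]
y_trn = [math.sin(math.pi * x) for x in x_trn]
y_tst = [math.sin(math.pi * x) for x in x_tst]

N_TRIALS = 10            # the floor of Ntrials = max(10, 30/Ntst)
CANDIDATE_H = (1, 2, 4)  # hypothetical candidate hidden-layer sizes
rng = random.Random(0)

# Ntrials random initializations for each candidate H; keep the test-set R^2 of each.
results = {h: [r_squared(train_once(h, x_trn, y_trn, rng), x_tst, y_tst)
               for _ in range(N_TRIALS)]
           for h in CANDIDATE_H}

# Option (c): summary statistics of performance on unseen data.
for h, r2 in results.items():
    print(h, min(r2), statistics.median(r2), statistics.mean(r2),
          statistics.stdev(r2), max(r2))

print("smallest acceptable H:", smallest_acceptable_h(results, threshold=0.99))
```

With such a crude optimizer and so few epochs, no trial may reach the 0.99 threshold, in which case the last line prints None. What the sketch is meant to show is the spread of results across initializations, not a well-tuned design.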

Bottom line: There is absolutely no reason why you should expect acceptable designs to have similar final weight distributions.
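
The symmetry argument can be checked numerically. The sketch below is plain Python, not MATLAB; it assumes a 1-H-1 net with tanh hidden units (tanh is the function MATLAB calls tansig) and a linear output, and the helper names (forward, transform, all_equivalent_nets, equivalent_net_count) are invented for this illustration. It enumerates every reordering and polarity flip of a small random net and confirms that all of them compute the same function while having different weight sets.

```python
import itertools
import math
import random


def forward(x, w_in, b_in, w_out, b_out):
    """Output of a 1-H-1 net with tanh hidden units and a linear output."""
    hidden = [math.tanh(w * x + b) for w, b in zip(w_in, b_in)]
    return sum(v * h for v, h in zip(w_out, hidden)) + b_out


def transform(w_in, b_in, w_out, order, signs):
    """Reorder hidden nodes by `order`; flip the polarity of node j if signs[j] == -1.

    Flipping a node negates its input weight, bias, and output weight. Because
    tanh is odd, v*tanh(w*x + b) == (-v)*tanh(-(w*x + b)), so the output is unchanged.
    """
    return ([signs[j] * w_in[j] for j in order],
            [signs[j] * b_in[j] for j in order],
            [signs[j] * w_out[j] for j in order])


def all_equivalent_nets(w_in, b_in, w_out):
    """Yield all (2^H)*H! weight sets obtained by reordering and sign flips."""
    h = len(w_in)
    for order in itertools.permutations(range(h)):
        for signs in itertools.product((1, -1), repeat=h):
            yield transform(w_in, b_in, w_out, order, signs)


def equivalent_net_count(h):
    """Total number of equivalent nets, counting the original: (2^H)*H!."""
    return 2 ** h * math.factorial(h)


rng = random.Random(0)
H = 3  # small enough to enumerate; H = 10 would give billions of nets
w_in = [rng.uniform(-2, 2) for _ in range(H)]
b_in = [rng.uniform(-2, 2) for _ in range(H)]
w_out = [rng.uniform(-2, 2) for _ in range(H)]
b_out = rng.uniform(-1, 1)

nets = list(all_equivalent_nets(w_in, b_in, w_out))
xs = [-1.0, -0.3, 0.0, 0.4, 1.0]
reference = [forward(x, w_in, b_in, w_out, b_out) for x in xs]

# Largest output discrepancy between any transformed net and the original.
max_diff = max(abs(forward(x, *net, b_out) - ref)
               for net in nets for x, ref in zip(xs, reference))
# Number of distinct weight sets among the transformed nets.
distinct = len({(tuple(a), tuple(b), tuple(c)) for a, b, c in nets})

print(len(nets), distinct, max_diff)
print(equivalent_net_count(10))
```

The reorderings alone leave the set of weight values unchanged, but the polarity flips change the values themselves, so two training runs can land on equally good nets whose weights look nothing alike.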




© The Math Forum at NCTM 1994-2018. All Rights Reserved.