Subject: Matlab trainbr "converges" to trivial solution
Sent: Apr 7, 2013 10:04:40 AM
> See attached screen shot. Re-initializing helps sometimes, but why does
> this happen in the first place
It's just a combination of statistics and a mountainous weight space. If you begin with random initial weights and a parsimonious number of hidden nodes, H, there is no guarantee that a single run of steepest descent, with or without momentum, will lead to a low local minimum, much less a global minimum. I routinely run 10 random weight initializations for each value of H that I try. Even when H is optimal, some of the runs do not converge to a low local minimum.
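To make the "mountainous weight space" point concrete, here is a minimal sketch in Python rather than MATLAB. It is not trainbr or any toolbox routine. The one-dimensional loss, the step size, and the names loss, grad, descend and run_restarts are all assumptions invented for illustration. It runs plain steepest descent from 10 random starting points and shows that where a run ends up depends on where it began.

```python
import math
import random

def loss(w):
    # Hypothetical 1-D "weight space" with several local minima of different depth.
    return math.sin(3.0 * w) + 0.1 * w * w

def grad(w):
    # Analytic derivative of loss(w).
    return 3.0 * math.cos(3.0 * w) + 0.2 * w

def descend(w, lr=0.05, steps=500):
    # Plain steepest descent (no momentum) from the starting point w.
    for _ in range(steps):
        w -= lr * grad(w)
    return w

def run_restarts(n_restarts=10, seed=0):
    # One descent run per random initialization; returns (start, end, final loss).
    rng = random.Random(seed)
    results = []
    for _ in range(n_restarts):
        w0 = rng.uniform(-5.0, 5.0)
        w = descend(w0)
        results.append((w0, w, loss(w)))
    return results

if __name__ == "__main__":
    results = run_restarts()
    for w0, w, f in results:
        print(f"start {w0:+.3f} -> end {w:+.3f}, loss {f:+.4f}")
    best = min(results, key=lambda r: r[2])
    print(f"best of {len(results)} runs: w = {best[1]:+.3f}, loss = {best[2]:+.4f}")
```

Every run stops at a point where the gradient is essentially zero, but the runs settle in different valleys with different final losses. Keeping the best of several runs is the only protection a single descent run does not give you.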
When H is larger than necessary, validation stopping and/or regularization can be used to keep the overfit net from being overtrained and, as a result, performing poorly on nontraining data. Nevertheless, since the initial weights are random, there is still no guarantee that steepest descent will lead to a low local minimum.
There are more exotic minimization algorithms than steepest descent. However, they are much slower and are still not guaranteed to find a low local minimum. It is more practical to either

1. Design many nets with steepest descent and choose the one that minimizes the validation set error.

or

2. Design many nets with regularized descent and choose the one that minimizes the training set error.
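Option 1 can be sketched roughly as follows. This is a toy in Python, not MATLAB and not the toolbox training functions: the tiny 1-input tanh net, the sin(pi*x) data, the 28/12 train/validation split, the learning rate, and every function name here are assumptions made up for the example. The regularized variant in option 2 is not shown.

```python
import math
import random

def init_net(h, rng):
    # Random initial weights for a 1-input, h-hidden (tanh), 1-output net.
    return {"w1": [rng.uniform(-1, 1) for _ in range(h)],
            "b1": [rng.uniform(-1, 1) for _ in range(h)],
            "w2": [rng.uniform(-1, 1) for _ in range(h)],
            "b2": rng.uniform(-1, 1)}

def predict(net, x):
    return net["b2"] + sum(w2 * math.tanh(w1 * x + b1)
                           for w1, b1, w2 in zip(net["w1"], net["b1"], net["w2"]))

def mse(net, data):
    return sum((predict(net, x) - y) ** 2 for x, y in data) / len(data)

def train(net, data, lr=0.05, epochs=300):
    # Batch steepest descent on the mean squared error.
    h, n = len(net["w1"]), len(data)
    for _ in range(epochs):
        g1, gb1, g2, gb2 = [0.0] * h, [0.0] * h, [0.0] * h, 0.0
        for x, y in data:
            a = [math.tanh(net["w1"][j] * x + net["b1"][j]) for j in range(h)]
            e = net["b2"] + sum(net["w2"][j] * a[j] for j in range(h)) - y
            gb2 += 2 * e / n
            for j in range(h):
                d = 2 * e * net["w2"][j] * (1 - a[j] ** 2) / n
                g2[j] += 2 * e * a[j] / n
                g1[j] += d * x
                gb1[j] += d
        for j in range(h):
            net["w1"][j] -= lr * g1[j]
            net["b1"][j] -= lr * gb1[j]
            net["w2"][j] -= lr * g2[j]
        net["b2"] -= lr * gb2
    return net

def select_by_validation(hs=(1, 2, 4), trials=5, seed=0):
    # Train `trials` randomly initialized nets for each H, keep the lowest validation MSE.
    rng = random.Random(seed)
    data = [(x, math.sin(math.pi * x)) for x in (-1 + 2 * i / 39 for i in range(40))]
    rng.shuffle(data)
    train_set, val_set = data[:28], data[28:]
    candidates = []
    for h in hs:
        for t in range(trials):
            net = train(init_net(h, rng), train_set)
            candidates.append((mse(net, val_set), h, t, net))
    return min(candidates, key=lambda c: c[0]), candidates

if __name__ == "__main__":
    best, candidates = select_by_validation()
    for val, h, t, _ in candidates:
        print(f"H={h} trial={t} validation MSE={val:.4f}")
    print(f"selected: H={best[1]}, trial={best[2]}, validation MSE={best[0]:.4f}")
```

The spread of validation errors across trials with the same H is the thing to look at: it shows how much of the result is due to the random starting weights rather than to the choice of H.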
It is very doubtful that a single minimization run will always be successful.