I'm developing some software using Numerical Recipes in C for some of the ancillary stuff. The application is training neural networks, a.k.a. "backpropagation".
I have a couple of questions about the Davidon/Fletcher/Powell multivariate function minimizer, which is what I am using to find local minima of the error-score surface.
I have had two problems with the NRC implementation "dfpmin". I have been able to kluge around them, but I wonder if there is a better way.
1) The DFP routine dfpmin() keeps an approximation to the inverse Hessian of the function being minimized. It is supposed to keep that approximation symmetric and positive definite. It so happens that I need the determinant of the square root of the inverse Hessian for another purpose, so I modified the code so that on exit from the routine it computes that determinant; the main body of the code was not changed. When I use the NRC Cholesky decomposition routine choldc() to find the square root of the inverse Hessian, it often fails, indicating (so the book says) that the matrix is not positive definite. Hmmm... Is that typical behavior for the DFP routine -- to botch the Hessian approximation on unfriendly input? It turns out the failure doesn't hurt me, because it almost always fails on a solution to the neural-net problem that is a very bad "local minimum". In a way that's good. But I just wonder if it is supposed to be like that.
2) The DFP routine uses a fast approximate line search they call "lnsrch" that doesn't necessarily converge to the exact answer, but instead is supposed to get "close enough" for the DFP algorithm to continue converging. It assumes the function along the line being searched is approximately cubic, and goes from there. The problem I had with it was that it was prone to failure: it would write out "Roundoff problem in lnsrch" and abort my program. Rude. The documentation says some "difficult" problems may require double precision, but I routinely convert all these routines to double precision. The workaround I used was to trap the condition that causes the failure and, in that case, call an exact line-search routine that is more robust. I corrected this problem before I encountered the one described above, so we can conjecture that it may have been failing on the same input that caused the Hessian approximation to go bad.